ocaml / odoc

Documentation compiler for OCaml and Reason

Extract all examples and run them (e.g., in CI) #130

Closed aantron closed 4 years ago

aantron commented 6 years ago

This would ensure that they remain up-to-date, and therefore useful to readers.

https://discuss.ocaml.org/t/1841/42, https://discuss.ocaml.org/t/1841/43. cc @grayswandyr @trefis.

dbuenzli commented 6 years ago

I thought there was already an issue and a design for this, but I can't find it, so I'll propose here what I had in mind. It's quite an important issue, since for now this is absolutely not DRY: I usually keep a copy of the examples in my packages that I compile, but as you can guess this has every chance of becoming outdated and is painful to maintain by hand. It will also be very useful for tutorial-style .mld documents.

The goal of the proposal is to allow:

  1. Multiple examples in a single .mld/compilation unit. See e.g. Cmdliner's examples.
  2. Literate programming style that interleaves code and prose. See e.g. Vg's basics.

The idea is that verbatim/code fences should allow the specification of an identifier (I don't know exactly how, though; people who work on the parsing of ocamldoc comments certainly know better). The extracted file corresponding to an identifier is simply the concatenation of all the fences that bear that identifier.

{id1[let x = ...]}
blabla
{id1[let y = ...]}
{id2[ other example ]}
{id1[ still belongs to id1]}

Now odoc is extended as follows:

odoc extract --id-list CMTI # print the ids of CMTI, one per line
odoc extract --id ID -o FILE CMTI   # extract the contents for id ID in CMTI to FILE
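
The concatenation rule proposed above can be illustrated with a small OCaml sketch. It is not odoc code: the `{ID[ ... ]}` syntax is only the one floated in this comment, and the names `fences`, `ids` and `extract` are hypothetical, chosen to mirror the two proposed CLI modes. The sketch also assumes that a fence body never contains `]}`.

```ocaml
(* Sketch of the proposed semantics; {ID[ ... ]} is hypothetical syntax. *)

let is_id_char = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '_' -> true
  | _ -> false

(* Index of the first occurrence of [sub] in [s] at or after [from]. *)
let find_sub s sub from =
  let n = String.length s and m = String.length sub in
  let rec go i =
    if i + m > n then None
    else if String.sub s i m = sub then Some i
    else go (i + 1)
  in
  go from

(* All (id, body) fences of [doc], in document order. *)
let fences doc =
  let n = String.length doc in
  (* Index just past the identifier characters starting at [i]. *)
  let rec id_end i =
    if i < n && is_id_char doc.[i] then id_end (i + 1) else i
  in
  let rec scan i acc =
    if i >= n then List.rev acc
    else if doc.[i] <> '{' then scan (i + 1) acc
    else
      let j = id_end (i + 1) in
      let opens_fence = j > i + 1 && j < n && doc.[j] = '[' in
      match (if opens_fence then find_sub doc "]}" (j + 1) else None) with
      | None -> scan (i + 1) acc
      | Some k ->
        let id = String.sub doc (i + 1) (j - i - 1) in
        let body = String.sub doc (j + 1) (k - j - 1) in
        scan (k + 2) ((id, body) :: acc)
  in
  scan 0 []

(* Counterpart of the proposed --id-list mode: sorted, without duplicates. *)
let ids doc = List.sort_uniq String.compare (List.map fst (fences doc))

(* Counterpart of the proposed --id mode: concatenate the fences bearing [id]. *)
let extract id doc =
  fences doc
  |> List.filter (fun (id', _) -> String.equal id id')
  |> List.map snd
  |> String.concat "\n"

let example =
  "{id1[let x = 1]}\nblabla\n{id1[let y = 2]}\n\
   {id2[ other example ]}\n{id1[let z = x + y]}"

let () = print_endline (extract "id1" example)
```

Fences with different identifiers can be interleaved freely, since each identifier is collected independently of the others.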

dbuenzli commented 6 years ago

Btw. I don't know if this is possible due to the semantics of stop comments but allowing:

(**/**)
(** {id1[This will be in id1]} *)
(**/**)

would be useful as well.

rizo commented 6 years ago

@dbuenzli Can you elaborate on how the code for particular identifiers would be executed after the extraction? I assume you would feed it into the compiler after that somehow, but what tool would do that? Also what is the need to have the identifiers explicitly in the code blocks?

FYI, there's an ongoing attempt to add support for "literate programming" in dune with help from mdx. I am interested in trying to use mdx with mli and mld files for a potential integration with odoc (See https://github.com/samoht/mdx/issues/17).

rizo commented 6 years ago

@trefis I also noticed your comment related to this where you said:

The idea has been floating around for a while now of having odoc understand toplevel expect test files, i.e. docstring would be parsed in the usual way, and all the code (and expect) blocks would be wrapped in {[ ]} and left untouched.

Could you explain how this could be achieved? In particular, how would the syntax for expect blocks look in docstrings?

lpw25 commented 6 years ago

The way I'd been picturing toplevel expect test support was to take a file like:

(** You can make an array.

     Here is an array of ten elements. *)
Array.make 10 0
[%expect{|
- : int array = [|0; 0; 0; 0; 0; 0; 0; 0; 0; 0|]
|}]

and turn it into something like:


You can make an array.

Here is an array of ten elements:

Array.make 10 0
- : int array = [|0; 0; 0; 0; 0; 0; 0; 0; 0; 0|]

One way to implement it would be to give the toplevel expect test tool a mode like -bin-annot to generate a file like a .cmt file and then add support in odoc for processing these files.

lpw25 commented 6 years ago

Probably worth noting that a benefit of this approach using a tool like toplevel expect tests is that odoc doesn't need to know how to compile the code. Otherwise you need to have odoc take all the command-line flags of the OCaml toplevel and handle setting up things like the load path.

dbuenzli commented 6 years ago

@dbuenzli Can you elaborate on how the code for particular identifiers would be executed after the extraction?

That's precisely none of your business to know. Just let me extract the contents of an identifier to a file with or without line number directives. KISS.

Also what is the need to have the identifiers explicitly in the code blocks?

So that you can extract specific parts to different files. If you take cmdliner's examples there's more than one executable being defined there, to be extracted to different files.
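
As an illustration of extraction "with line number directives", here is a minimal OCaml sketch. The `fence` record and `with_line_directives` are hypothetical names, not part of odoc; the sketch assumes the extractor already knows the source line on which each fence body starts. The `# LINE "FILE"` form is OCaml's standard line number directive, so errors in the extracted file are reported against the original document.

```ocaml
(* A fence body with the 1-based source line where its contents start.
   Hypothetical type, for illustration only. *)
type fence = { line : int; body : string }

(* Prefix each body with a line directive pointing back at [file]. *)
let with_line_directives ~file fences =
  fences
  |> List.map (fun f -> Printf.sprintf "# %d %S\n%s\n" f.line file f.body)
  |> String.concat ""

let () =
  print_string
    (with_line_directives ~file:"doc.mld"
       [ { line = 3; body = "let x = 1" }; { line = 9; body = "let y = x" } ])
```

Directives like these only make sense when the extracted content is OCaml; for other content such as JSON, the extraction would have to omit them.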

rizo commented 6 years ago

@lpw25 If I understood correctly the example file you're describing is a ml file. It's an interesting approach, but I think what was being discussed here originally is targeted at mli and mld files. Runnable examples in mld files are valuable for literate programming in particular, where comments are written at the top-level and code is introduced with the code block ({[...]}) syntax.

We'd have to look at some high-level use cases to decide which approach is more appropriate. Given that examples are usually included in docstrings for signature items (i.e. mli files), I'm tempted to say it's more useful to run those.

Probably worth noting that a benefit of this approach using a tool like toplevel expect tests is that odoc doesn't need to know how to compile the code.

I agree. Regardless of the approach we pick, I think odoc shouldn't be responsible for compiling and executing the code.

rizo commented 6 years ago

That's precisely none of your business to know.

@dbuenzli I will assume you meant that it's none of odoc's business, alright? :-)

Maybe I didn't explain myself correctly: I'm not suggesting to teach odoc how to run code examples. What matters to me is the full user-experience and integration for this feature. Implementing extraction of code blocks with identifiers into files is (probably) easy. I'm not interested in that alone. Ideally I'd like to make sure that dune (or some other tool) can pick up those files and run them without too much trouble for the users.

From your original comment, it seems like you already know what to do with the extracted code blocks and that's why I asked.

lpw25 commented 6 years ago

If I understood correctly the example file you're describing is a ml file.

Well it is an "mlt" file. These are the files supported by the toplevel expect test tools. Here's an example in base.

dbuenzli commented 6 years ago

@dbuenzli I will assume you meant that it's none of odoc's business, alright? :-)

Yes.

I'm not interested in that alone. Ideally I'd like to make sure that dune (or some other tool) can pick up those files and run them without too much trouble for the users.

Sure, but that's out of scope and I don't think there's any point in discussing this here. The only thing we need is a build-system-friendly CLI, and I'd really like the extraction to be agnostic to the file type, i.e. that it doesn't assume that these are OCaml signatures and/or implementations (so that you can also e.g. write JSON data in verbatim/code fences and extract it to a JSON file).

rizo commented 6 years ago

@lpw25 Oh, I missed the fact that toplevel_expect_test introduces the mlt extension.

I see it as a more convenient way to write expect tests. Generating docs based on those tests would be awesome, but I think it's a different feature. Do you agree?

Of course this doesn't invalidate the possibility of using toplevel_expect_test as a mechanism for running the examples extracted by odoc.

lpw25 commented 6 years ago

I think it's a different feature. Do you agree?

I do, but I think it is also a much easier feature to implement and would be a reasonable stand-in for a lot of cases until the other features are available. It is a pretty reasonable way to write simple tutorials with executed code samples -- and that covers a not insubstantial set of use cases -- hence why I brought it up here.

rizo commented 6 years ago

Sure but that's out of scope and I don't think there's any point in discussing this here.

Is there a better place to discuss this?

The only thing we need is a build system friendly CLI

I understand your point. If the interface is well-defined and flexible, other tools can just figure out how to do the rest. In my opinion these kinds of features can also benefit from top-down design for specific use cases, because they require integration with multiple tools. Implementing isolated building blocks might lead to a fragmented user experience.

I'd really like the extraction to be agnostic to the file type

Sure! I don't see why it would need to be file type specific. The identifiers could be used to define the type of the content. These annotations could potentially be useful for Reason support too.

rizo commented 6 years ago

@lpw25 It seems like what you describe can be implemented separately from odoc (maybe with help of dune?). I'll try to learn more about toplevel_expect_test when I have time.

lpw25 commented 6 years ago

How do you mean? Some kind of translation to an .mld file?

rizo commented 6 years ago

Yes, potentially. Unless you think having support for cmt files would be more useful.

lpw25 commented 6 years ago

I hadn't thought of producing an .mld file but I think it's just as good as what I was suggesting, so I agree this doesn't need any odoc support.
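
A minimal OCaml sketch of the translation to .mld discussed above, assuming some expect-test tool has already split the .mlt file into prose and executed phrases. The `chunk` type and `to_mld` are hypothetical and do not correspond to any existing tool's API; the sketch also assumes that neither the code nor the recorded output contains `]}`, which would close the code block early.

```ocaml
(* Hypothetical intermediate form of a toplevel expect test file. *)
type chunk =
  | Prose of string            (* contents of a doc comment *)
  | Phrase of string * string  (* toplevel phrase and its recorded output *)

(* Render the chunks as .mld text: prose is kept as is, and each phrase is
   put with its output in a {[ ... ]} code block. *)
let to_mld chunks =
  let render = function
    | Prose text -> String.trim text
    | Phrase (code, output) ->
      Printf.sprintf "{[\n%s\n%s\n]}" (String.trim code) (String.trim output)
  in
  String.concat "\n\n" (List.map render chunks) ^ "\n"

let () =
  print_string
    (to_mld
       [ Prose "You can make an array.";
         Phrase
           ( "Array.make 10 0",
             "- : int array = [|0; 0; 0; 0; 0; 0; 0; 0; 0; 0|]" ) ])
```

With this split, compiling and running the phrases stays entirely in the expect-test tool, and odoc only sees an ordinary .mld file.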
