ocaml / v2.ocaml.org

Implementation of the ocaml.org website.
http://ocaml.org

Mirror the manual on the web site? #1011

Open pmetzger opened 6 years ago

pmetzger commented 6 years ago

The only time I ever go to caml.inria.fr is to look at the manual online. I know there's a thought of eventually having "docs.ocaml.org" or some such, but in the meanwhile, would it be a problem to mirror the manual from http://caml.inria.fr/pub/docs/manual-ocaml/ to some URL on ocaml.org? I would submit a PR, but the build system for the web site is a bit opaque.

Chris00 commented 6 years ago

The problem is not the technical details, it is to have the agreement of Inria because we were specifically asked not to mirror files in pub/.

Chris00 commented 6 years ago

Maybe one thing we could do is to put in place some "proxying" of the URLs so they seem to be located on o.o while actually being hosted on Inria servers. @xavierleroy or @damiendoligez may say whether this is a satisfactory solution for them.

pmetzger commented 6 years ago

Isn't the manual redistributable?

agarwal commented 6 years ago

My understanding was that the Inria team wanted to keep organizing files under pub/, i.e. they didn't want their own workflow altered. I don't believe there is any opposition to mirroring content elsewhere if someone else does that work.

pmetzger commented 6 years ago

We could also just build the 4.07 manual ourselves. It's straightforward.

xavierleroy commented 6 years ago

> The problem is not the technical details, it is to have the agreement of Inria because we were specifically asked not to mirror files in pub/.

I forgot this discussion. At any rate things have changed since the early design days of ocaml.org: for example, the preferred URL for source distributions is now https://github.com/ocaml/ocaml/releases/ instead of http://caml.inria.fr/pub/, because the former is more reliable. More generally, I have concerns about the long-term availability of caml.inria.fr/

So, I have no problem with hosting copies of the OCaml manuals on ocaml.org, and it can only increase availability of the material. Bonus points if you can have the manual automatically generated from source distributions or even from GitHub branches.

pmetzger commented 6 years ago

> Bonus points if you can have the manual automatically generated from source distributions or even from GitHub branches.

This should be quite doable. Scripting git is easy, and the GitHub API is easy to use, too. I could work on it if I had some sense of how to run things that generate HTML for ocaml.org.

pmetzger commented 6 years ago

BTW, if someone who can explain to me a bit more about the principles of how the ocaml.org build system is organized would get in touch with me, I'll do a pull request to do this.

agarwal commented 6 years ago

@pmetzger Might be easier to exchange some emails about this. I'll get in touch.

Chris00 commented 5 years ago

Any news on this?

pmetzger commented 5 years ago

Nope, I dropped it on the floor. I'm a bit busy for the next couple of weeks, so if someone else wants to do it meanwhile I won't object. Otherwise keep poking me.

sanette commented 4 years ago

I have started working on this. I need help understanding the rules for integrating new pages.

sanette commented 4 years ago

Next question: where do we store the manual on ocaml.org? One possibility would be to modify https://ocaml.org/docs/ so that the first 4 "blocks" are links to the 4 parts of the manual. If you agree with this, I can try to propose something.

pmetzger commented 4 years ago

I'd suggest https://ocaml.org/docs/manual-ocaml/ as a URL, and have everything on https://ocaml.org/docs/ that currently points to the manual at caml.inria.fr point there instead. But this does mean that we need an easy procedure for updating the manual when OCaml gets updated!

dbuenzli commented 4 years ago

> I'd suggest https://ocaml.org/docs/manual-ocaml/ as a URL,

I know manual-ocaml is what is on caml.inria.fr but I have a slight gripe with manual-ocaml in that the opam package for the manual is called ocaml-manual (my fault).

What about simply https://ocaml.org/manual, that's easy to type and remember.

sanette commented 4 years ago

> But this does mean that we need an easy procedure for updating the manual when OCaml gets updated!

Yes, of course. At this point I have a script that does everything, but only for Part 1 of the manual, and I have only tested it on the latest version of OCaml. When I get the time, I could test it against all available versions of OCaml to see whether it is robust.

Chris00 commented 4 years ago

As the URL, I suggest https://ocaml.org/docs/manual as a link pointing to the most recent manual and https://ocaml.org/docs/manual/⟨version⟩ for the actual manual for OCaml ⟨version⟩. So it would be ideal for the script to be passed a ⟨version⟩ and build the manual from the corresponding Git branch. The main Makefile already has a variable for the most recent OCaml version so can adequately drive this script.

pmetzger commented 4 years ago

@dbuenzli suggests:

> What about simply https://ocaml.org/manual, that's easy to type and remember.

Seems fine by me, as does @Chris00's suggestion of https://ocaml.org/docs/manual — @dbuenzli's suggestion has the advantage of being slightly more compact. I agree that the manual/[version] idea would also be a good idea to allow access to a specific version.

dbuenzli commented 4 years ago

I'd prefer /manual. That way I don't need a search engine or to try to remember whether it's doc or docs. +1 for versions.

Chris00 commented 4 years ago

I do not have strong opinions about /docs/manual or /manual. For docs vs. doc, maybe we should add a rewrite rule doc → docs (on the server) so it is not an issue?

dbuenzli commented 4 years ago

> For docs vs. doc, maybe we should add a rewrite rule doc → docs (on the server) so it is not an issue?

I don't think that's needed if I don't need to remember it or type it manually.

pmetzger commented 4 years ago

I'd say the consensus is now https://ocaml.org/manual for the latest, and https://ocaml.org/manual/[version] for a particular version of the manual.
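
A minimal sketch of the URL scheme agreed above, written in Python for illustration. The function name `manual_url` and the restriction to `major.minor` version strings such as `4.09` are assumptions of this sketch, not something the thread specifies.

```python
import re
from typing import Optional

# Base URL taken from the consensus in the thread.
BASE_URL = "https://ocaml.org/manual"


def manual_url(version: Optional[str] = None) -> str:
    """Return the URL of the latest manual, or of a specific version."""
    if version is None:
        return BASE_URL
    # Assumption: versions are written "major.minor" with a two-digit minor,
    # e.g. "4.00" or "4.09". Anything else is rejected rather than guessed.
    if not re.fullmatch(r"\d+\.\d{2}", version):
        raise ValueError(f"unexpected version string: {version!r}")
    return f"{BASE_URL}/{version}"
```

Calling `manual_url()` gives the address of the latest manual, and `manual_url("4.09")` the address of one specific version, so a build script driven by a version variable could use the same helper for both.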

sanette commented 4 years ago

The script is now able to process versions 4.00 to 4.09. See: https://sanette.github.io/ocaml-tutorial/

agarwal commented 4 years ago

I prefer /manual over /docs/manual. This follows the principle of "flat is better than nested". Unless we anticipate a name clash, but I think "manual" uniquely refers to the single document we are talking about here.

> For docs vs. doc, maybe we should add a rewrite rule doc → docs (on the server) so it is not an issue?

I think that's not worth it. This introduces two names for the same thing, which I think is generally not a good idea.

sanette commented 4 years ago

Is there a need to include versions older than 4.00?

pmetzger commented 4 years ago

@sanette I don't think so. Indeed, I think we can probably cut it off at 4.02 but you've already gone earlier than that.

sanette commented 4 years ago

Hi, I need some help... I'm generating the ocaml.org website locally, and trying to add the manual pages. This is driving me nuts. Here is an example: I modify ocaml.org/site/docs/index.md to

<!-- ((! set title Docs !)) ((! set documentation !)) ((! set nobreadcrumb !)) -->

<h1>Documentation</h1>
<pre><div class="hello">Hello</div></pre>
Bye

and then run make, and here is the generated code:

<h1>Documentation</h1>
<pre></pre><div class="hello">Hello</div>

<p>Bye</p>

Notice how the <pre> has been closed prematurely. What should I do??

pmetzger commented 4 years ago

That's a new one on me. I don't know the answer. I think you're going to have to debug the generation toolchain. :(

sanette commented 4 years ago

And I don't understand the logic of using markdown files to include HTML. This is very dangerous. For instance in the example above, if you inadvertently add 4 spaces before <h1> then it is recognized as code and it generates:

<pre><code>&lt;h1&gt;Documentation&lt;/h1&gt;</code></pre>

pmetzger commented 4 years ago

I'm not a giant fan of the way that the web site is set up (it's too complicated, IMHO) but fixing that would need to be a separate PR. :)

sanette commented 4 years ago

> That's a new one on me. I don't know the answer. I think you're going to have to debug the generation toolchain. :(

Unfortunately, this is a blocker for me... because the OCaml manual uses <pre><div ...> for essentially all code snippets... :(

sanette commented 4 years ago

[Screenshot: Screenshot_20191109_203156]

Here is my current experimentation.

You can see at the very bottom that ocaml code is not correctly recognized by the CSS, because of this <pre> issue.

Chris00 commented 4 years ago

I do not have time before Monday to look at the particular markdown problem you mention but you can output HTML files — they are also accepted.

sanette commented 4 years ago

@Chris00, the problem is the same with HTML files; apparently they get modified too.

Chris00 commented 4 years ago

Not much, see Makefile.from_html. This also explains why most HTML code is in markdown files: to use template/main.mpp to “surround” the content.

sanette commented 4 years ago

What I meant is that the particular problem with <pre> happens also with html files. If I write this index.html:

<h1>Documentation</h1>
<pre><div class="hello">Hello</div></pre>
Bye

then the generated index.html is

<!DOCTYPE HTML>

<h1>Documentation</h1>
<pre></pre><div class="hello">Hello</div>
Bye

sanette commented 4 years ago

> I'd prefer /manual. That way I don't need a search engine or to try to remember whether it's doc or docs. +1 for versions.

Just a silly remark, but I lost one hour because of this: it turns out my Ubuntu has a default option that aliases /manual to the Apache manual... 8-)

sanette commented 4 years ago

I have narrowed the problem down. At the end of the make process, the script/relative_urls is applied to all files (which is a bit brutal, by the way). It is this script that scrambles the <pre> tag away...

You can try it directly

script/relative_urls foo.html

where foo.html contains a <pre><div...> as above
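
For readers unfamiliar with this kind of URL munging, here is a rough Python sketch of the general idea: turning site-absolute links into relative ones so that pages can be opened straight from disk. It is not a port of script/relative_urls; the function name `to_relative` and the convention that both arguments are paths from the site root are assumptions of this sketch.

```python
import posixpath


def to_relative(link: str, page: str) -> str:
    """Rewrite a site-absolute link so it resolves from `page` on disk.

    Both arguments are paths from the site root, for example
    "/css/main.css" and "/docs/index.html" (an assumed convention).
    """
    # Leave external, protocol-relative, and already-relative links alone.
    if not link.startswith("/") or link.startswith("//"):
        return link
    page_dir = posixpath.dirname(page)
    return posixpath.relpath(link, start=page_dir)
```

Note that this only needs to touch attribute values; it does not in itself require re-serializing the whole document, which is where the <pre> tag gets lost.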

sanette commented 4 years ago

So apparently the problem comes from the Nethtml module. Which library is this?

Chris00 commented 4 years ago

Yes, I came to the same conclusion. The script script/relative_urls.ml parses HTML for the local rendering so that links can be browsed on the disk. The HTML is parsed with Nethtml:

# #require "netstring";;
# let html = 
  let s = {|<h1>Documentation</h1>
<pre><div class="hello">Hello</div></pre>
Bye|} in
  Nethtml.parse_document (Lexing.from_string s);;
val html : Nethtml.document list =
  [Nethtml.Element ("h1", [], [Nethtml.Data "Documentation"]);
   Nethtml.Data "\n"; Nethtml.Element ("pre", [], []);
   Nethtml.Element ("div", [("class", "hello")], [Nethtml.Data "Hello"]);
   Nethtml.Data "\nBye"]

I believe Nethtml rewrites the HTML that way because <div> tags are not allowed inside <pre>.
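
To find which generated pages are affected, one could scan them for block-level elements nested inside <pre>. The following is a hedged Python sketch using only the standard library's `html.parser`; the `BLOCK_TAGS` set is a small, non-exhaustive list chosen for illustration, and the names `PreChecker` and `block_tags_in_pre` are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: a short, non-exhaustive list of block-level elements.
BLOCK_TAGS = {"div", "p", "ul", "ol", "table", "h1", "h2", "h3"}


class PreChecker(HTMLParser):
    """Record block-level start tags that appear inside a <pre> element."""

    def __init__(self):
        super().__init__()
        self.pre_depth = 0
        self.violations = []  # list of (tag, line number) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.pre_depth += 1
        elif self.pre_depth > 0 and tag in BLOCK_TAGS:
            self.violations.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag == "pre" and self.pre_depth > 0:
            self.pre_depth -= 1


def block_tags_in_pre(html: str):
    """Return the (tag, line) pairs for block elements found inside <pre>."""
    checker = PreChecker()
    checker.feed(html)
    checker.close()
    return checker.violations
```

Unlike Nethtml, `html.parser` does not restructure the document according to a DTD, so it reports the nesting as written.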

sanette commented 4 years ago

> <div> tags are not allowed inside <pre>

==> in this case we have to tell this to hevea. Since version 4.05, this is the way Hevea outputs ocaml code... :(

Chris00 commented 4 years ago

Issue opened. This is not hevea's fault.

sanette commented 4 years ago

thanks

sanette commented 4 years ago

In the meantime I'm asking lambdasoup to replace these divs by code, so I can continue working on the manual.
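
The workaround itself is done with lambdasoup in OCaml. Purely as an illustration of the transformation, here is a Python sketch with the standard library that rewrites every <div> found inside a <pre> into a <code> element and passes everything else through. The names `PreDivToCode` and `pre_div_to_code` are hypothetical, and the sketch assumes reasonably well-formed input.

```python
from html import escape
from html.parser import HTMLParser


def render_attrs(attrs):
    """Serialize (name, value) attribute pairs back into tag text."""
    parts = []
    for name, value in attrs:
        if value is None:
            parts.append(f" {name}")
        else:
            parts.append(f' {name}="{escape(value, quote=True)}"')
    return "".join(parts)


class PreDivToCode(HTMLParser):
    """Re-emit the input, turning <div> inside <pre> into <code>."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.pre_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.pre_depth += 1
        if tag == "div" and self.pre_depth > 0:
            self.out.append(f"<code{render_attrs(attrs)}>")
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "div" and self.pre_depth > 0:
            self.out.append("</code>")
        else:
            self.out.append(f"</{tag}>")
        if tag == "pre" and self.pre_depth > 0:
            self.pre_depth -= 1

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def pre_div_to_code(html: str) -> str:
    parser = PreDivToCode()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Because <code> is an inline element while <div> is a block, any CSS that targets the original class on a div may need a matching rule for code.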

sanette commented 4 years ago

Following our previous discussions, here is what I propose for the /docs page. [Screenshot: Screenshot_20191116_174911]

I will do a PR so that you can test and try by yourselves. But if you already have some comments, I can try to implement them. Of course it's not finished! At this point only the tutorial part links to the local version of the manual. The other parts link to pages on the Inria website. The 'tools' part is still missing, I will work on it when I can.

sanette commented 4 years ago

I'm not very satisfied with the distinction between "Core Library" and "Standard Library". Maybe we should put both in the same block. Btw in the manual itself it's not so clear: sometimes the Core Library (Stdlib) is called Standard Library, and sometimes all other modules are also called Standard Library.

sanette commented 4 years ago

I think this one is better: [Screenshot: Screenshot_20191117_122256]

sanette commented 4 years ago

you can check it out from my fork:

https://github.com/sanette/ocaml.org

agarwal commented 4 years ago

> the script/relative_urls is applied to all files (which is a bit brutal, by the way).

We should simply tell developers to run a dev server to view the site locally. A trivial python -m SimpleHTTPServer is all that is needed, and we could delete the URL munging.
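
On Python 3 the equivalent command is `python3 -m http.server`, since `SimpleHTTPServer` is the Python 2 module name. The same thing can be done from a script; the sketch below assumes Python 3.7 or later, and the helper name `start_dev_server` is hypothetical.

```python
import functools
import http.server
import threading


def start_dev_server(directory: str, port: int = 0):
    """Serve `directory` on 127.0.0.1 in a background thread.

    With port=0 the operating system picks a free port; the chosen
    port is available as server.server_address[1].
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Pointing it at the directory holding the generated site lets absolute links such as /manual resolve as they would in production, without rewriting any HTML. Call `server.shutdown()` when done.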

> it turns out my Ubuntu has a default option that aliases /manual to the Apache manual

This would also be resolved.

> I'm not very satisfied with the distinction between "Core Library" and "Standard Library".

The library that ships with OCaml is called the "Standard Library". "Core Library" is not the right name for it: that is an established name, and it refers to Jane Street's Core suite of libraries. I know the manual has a section called "The core library". The term isn't actually important to the library distributed with the compiler, and I personally think we should do away with the term "core" in that context. All of it is packaged in a module called Stdlib and we can call it all "The Standard Library".

xavierleroy commented 4 years ago

For what it's worth:

More generally: I would really like to see this effort (of hosting the OCaml manual on ocaml.org) reach completion. The future of the caml.inria.fr Web server is still unclear, and the manual deserves a more stable home.