haskell-unordered-containers / unordered-containers

Efficient hashing-based container types
BSD 3-Clause "New" or "Revised" License

Add external, beginner friendly docs #173

Open m-renaud opened 6 years ago

m-renaud commented 6 years ago

Hey, I recently wrote up some external docs for the containers package which appear to be well received (https://www.reddit.com/r/haskell/comments/7nvjr2/rfc_a_beginner_friendly_introduction_to_haskell/). I was wondering if you would accept a pull request for similar docs for this package. They would be in the same format as the containers docs and would live at haskell-unordered-containers.readthedocs.io.

Let me know what you think!

treeowl commented 6 years ago

I think "well-received" is a bit strong. Some have expressed strong support, some are very much opposed, and some (at least me) support the idea with significant reservations. I urge you to spend some more time working out how this idea can work well for one package before expanding to more. There is still much to be done!

m-renaud commented 6 years ago

That's a very reasonable request. Could you elaborate more on what your reservations are? Is it that it's not hosted on haskell.org or haskell-lang.org? That it uses ReST instead of haddocks? That the styling is different from the other Haskell resources?

I have a bunch of ideas around how the documentation story could be improved although I haven't had time to write them up in full yet. Eventually I would love to see every package with API docs (haddocks) at hackage.haskell.org/package/pkg-name, beginner friendly intros at docs.haskell.org/package/pkg-name, and tutorials at docs.haskell.org/tutorials/package/pkg-name.

I've decided to pursue this using ReadTheDocs because it has a really good "doc writer" experience, the layout is well organized, it's easy to change themes (to make the theme align with other Haskell docs), and it lets you serve the docs from your own domain (making it possible to serve the same docs from docs.haskell.org).

Slightly off topic for here, but a lot of the content in the wikibook seems more suited for this layout as well, and moving it to ReST (or Markdown) and putting it in version control would mean it could also be served alongside the other docs.

treeowl commented 6 years ago

ReST is not (yet) my favorite thing, but maybe I could learn to like it. My main experience with it has been muddling through the GHC proposal process. The big points that are still open in my mind:

  1. I don't really like to duplicate the basic documentation for a function because that increases the maintenance burden. Is there a way to make snippets of the Haddock documentation appear in the tutorial documentation semi-automatically? That is, in the section of the tutorial describing a function, can you magic in the relevant Haddocks somehow? I don't know much about this web stuff, so maybe it's impossible, but that seems pretty ugly.

  2. I would want there to be links both from the tutorial to the Haddocks and (when relevant) from the Haddocks to the tutorial. Links from the Haddocks to the tutorial would be relevant when the tutorial includes in-depth information or more examples than the Haddocks. That said, we should always be asking "Should this information be added directly to the Haddocks?"

  3. With links come complications. We'd ideally want to support all of the following:

    • Compile the tutorial and Haddocks for local use; make the links to each from the other point to the local copy.

    • Compile the tutorial for upload to relevant sites. Mostly, we're interested in ReadTheDocs, Hackage, and Stackage, but someone might want network-wide copies. These should be paired properly. Among other things, that probably means one set of ReadTheDocs tied to Hackage and another to Stackage. The user experience will be crummy if they follow a link from Stackage to ReadTheDocs and another "back" to Hackage. Alternatively, might there be some more dynamic way? I don't know much about the web stuff, but it might be interesting if following a link from one would "fill in" return links, allowing pairing on the fly.

    • Compile the Haddocks for local use, but link them to ReadTheDocs.

    • Compile the tutorial for local use, but link it to Hackage or Stackage.

m-renaud commented 6 years ago

  1. I don't really like to duplicate the basic documentation for a function because that increases the maintenance burden.

In general I tend to agree with you: if we were copy/pasting the haddocks verbatim into here without adding any additional content or explanations for the majority of the functions, then I think we would be doing something wrong. That being said, in the case of intro material, the functions presented are going to be those of the core API, which are unlikely to change very often.

I looked into how to pull in snippets but unfortunately I wasn't able to find anything promising. In any case, I'm not sure that we would want verbatim copies anyway; I somewhat expect the descriptions of the functions to vary between intro docs and API docs. For example, in intro docs there probably shouldn't be too much mention of strictness or complexity unless it's highly relevant to the operation (for example, calling out that nub is O(n^2) so you probably don't want to use it). In the API docs, on the other hand, these are very important things to call out and discuss.

What I would love to see is support in Haddock for breaking function docs down into summary, longer description, complexity, strictness, laws, and examples, and for Haddock to generate links/snippets of each of these components so they could be referenced in other places (like here). Then the "summary" part would correspond to what an absolute beginner could read to understand what the function does. In the introduction we would probably want to expand on that some more, but we could start with a verbatim include of it, and then you wouldn't have as much of an issue with drift.

  2. I would want there to be links both from the tutorial to the Haddocks and (when relevant) from the Haddocks to the tutorial.

Agreed (although the implementation of this ties into your 3rd concern which I'll address more directly below), at a minimum the package description and readme from the package should link to the beginner friendly introduction.

I'm not as convinced that linking from an individual function in the haddocks to the introduction would be useful because: a) if it's a simple/core function, the haddocks should have enough examples if you find yourself looking there; b) if it's a more obscure function, it shouldn't be in the introduction :)

Now, there's the idea of a "tutorial" which I think is the 3rd type of documentation that you want for a package/library, and that is where there would likely be long form examples using the more obscure functions from the API. This is where I think it would be useful to link out to from the Haddocks, but I think that's out of scope of the introductory docs, so I'd like to punt on that specific case for now.

  3. With links come complications.

Ah yes, you're definitely correct here :P I will say that this is an existing issue though, there are tonnes of places where Haskell docs live (Haskell wiki, Haskell Wikibook, School of Haskell, many more) that have links that only point to hackage (not stackage or wherever else the reader was linked from). So, this is a bigger issue that is hard to solve without all docs consistently being hosted in the same place using the same URI scheme.

Fortunately it's relatively straightforward (it requires a bit of code to be written once) to customize RST to choose between haddock destinations when the docs are being built. For example, if you use roles to link to docs you can have the configuration specify how to build the link (with a little bit of code). Say for instance you wanted to link to the binary package: you could use ":pkg:binary" in the docs, and the value of HADDOCK_HOST passed to the make html command would determine whether that generates a hackage link or a stackage link. I have a local change that does this (although I need to fix a few things since hackage and stackage use different uri schemes).

We could provide a set of primitives that Do The Right Thing (TM) for all the types of haddock links you may want: package root, specific module in a package, specific function within a module. This can then be published as an extension that other people wanting to write similar docs can simply add to their config: extensions = [haddock_links_extension].
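
A minimal sketch of what one such primitive could look like, covering only the package-root case. The function name, the host table, and the local directory layout are assumptions made for illustration; this is not the local change mentioned above, and module or function links would need per-host handling because the URI schemes differ.

```python
# Hypothetical link builder for package-root Haddock links.
# The names and the "local" directory layout are illustrative assumptions.
HADDOCK_BASES = {
    "hackage": "https://hackage.haskell.org/package/{pkg}",
    "stackage": "https://www.stackage.org/package/{pkg}",
}


def package_url(pkg, host="hackage", local_dir=None):
    """Return the package-root docs URL for the chosen host."""
    if host == "local":
        if not local_dir:
            raise ValueError("host 'local' needs local_dir")
        # Assumed layout: <local_dir>/<package>/index.html
        return local_dir.rstrip("/") + "/" + pkg + "/index.html"
    try:
        return HADDOCK_BASES[host].format(pkg=pkg)
    except KeyError:
        raise ValueError("unknown haddock host: " + host) from None
```

A reST role such as the ":pkg:" one described above could call a helper like this when the docs are built, so switching targets would only mean changing the host value.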

I'm pretty sure you could even have ReadTheDocs build all the versions with the appropriate links, and then link to the correct "version" from hackage or stackage. I looked a bit and unfortunately there isn't an easy way to generate the links on the fly since reST is mainly focused on static content generation. That being said, if you really wanted to, you could do this with JS by having the link generators above call out to a function which reads a URI param and does the link generation (then you could have a URL like haskell-containers.readthedocs.io/en/latest/?haddock_host=hackage, where all haddock links would point to hackage). I'm confident I could build that, but I don't think it's a blocker either: if we decided to do that in the future, we would just need to change the link generators and everything would switch over.

Going the other way isn't impossible either, you would just need to have a variable that haddock knows how to deal with to select the correct version of the introduction to link to throughout.

So, in summary, I think all of these are solvable problems with solutions that aren't completely hideous. Obviously this would all be easier if there was only one place where docs were hosted (I don't care where), but that's not the reality so we have to work with what we have :)

treeowl commented 6 years ago

Even with only one canonical place for documentation, there will always be people who want to build it locally so they can use it without Internet access. There are probably also people who want to build it locally so they can customize it.

m-renaud commented 6 years ago

there will always be people who want to build it locally so they can use it without Internet access

Absolutely, I already have a local copy that supports pointing to local haddocks :) All you need to do is run:

make -e HADDOCK_HOST=local HADDOCK_DIR=file:///path/to/haddocks/ html
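
For illustration, here is one way a Sphinx conf.py could pick those two values up. Variables given on the make command line are exported to the commands make runs, so the build can read them from the environment. The helper name and the Hackage default are assumptions, not necessarily how the local copy does it.

```python
import os


def haddock_settings(environ=os.environ):
    """Read the Haddock link target from the environment.

    Hypothetical helper: falls back to Hackage when HADDOCK_HOST is unset.
    """
    host = environ.get("HADDOCK_HOST", "hackage")
    local_dir = environ.get("HADDOCK_DIR")
    if host == "local" and not local_dir:
        raise ValueError("HADDOCK_HOST=local also needs HADDOCK_DIR")
    return {"host": host, "dir": local_dir}
```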

m-renaud commented 6 years ago

Thoughts on revisiting this?

m-renaud commented 6 years ago

Friendly ping :)

Another data point for having this type of tutorial, from https://pl-rants.net/posts/libraries-vs/

Not all hope is lost in Haskell land. The excellent containers package has very good introduction which is enough for probably most use cases one can imagine. It was unobtrusive and took only 10-15 minutes to code everything I needed.

If they decided to use unordered-containers they possibly would have never found the introduction.

The work to write this for unordered-containers would be pretty minimal since it shares a lot of functions with the containers package.

m-renaud commented 4 years ago

It's been over a year and a half now and this package still doesn't have any introductory tutorial. I don't see any harm in adding some docs to fill this hole. Can we please not let perfect be the enemy of the good here? I've done my best to address your concerns in my comments above; please let me know if you have any major objections to continuing on with this. Thanks!

treeowl commented 4 years ago

Okay, please feel free to proceed. And may the force be with you!

sjakobi commented 4 years ago

Now that #267 has been merged, we have beginner-friendly documentation in Data.HashSet! :)