JuliaComputing / JuliaHub-Feedback

Public repo for filing JuliaHub issues

Documentation Re-hosting Issues and Process for 2024 #161

Open Dattax opened 8 months ago

Dattax commented 8 months ago

Summary:

JuliaHub.com builds and hosts package documentation, and we’ve been attempting to do that for every open source package in the General registry (unless opted out). The hope is to have useful documentation available on docs.juliahub.com for every package with a unified search and discovery experience that everyone can enjoy. But our execution hasn’t been perfect — we’ve heard feedback from package developers and others on multiple issues and are dedicating time and effort to improve things.

There are three concrete things we are aiming to improve upon:

1. Better detection of default hosting locations and alternate build systems

Currently, depending on the package, the documentation that is on juliahub.com is generated by:

But this isn’t great because the package authors (especially for more mature packages) have spent a lot of effort curating and building their self-hosted documentation. Paradoxically, the more thorough and therefore complex the documentation is, the more likely it is to fail to build automatically — and requiring a manual opt-out process is tedious and frustrating.

To improve the situation here, we are planning to make JuliaHub smarter about inferring whether the package has self-hosted docs or not. We are planning to trial the following heuristics:

We would also like to push the ecosystem to adopt standard Project.toml metadata, which would include a way to specify where external tools (including, but not limited to JuliaHub; editor tooling can also benefit from that kind of metadata) can find the documentation (https://github.com/JuliaLang/Pkg.jl/issues/1070).

We will not attempt to auto-generate or host documentation of packages for which the above heuristics succeed.

2. Mitigating “broken” docs

When documentation generation has failed due to one reason or another, JuliaHub serves a fallback with the README & docstrings. We hope that getting better at identifying external hosting will prevent many cases of broken docs in the first place, but in the event that we fail, we want to make what’s happening clearer to both package devs and users.

We are planning on adding a top bar to indicate to the docs.juliahub.com visitor that they are reading automatically generated (fallback) documentation, and that they should also check the package’s GitHub repository to see if there are other docs available.

3. Preventing old and outdated docs from ranking highly in search engines

We hope this one is largely resolved! We have pushed out updates over the past few months that should force search engines to de-index old versions. Please report this to us if this is still an issue you encounter.

odow commented 8 months ago

Thanks for writing this up. Agree with trying to redirect where possible but agree it's a tricky heuristic to get right.

“We are planning on adding a top bar to indicate to the docs.juliahub.com visitor that they are reading automatically generated (fallback) documentation”

I think this has to be the top priority. Here are two examples from projects I'm involved in. There's nothing to indicate that these pages are a product of JuliaHub and not the first-party authors.

https://docs.juliahub.com/General/SDDP/stable/
https://docs.juliahub.com/General/Juniper/stable/

“and that they should also check the package’s GitHub repository to see if there are other docs available.”

Yes! A link to the original repo is very necessary.

Here's an example of a (now unmaintained) package of mine. There's nothing to link back (because why would you put a link to the GitHub in the GitHub repo :smile:)

https://docs.juliahub.com/General/MathOptFormat/stable/

DanielVandH commented 2 months ago

Has there been much development on this issue? I would agree with @odow that

“We are planning on adding a top bar to indicate to the docs.juliahub.com visitor that they are reading automatically generated (fallback) documentation, and that they should also check the package’s GitHub repository to see if there are other docs available.”

is extremely important. Especially now that package mentions of the form Pkg.jl autolink to JuliaHub on Zulip and Discourse, making this clear is crucial. A new user clicking one of these links and going to the package's documentation, only to see an unorganised collection of docstrings and a README with no mention at all that this is not the real documentation, is not a good look and makes the package look bad. It is not at all clear that the linked page is not the package's own documentation but just a version of it autogenerated by JuliaHub.

As an example, I clicked on a DataFrames.jl link that was generated on Discourse, went to the documentation, and was really confused at how bad it looked before remembering that it was the autogenerated version.

j-fu commented 2 months ago

I registered my packages with DocumentationGeneratorRegistry, but I see no effect on the presentation on JuliaHub. How is this supposed to work? Do I have to take further steps?

EDIT: the link is updated after releasing a new package version. I guess this should be documented somewhere.

DanielVandH commented 2 months ago

Bump @Dattax. Just ran into this again, and it's still frustrating: I click the docs link, see that the page is autogenerated, and then have to go back to the source to find the manual docs anyway, since there is no point trying to read the autogenerated version. That is on top of the other issues mentioned.

Dattax commented 2 months ago

We do have someone working on some of this right now - I will ping the team to see what the status is.

Dattax commented 2 months ago

So the team tells me that in the next 3 weeks we will release the following fixes:

  1. Remove the [Documentation] link to rehosted docs from the package page immediately for packages in the General registry
  2. Remove the ability for Google to index docs on JuliaHub (noindex tag on all generated docs)
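For package authors who want to check whether a rehosted page carries the noindex directive, a minimal Python sketch follows. It assumes the directive is delivered as a `<meta name="robots">` tag in the page HTML; the thread does not say how JuliaHub applies it, and the class and function names here are illustrative.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """Return True if the HTML asks search engines not to index the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))
```

This only inspects the HTML body. A site can also send the same directive in an `X-Robots-Tag` HTTP response header, which this sketch does not look at.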

We have a few other things we are discussing internally, but these are the updates for now.

j-fu commented 1 month ago

Hi, how is this going?

Dattax commented 1 month ago

Hi, we have updated items 1 and 2 above and are waiting for our 6.7 release, which has been delayed a bit but is expected very soon. Stay tuned...