Open LilithHafner opened 4 months ago
These are web server redirects and are not caused by Documenter:
$ curl -fsSI https://documenter.juliadocs.org/stable/man/guide
HTTP/2 301 <--
server: GitHub.com
content-type: text/html
location: https://documenter.juliadocs.org/stable/man/guide/ <--
Edit: This should have been posted to the Documenter issue (https://github.com/JuliaDocs/Documenter.jl/issues/2473).
It is caused by what files generated by Documenter are called and where they are stored.
The root cause is that Documenter.jl names every page index.html and stores it at a path like path/to/guide/index.html (e.g. https://github.com/LilithHafner/Chairmarks.jl/tree/gh-pages/v1.0.2), whereas the DocumenterVitepress.jl output format stores the same page as path/to/guide.html (e.g. https://github.com/LilithHafner/Chairmarks.jl/tree/gh-pages/v1.1.0).
I think the latter is a slightly better approach, but I don't know how to transition from the former to the latter without breaking almost all links.
I'm quite confused here.
https://chairmarks.lilithhafner.com/v1.0.2/explanations (which does not use DocumenterVitepress.jl) redirects to https://chairmarks.lilithhafner.com/v1.0.2/explanations/.
I always thought that these two URLs are equivalent, even without any "redirect" by the server, but in any case https://github.com/LuxDL/DocumenterVitepress.jl/issues/64#issuecomment-1979040966 indicates that GitHub normalizes the URL to have the slash at the end. If that's the case, it makes sense for Documenter to take that into account and always produce links with a trailing slash.
The link https://chairmarks.lilithhafner.com/v1.1.0/tutorial/explanations/, which uses DocumenterVitepress.jl gives a 404 error.
I think that's just because "explanations" doesn't exist, right? I don't know what happened between 1.0.2 and 1.1.0, but this doesn't seem related to whether DocumenterVitepress produces slashes at the end of URLs.
The page that does exist is https://chairmarks.lilithhafner.com/v1.1.0/tutorial.html
As far as I can tell, this is what vanilla Documenter would have produced with prettyurls=false, right?
Does DocumenterVitepress have an option equivalent to prettyurls?
I very much think prettyurls are preferable. Since DocumenterVitepress is a fresh start, I would actually recommend only supporting the prettyurls=true behavior. That is, I think DocumenterVitepress should only produce index.html files in folders, never a page.html. I understand that means you have to run a webserver to view the documentation locally, but people should really get used to that idea. Having different behavior when building locally vs when deploying will inevitably blow up in somebody's face, and I think having the prettyurls behavior is better, as it allows putting page-local assets in the same folder as the index.html.
Unfortunately this is not something that Vitepress supports: https://vitepress.dev/guide/routing#generating-clean-url
I always thought that these two URLs are equivalent
Not quite - without the trailing slash, it's up to the server. With the trailing slash, it auto resolves to $path/index.html.
so one can link to e.g. https://chairmarks.lilithhafner.com/v1.1.0/tutorial and that works fine, but https://chairmarks.lilithhafner.com/v1.1.0/tutorial/ will error because there is no folder and no index.html.
I could manually generate a redirect page in tutorial/index.html, which goes to tutorial.html, which would probably work.
Vitepress doesn't actually support the exact structure of Documenter, but GitHub Pages allows /tutorial to resolve to /tutorial.html, so the links are semantically fine.
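To make the difference concrete, here is a toy model of the resolution rules described in this thread. It is only a sketch, not any real server's implementation: `resolve` and the `clean_urls` flag are hypothetical names, with `clean_urls=true` standing in for hosts that map /tutorial to /tutorial.html and `clean_urls=false` for servers that do not.

```julia
# Toy model of static-file URL resolution (an illustration, not a real server).
# `files` is the set of files in the site root, using "/" as the separator.
# Returns (status, target): target is the served file, or the redirect location for 301.
function resolve(path::AbstractString, files; clean_urls::Bool=false)
    p = String(lstrip(path, '/'))
    if endswith(path, "/") || isempty(p)
        # Directory-style URL: only the index.html inside that directory can match.
        candidate = p * "index.html"
        return candidate in files ? (200, candidate) : (404, "")
    end
    # Exact file match, e.g. /tutorial.html
    p in files && return (200, p)
    # Everything below is server-specific behaviour.
    if clean_urls && (p * ".html") in files
        return (200, p * ".html")
    end
    if (p * "/index.html") in files
        # Redirect the no-slash URL to the directory URL.
        return (301, "/" * p * "/")
    end
    return (404, "")
end

documenter_site = Set(["index.html", "tutorial/index.html"])
vitepress_site = Set(["index.html", "tutorial.html"])

resolve("/tutorial/", documenter_site)                  # directory URL finds tutorial/index.html
resolve("/tutorial/", vitepress_site; clean_urls=true)  # no folder, so this is the 404 case
```

In this model the folder layout answers both /tutorial and /tutorial/ regardless of the flag, while the page.html layout answers /tutorial only when the server opts in and never answers /tutorial/.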
A lot of the work here was for the markdown backend in any case, so it should be fairly easy to switch to another backend or upstream to DocumenterMarkdown at some point after we have this working well and understand how to better support other static site generators.
Oh, that's quite interesting! Thanks for that explanation!
I still think you should change the behavior to always write only tutorial/index.html because that allows the index.html to reference, e.g., a plot.png file in the same folder. So that gives you a lot more flexibility for processing non-trivial sources. Plus, you're guaranteed that the "pretty" URLs work, independent of the server configuration. But, you know, whatever works for you :-)
I don't know if this is a bug: In https://github.com/LilithHafner/Chairmarks.jl/tree/main, I don't see that the cleanUrls option is set anywhere. Yet, the links in the navigation bar on https://chairmarks.lilithhafner.com/v1.1.0/tutorial.html are all to https://chairmarks.lilithhafner.com/v1.1.0/why etc. (without the .html), which wouldn't work if it wasn't hosted on the right server.
All of the .vitepress files are copied from https://github.com/DocumenterVitepress.jl/tree/main/template, so if the user does not explicitly override one by supplying their own file, the default file is used (which does set cleanUrls).
In the interim, I used this script to set up 200 redirects (pages served with HTTP 200 that redirect via meta refresh) from https://chairmarks.lilithhafner.com/v1.1.0/tutorial/ to https://chairmarks.lilithhafner.com/v1.1.0/tutorial.
function fix(root_url, root_path=".")
    for (root, dirs, files) in walkdir(root_path)
        for file in files
            name, ext = splitext(file)
            # Skip the 404 page and existing index pages.
            if ext == ".html" && name ∉ ("404", "index")
                dir = joinpath(root, name)
                if !isdir(dir)
                    mkdir(dir)
                    # Target URL without the trailing slash.
                    url = "https://"*normpath(joinpath(root_url, root, name))
                    # Write a stub at name/index.html that redirects to the target.
                    open(joinpath(dir, "index.html"), "w") do io
                        write(io, """
                        <!DOCTYPE html>
                        <meta charset="utf-8">
                        <title>Redirecting to $url</title>
                        <meta http-equiv="refresh" content="0; URL=$url">
                        <link rel="canonical" href="$url">""")
                    end
                end
            end
        end
    end
end
fix("chairmarks.lilithhafner.com")
I still think you should change the behavior to always write only tutorial/index.html because that allows the index.html to reference, e.g., a plot.png file in the same folder. So that gives you a lot more flexibility for processing non-trivial sources.
No, that does not give any additional flexibility. Assets stored in the source directory like this:
docs
└── src
    ├── asset.png
    └── page.md
Build to this using Documenter:
_build
├── asset.png
└── page
    └── index.html
(e.g. https://github.com/JuliaDocs/Documenter.jl/blob/gh-pages/v1.3.0/man/hosting/walkthrough/index.html)
Building to a directory structure that matches the source directory structure will not introduce name conflicts.
_build_vitepress
├── asset.png
└── page.html
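The two layouts above can be summarized as a mapping from source paths to output paths. The following is a simplified sketch: `output_path` and its `pretty` keyword are hypothetical names for illustration, not part of Documenter or DocumenterVitepress, and the model ignores everything except the file name.

```julia
# Hypothetical helper: where a file from docs/src ends up in the build directory.
# pretty=true  models the folder layout (page/index.html),
# pretty=false models the flat layout (page.html).
function output_path(src::AbstractString; pretty::Bool=true)
    name, ext = splitext(src)
    # Anything that is not a Markdown page is treated as an asset and keeps its path.
    ext == ".md" || return String(src)
    # A page that is already called index.md maps to index.html in both layouts.
    if !pretty || basename(name) == "index"
        return name * ".html"
    end
    return name * "/index.html"
end

output_path("page.md")                # folder layout
output_path("page.md"; pretty=false)  # flat layout
output_path("asset.png")              # asset path is the same in both layouts
```

In both layouts asset.png stays at the build root, so a relative reference written next to page.md has to be rewritten only in the folder layout, where the page moves one level down.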
Plus, you're guaranteed that the "pretty" URLs work, independent of the server configuration.
Vitepress could (should?) add symlinks from /tutorial to /tutorial.html to also support servers that don't do that resolution automatically.
There's no substantive technical reason to prefer https://chairmarks.lilithhafner.com/v1.1.0/tutorial vs https://chairmarks.lilithhafner.com/v1.1.0/tutorial/ vs https://chairmarks.lilithhafner.com/v1.1.0/tutorial.html. I agree with Vitepress's style choice to use https://chairmarks.lilithhafner.com/v1.1.0/tutorial for the reasons I gave in https://github.com/JuliaDocs/Documenter.jl/issues/2473#issue-2169490365.
GitHub servers happen to redirect 404s at path/to/file to path/to/file/index.html and not the other way around, but I don't think that is particularly important.
No that does not give any additional flexibility
I meant that in general. I routinely use setups where generated assets get put in the same folder as the index.html, e.g., on my website (which runs on a handwritten generator) and in QuantumControlExamples.jl (via Literate.jl).
Assets stored in the source directory like this:
docs
└── src
    ├── asset.png
    └── page.md
That would be an asset shared across multiple pages. I wouldn't mind if we added support in Documenter for a structure like
docs
└── src
    ├── shared_asset.png
    ├── page.md
    └── page
        └── local_asset.png
which then maybe could be referenced with something like ![text](@__DIR__/asset.png). Or maybe it's good enough to stick to
docs
└── src
    ├── shared_asset.png
    └── page
        ├── index.md
        └── local_asset.png
which already works (and which I'm using in QuantumControlExamples).
Vitepress could (should?) add symlinks from /tutorial to /tutorial.html
Does that work? Do webservers follow symlinks in this way, ignoring the file extension?
There's no substantive technical reason to prefer https://chairmarks.lilithhafner.com/v1.1.0/tutorial vs https://chairmarks.lilithhafner.com/v1.1.0/tutorial/
If https://github.com/LuxDL/DocumenterVitepress.jl/issues/64#issuecomment-1979655394 is correct
Not quite - without the trailing slash, it's up to the server. With the trailing slash, it auto resolves to $path/index.html.
then the difference is that "the URL for a folder loads the index.html in that folder" is universal, whereas "the URL for a file without an extension loads that file with .html appended OR the equivalent folder, whichever is available" is not.
Unless the symlink solution works, it seems like DocumenterVitepress switching to folder/index.html is actually the only way (certainly the most robust way) to solve the issue "Trailing slash gives 404". This does not preclude a setting that then uses links like https://chairmarks.lilithhafner.com/v1.1.0/tutorial without a trailing slash in the sidebar etc. (if you can guarantee that the server hosting the docs supports that).
Documenter could have such an option as well, I'd be perfectly fine with that (but it shouldn't be the default, as it limits the servers that can host the docs). I have no problem with anyone preferring the non-slash URLs on an aesthetic basis.
Actually, it might be relevant to check if LiveServer.jl and python -m http.server can handle URLs without slashes. If not, that would make local preview quite difficult.
P.S.: Just tried LiveServer and python -m http.server, and they both can handle forwarding https://chairmarks.lilithhafner.com/v1.1.0/tutorial to https://chairmarks.lilithhafner.com/v1.1.0/tutorial/, but not to https://chairmarks.lilithhafner.com/v1.1.0/tutorial.html. So Documenter would actually be fine if a "no-slash" option were to be added in combination with prettyurls=true (but not with prettyurls=false). DocumenterVitepress seems like it's pretty difficult to preview locally, as none of the recommended local servers implement its default URL scheme.
Let's try to keep this issue focused on the fact that adding a trailing slash gives a 404 error in Vitepress.
If DocumenterVitepress's links are broken on some webservers (including local servers), I imagine that's something the authors of this package would love to hear about but I request you open a new issue for that.
This issue only affects folks transitioning from default Documenter.jl, and only affects them in the transition period while extant links still point to the old URLs. That said, many DocumenterVitepress/Documenter.jl users will be transitioning from default Documenter.jl, and the transition period has an unbounded length.
@asinghvi17 suggested a solution here
I could manually generate a redirect page in tutorial/index.html, which goes to tutorial.html, which would probably work.
And I implemented and deployed it on Chairmarks here.
This is a hacky solution. Would you welcome a PR that adds a redirect_trailing_slash configuration option that can be true, false, or :auto (default), where :auto detects whether any previous builds using trailing-slash links exist and, if so (or if the option is true), runs the hack from here to add 200 redirects?
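A rough sketch of how such an option could be evaluated is below. Everything in it is an assumption for illustration: `redirect_trailing_slash` exists only as a proposal in this thread, `has_pretty_urls` and `should_redirect` are hypothetical names, and the detection heuristic (a previous build counts as "using trailing-slash links" if it has an index.html anywhere below its root) is just one possible choice.

```julia
# Heuristic (an assumption): a build used folder-style pages if any
# subdirectory of it contains an index.html.
function has_pretty_urls(build_dir::AbstractString)
    for (root, _, files) in walkdir(build_dir)
        # The top-level index.html exists in both layouts, so ignore the root itself.
        if root != build_dir && "index.html" in files
            return true
        end
    end
    return false
end

# `option` is true, false, or :auto; `previous_builds` is a list of directories
# holding earlier deployed versions (for example, checked out from gh-pages).
function should_redirect(option, previous_builds)
    option === :auto || return option::Bool
    return any(has_pretty_urls, previous_builds)
end
```

Note that once redirect stubs have been written, later builds would also match this heuristic, which keeps :auto switched on for subsequent versions.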
@goerz: I'm not sure what benefit that folder structure provides aside from avoiding namespacing issues? The asset wouldn't be loaded unless the page calls for it, in any case, and Vitepress tends to inline any included images as opposed to including them in the output. This doesn't seem to significantly impact load time (see https://beautiful.makie.org for an example).
@LilithHafner: Yes that would be great! It seems to work for you already :) but how would you do the detection? One could manually check the deployurl that's given in the settings I suppose, or check the gh-pages branch (which seems like it would be pretty slow...)
how would you do the detection?
lol, idk. I'll think about that. I only really care about trailing slash handling on the deployed docs, and when deploying we need access to gh-pages anyway. OTOH, it's good (necessary) to build the exact same docs locally and hosted, because otherwise what is even the point of local builds?
Also, my current hack discards fragments. I'll look into that, too.
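One possible way to keep fragments is to let a small script do the redirect and fall back to meta refresh only when JavaScript is disabled. This is a sketch under assumptions, not what is deployed on Chairmarks: `redirect_page` is a hypothetical helper, and it assumes `url` contains no double quotes or other characters that would need escaping in HTML or JavaScript.

```julia
# Build the contents of a redirect stub that forwards the query string and
# the fragment (e.g. #section) to the target URL.
function redirect_page(url::AbstractString)
    return """
    <!DOCTYPE html>
    <meta charset="utf-8">
    <title>Redirecting to $url</title>
    <link rel="canonical" href="$url">
    <script>location.replace("$url" + location.search + location.hash)</script>
    <noscript><meta http-equiv="refresh" content="0; URL=$url"></noscript>
    """
end

page = redirect_page("https://chairmarks.lilithhafner.com/v1.1.0/tutorial")
```

The string returned here could be written to tutorial/index.html in place of the meta-refresh-only stub; the noscript fallback still discards fragments.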
@goerz: I'm not sure what benefit that folder structure provides aside from avoiding namespacing issues?
It allows you to keep the exact URL scheme you have now (without the trailing slashes), but without requiring any hacks or server features. If you render tutorial.md into tutorial/index.html, then both https://chairmarks.lilithhafner.com/v1.1.0/tutorial and https://chairmarks.lilithhafner.com/v1.1.0/tutorial/ work with any server. Thus, it seems like the most elegant way of solving the "404 issue" this issue is about. Nothing would change compared to the current DocumenterVitepress experience or existing URLs: you can use the preferred URLs without the slash in the sidebar etc., but if someone follows an old URL from vanilla-Documenter with the slash, that'll also work without any kind of hack.
This issue only affects folks transitioning from default Documenter.jl,
That is a very good point! Even if someone implemented a PR for https://github.com/JuliaDocs/Documenter.jl/issues/2473 to make Documenter prefer URLs without a slash, that doesn't change existing pages, so that's going to be a problem for any project transitioning to DocumenterVitepress. The proposed workaround is to generate a structure like
⁞
├── tutorial
│   └── index.html
├── tutorial.html
⁞
where the index.html redirects to tutorial.html. That should work, but it feels quite ugly, and to the best of my understanding, having only tutorial/index.html would have the exact same effect without requiring any redirects.
I was also worried about whether the site can be previewed locally using LiveServer or python -m http.server. Strangely, that seemed to work for the most part when I tested it just now (with DocumenterVitepress's current system). I don't quite understand why; in earlier testing it seemed like the local servers couldn't translate tutorial into tutorial.html. Maybe there's some JS magic in the background? Anyway, for whatever solution you end up with, making sure that it works for preview seems like an important consideration.
Vitepress tends to inline any included images as opposed to including them in the output. This doesn't seem to significantly impact load time (see https://beautiful.makie.org for an example).
Huh. I'm surprised it doesn't affect load time. I would have expected this to cause pretty huge .html documents that most browsers would be less efficient at loading, and that also might hurt SEO. But this is a total tangent, and we should probably keep this thread more focused :-)
It allows you to keep the exact URL scheme you have now
Almost, but the URLs without trailing slashes 301-redirect to their trailing-slash alternatives.
$ curl https://chairmarks.lilithhafner.com/v1.0.2/tutorial
<html>
<head><title>301 Moved Permanently</title></head>
<body>
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx</center>
</body>
</html>
$ curl https://chairmarks.lilithhafner.com/v1.1.0/tutorial
<!DOCTYPE html>
<html lang="en-US" dir="ltr">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1">
<title>Tutorial | Chairmarks.jl</title>
<meta name="description" content="A VitePress Site">
...
(Chairmarks v1.0.2 and below use Documenter.jl without Vitepress, and v1.1.0 and above use Documenter.jl and DocumenterVitepress.jl)
https://chairmarks.lilithhafner.com/v1.0.2/explanations (which does not use DocumenterVitepress.jl) redirects to https://chairmarks.lilithhafner.com/v1.0.2/explanations/.
The link https://chairmarks.lilithhafner.com/v1.1.0/explanations/, which uses DocumenterVitepress.jl gives a 404 error.
This caused all https://chairmarks.lilithhafner.com/stable/.../ links I and others previously posted to break when switching to DocumenterVitepress.jl. I agree with the DocumenterVitepress.jl choice to prefer not using trailing slashes, but URLs with trailing slashes should redirect, not 404.