Open daveverwer opened 9 months ago
Just to write up some progress on this. I started a branch that removes the “no-reference” redirects and attempts to host documentation from those URLs. So, for example:
/SwiftPackageIndex/SemanticVersion/documentation/semanticversion
Hosts the documentation from the files in S3 at:
s3://{bucket}/swiftpackageindex/semanticversion/0.4.0/
Unfortunately, the hosting-base-path that we specify when we build documentation inserts a hard-coded /0.4.0/ reference into all of the generated files.
We need to generate a new set of documentation for the default documentation set, with a hosting-base-path that does not include a reference, and store it in a “special” latest (or similar) directory in S3.
The issue here is that we don’t know what the “latest” version is at build time. It could be a default branch version, a pre-release, or a stable release.
Progress is in the stable-url-doc-hosting branch.
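To make the mapping above concrete, here is a minimal sketch of how a reference-less documentation URL could resolve to an S3 key prefix. The function names are hypothetical, the bucket name is left out, and the "current" reference is passed in by hand because, as noted above, it isn't known at build time.

```python
def s3_prefix(owner: str, repo: str, reference: str) -> str:
    """Key prefix of a doc set; owner and repo are lowercased as in the example above."""
    return f"{owner.lower()}/{repo.lower()}/{reference}/"


def resolve_stable_url(path: str, current_reference: str) -> str:
    """Map /{owner}/{repo}/documentation/... to the prefix of the current doc set."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 3 or parts[2] != "documentation":
        raise ValueError(f"not a reference-less documentation URL: {path}")
    owner, repo = parts[0], parts[1]
    return s3_prefix(owner, repo, current_reference)


print(resolve_stable_url(
    "/SwiftPackageIndex/SemanticVersion/documentation/semanticversion", "0.4.0"))
```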
As discussed, I've had a look into how many packages opt-in to generate docs but don't have any releases and the number is quite small: 14.
However, there's a different, significantly larger set of packages that would currently be affected if we only ask Google to index packages with release docs: packages that generate docs but have not had a release since they opted into doc generation. They also don't have release docs. There are 95 of those, which is ~15% of all packages requesting docs (95 of 629).
While not ideal, I think as a first stab it'd be OK if we didn't support having those indexed by Google off the bat, since it's going to be quite tricky to do so. They're no worse off than they are currently, where we don't have them indexed by Google on any version, and the remedy is actually quite easy: simply tag a release.
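For reference, the shares work out as follows. This is just the arithmetic on the counts quoted above; the variable names are mine.

```python
TOTAL_REQUESTING_DOCS = 629    # packages requesting docs
NO_RELEASES_AT_ALL = 14        # generate docs, no releases whatsoever
NO_RELEASE_SINCE_OPT_IN = 95   # generate docs, no release since opting in


def share(count: int, total: int) -> str:
    """Format count/total as a percentage with one decimal place."""
    return f"{count / total:.1%}"


print(share(NO_RELEASES_AT_ALL, TOTAL_REQUESTING_DOCS))
print(share(NO_RELEASE_SINCE_OPT_IN, TOTAL_REQUESTING_DOCS))
```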
Queries:
-- packages that generate docs on def branch and have no releases whatsoever
select p.url from
packages p join (
select v.package_id
from versions v
where v.package_id in (
-- has docs on default branch
select distinct v.package_id
from versions v
where
v.spi_manifest::text like '%documentation_targets%'
and latest is not null
)
group by v.package_id
having count(*) = 1
) t on p.id = t.package_id
order by p.url
-- packages that generate docs but have no latest release docs
select distinct p.url
from packages p
join versions v on v.package_id = p.id
where v.spi_manifest::text like '%documentation_targets%'
and latest is not null
group by p.url
having count(*) = 1
order by p.url
packages that generate docs but have not had a release since they opted-into doc generation
We could kick off re-builds of these for the latest stable release.
Unfortunately that won't work, because they're not opted into doc generation on those old tags. Only a new release would actually have an .spi.yml file with the doc targets set.
Ah, of course. That's a shame.
I've thought of a pretty simple way for us to determine from within the builder itself whether a default branch build with docs should generate the docs for "latest release" or not. We can just list the tags and run them through SemanticVersion, just like we do in analysis, and if there are none, the branch build is a "latest release" doc set.
It's perhaps not 100% ideal in that the data doesn't come from the server but then again the source of truth is actually the repository we've checked out, so we're definitely looking in the right place.
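A rough sketch of that check, assuming the tag list has already been read from the checkout. The regex is a simplified stand-in for the SemanticVersion parser, and accepting an optional "v" prefix is my assumption; the function names are hypothetical.

```python
import re

# Simplified semantic-version pattern (an assumption, not the real parser):
# optional "v" prefix, MAJOR.MINOR.PATCH, optional pre-release and build metadata.
SEMVER = re.compile(
    r"^v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+([0-9A-Za-z.-]+))?$"
)


def version_tags(tags):
    """Return only the tags that parse as semantic versions."""
    return [t for t in tags if SEMVER.match(t)]


def branch_build_is_release_docs(tags):
    """True if no tag parses as a version, so the default branch build
    doubles as the "latest release" doc set."""
    return len(version_tags(tags)) == 0


print(branch_build_is_release_docs(["nightly", "docs-test"]))
print(branch_build_is_release_docs(["0.4.0", "nightly"]))
```

Note that a pre-release tag such as 1.0.0-beta.1 counts as a version here; whether it should is a separate decision.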
Yes that would work.
One thing we should consider sending back with the API call and storing in our database is whether the build generated a latest version. We will need to know definitively whether to add a nofollow and do a redirect, or a direct link to the latest documentation.
We should also not use the word latest as the directory name in the AWS bucket. That's a perfectly valid name for a default branch, and I can see projects using that as a branch name. Maybe !!!default or ___default or !!!latest or ___latest or something like that? Still all valid branch names, but the chance of a collision is much lower.
From https://git-scm.com/docs/git-check-ref-format
- They cannot have ASCII control characters (i.e. bytes whose values are lower than \040, or \177 DEL), space, tilde ~, caret ^, or colon : anywhere.
So we could do ~latest to be safe
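A small sketch that checks candidate directory names against the one rule quoted above. It deliberately covers only that rule, not the rest of git-check-ref-format, and the function name is hypothetical.

```python
# Characters forbidden anywhere in a ref name by the rule quoted above.
FORBIDDEN = set(" ~^:")


def violates_quoted_rule(name: str) -> bool:
    """True if name contains an ASCII control character, DEL, space, ~, ^ or :."""
    for ch in name:
        code = ord(ch)
        if code < 0o40 or code == 0o177 or ch in FORBIDDEN:
            return True
    return False


for candidate in ["latest", "!!!default", "___latest", "~latest"]:
    verdict = "cannot be a ref name" if violates_quoted_rule(candidate) else "passes this rule"
    print(f"{candidate}: {verdict}")
```

A name that fails the check can never collide with a branch or tag, which is the point of the ~ prefix.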
I've looked into this a bit this morning on the builder side and there are a few tricky parts we need to consider:
The builder part is going to be fiddly, because we need to essentially run doc generation twice for tags. The complication is that we need to interleave this with uploading and doc reporting. We also need to figure out on which doc gen to report back (probably just the existing reference one).
Running two doc builds will increase the run time and make us move closer to our 10min build time budget. I've had a look and the slowest total build duration we've currently recorded is 6mins. That is probably OK (most of that time is probably the build itself and doubling the doc gen will likely be OK).
Normally, it would be a problem to only look at the "survivors" (i.e. those that managed to report back within the 10min deadline) but in this case those that take longer are already out of scope, so making it take even longer is not a problem. However, given that we're planning some infra changes we may actually pull them (if there are any, it's hard to tell) back into the 10min window and then we'd potentially knock them out again. Perhaps not a huge issue but just something to be aware of.
My biggest concern was the linkable-paths.json file that we report back. I was worried that it might contain references to the reference we're building. If that was the case we would have to report the one with the real reference (i.e. say 1.2.3 vs ~latest), because otherwise all past release versions would have the same references in the file. Luckily the file doesn't contain any reference to reference so we're OK here. (It might also not be a problem in the first place due to the way we're using linkable-paths.json - it would be used to point to the ~latest anyway.)
However, just generally, we have to be careful in deciding for which tag doc set we report back. While they're equivalent in terms of doc content they will contain different variants and we're going to save these in the Version object.
Overall this is quite the change to how we're generating docs and I wish there was an alternative way to duplicate a doc set that we could simply tack on to the existing process.
It's really unfortunate that the doc archives have structural components embedded that don't make them re-hostable. I don't know if this is fundamentally impossible to change but I wonder if it'd be worth at least bringing up with the docc folks. Maybe there's an upstream change possible such that the doc gen complexities on our end could at least be only temporary if not outright avoided.
Another route worth exploring: Right now we generate docs as follows:
swift package generate-documentation
xcodebuild docbuild
docc process-archive
It is my understanding that both processes essentially call out to docc under the hood.
I'm pretty sure we could also generate docs as follows in case of SwiftPM based builds:
docc convert
docc process-archive
The advantage would be that both now have the same second stage, docc process-archive, and I believe that's the stage that actually takes the hosting base path parameter. In fact we know this, because xcodebuild docbuild doesn't take one. The info in the doc archive after xcodebuild docbuild must be free of base path parameters.
If docc convert is equivalent to xcodebuild docbuild and both produce the same input archive for docc process-archive, we could make this whole process easier.
Both paths of doc generation would generate a doc archive and then we run two passes of docc process-archive to write out two different doc sets with adjusted base paths. It would save us having to re-run doc generation for each target and do all the merging.
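As a sketch of what the two passes could look like, the helper below only builds the command lines and does not run them. The subcommand and flags are the ones from the xcrun docc process-archive invocation shown later in this thread; the archive name, the helper names and the ~release directory name are assumptions for illustration.

```python
def process_archive_command(archive, output, base_path):
    """Build one docc process-archive invocation (flags as used later in this thread)."""
    return [
        "xcrun", "docc", "process-archive", "transform-for-static-hosting",
        archive,
        "--output-path", output,
        "--hosting-base-path", base_path,
    ]


def two_pass_commands(archive, docs_root, owner, repo, reference, stable_name):
    """One pass for the real reference, one for the stable directory name."""
    commands = []
    for name in (reference, stable_name):
        base_path = f"{owner}/{repo}/{name}"
        commands.append(
            process_archive_command(archive, f"{docs_root}/{base_path}", base_path))
    return commands


# "SemanticVersion.doccarchive" is a hypothetical input archive name.
for cmd in two_pass_commands("SemanticVersion.doccarchive", "checkout/.docs",
                             "swiftpackageindex", "semanticversion",
                             "0.4.0", "~release"):
    print(" ".join(cmd))
```

Whether the output of the second stage is usable for our archives is a separate question; this only illustrates the shape of the process.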
The downside of this process is that we might be diverging from how users generate docs and therefore make it harder to compare results in case there are problems (probably not a huge downside tbh).
It still feels like we're fighting a downstream problem that could perhaps be better addressed upstream in docc. For example, I've generated docs for the same package twice purely with different base paths. The only difference in the output was the base paths in the index.html files. All other files were the same (ignoring JSON key order randomness).
If somehow instead of taking full base paths the index.html files were based on a configurable variable, we would be able to drive any hosted archive off of any base path we choose either by injecting it at generation time and duplicating it or, ideally, dynamically at runtime by injecting a base path parameter when we fetch the docs. (I'm not going to suggest rewriting the index.html files on the fly 😬) I'm not sure how routing works in the JS app but maybe it'd be flexible enough to read the base path from a single location and augment all paths with it?
Pinging @ethan-kusters and @franklinsch et al - is this something worth discussing?
docc process-archive does something with a doc archive generated by our builder but it's not a result we would be able to rely on. It seems to be rewriting urls but it's also trimming most of the link and script tag content. It also doesn't rewrite the top level index.html file in documentation.
For reference, I ended up running
xcrun docc process-archive transform-for-static-hosting checkout/.docs/swiftpackageindex/semanticversion/0.4.0 --output-path checkout/.docs/swiftpackageindex/semanticversion/~release --hosting-base-path swiftpackageindex/semanticversion/~release
on an archive I created via
swift run builder generate-docs -s 5.9 -p macos-spm -c https://github.com/SwiftPackageIndex/SemanticVersion.git -r 0.4.0 --targets SemanticVersion -f
and the diff of one of the index.html files looked as follows (after running each file through tidy for easier diffing):
11c11
< "/swiftpackageindex/semanticversion/0.4.0/favicon.ico">
---
> "/swiftpackageindex/semanticversion/~release/favicon.ico">
13c13
< "/swiftpackageindex/semanticversion/0.4.0/favicon.svg" color=
---
> "/swiftpackageindex/semanticversion/~release/favicon.svg" color=
18c18
< var baseUrl = "/swiftpackageindex/semanticversion/0.4.0/"
---
> var baseUrl = "/swiftpackageindex/semanticversion/~release/"
19a20,27
> <script defer="defer" src=
> "/swiftpackageindex/semanticversion/~release/js/chunk-vendors.bdb7cbba.js"
> type="text/javascript">
> </script>
> <script defer="defer" src=
> "/swiftpackageindex/semanticversion/~release/js/index.2871ffbd.js"
> type="text/javascript">
> </script>
21,135c29
< "/swiftpackageindex/semanticversion/0.4.0/css/chunk-c0335d80.10a2f091.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/documentation-topic.1d1eec04.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/documentation-topic~topic.b6287bcf.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/documentation-topic~topic~tutorials-overview.d6f5411c.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/topic.d8c126f3.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/tutorials-overview.c249c765.css"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/chunk-2d0d3105.cd72cc8e.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/chunk-c0335d80.76a68cc5.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/documentation-topic.57e91f8a.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/documentation-topic~topic.1679ec90.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/documentation-topic~topic~tutorials-overview.90c61522.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-bash.1b52852f.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-c.d1db3f17.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-cpp.eaddddbe.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-css.75eab1fe.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-custom-markdown.7cffc4b3.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-custom-swift.5cda5c20.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-diff.62d66733.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-http.163e45b6.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-java.8326d9d8.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-javascript.acb8a8eb.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-json.471128d2.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-llvm.6100b125.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-markdown.90077643.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-objectivec.bcdf5156.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-perl.757d7b6f.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-php.cc8d6c27.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-python.c214ed92.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-ruby.f889d392.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-scss.62ee18da.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-shell.dd7f411f.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-swift.84f3e88c.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/highlight-js-xml.9c3688c7.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/topic.8cd0c0c4.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/tutorials-overview.2a32cd6f.js"
< rel="prefetch">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/index.038e887c.css"
< rel="preload" as="style">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/chunk-vendors.ba2dd0cb.js"
< rel="preload" as="script">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/js/index.e8a5d294.js"
< rel="preload" as="script">
< <link href=
< "/swiftpackageindex/semanticversion/0.4.0/css/index.038e887c.css"
---
> "/swiftpackageindex/semanticversion/~release/css/index.ff036a9e.css"
137,139d30
< <style type="text/css">
< .noscript{font-family:"SF Pro Display","SF Pro Icons","Helvetica Neue",Helvetica,Arial,sans-serif;margin:92px auto 140px auto;text-align:center;width:980px}.noscript-title{color:#111;font-size:48px;font-weight:600;letter-spacing:-.003em;line-height:1.08365;margin:0 auto 54px auto;width:502px}@media only screen and (max-width:1068px){.noscript{margin:90px auto 120px auto;width:692px}.noscript-title{font-size:40px;letter-spacing:0;line-height:1.1;margin:0 auto 45px auto;width:420px}}@media only screen and (max-width:735px){.noscript{margin:45px auto 60px auto;width:87.5%}.noscript-title{font-size:32px;letter-spacing:.004em;line-height:1.125;margin:0 auto 35px auto;max-width:330px;width:auto}}#loading-placeholder{display:none}
< </style>
142,148c33
< <noscript>
< <div class="noscript">
< <h1 class="noscript-title">This page requires JavaScript.</h1>
< <p>Please turn on JavaScript in your browser and refresh the page
< to view its content.</p>
< </div>
< </noscript>
---
> <noscript>[object Module]</noscript>
150,156d34
< <script src=
< "/swiftpackageindex/semanticversion/0.4.0/js/chunk-vendors.ba2dd0cb.js"
< type="text/javascript">
< </script><script src=
< "/swiftpackageindex/semanticversion/0.4.0/js/index.e8a5d294.js"
< type="text/javascript">
< </script>
I've tested the overhead of generating a ~release doc set in addition to a normal set for a tag for our largest doc set, swift-syntax:
It adds around 2 minutes of additional time, including all doc generation and uploading the 128 MB of docs. Subsequent processing doesn't impact our time limit and it happens asynchronously.
The total time for swift-syntax is just under 6 minutes, so we're well clear of the 10min limit here. NB: swift-syntax is likely one of the most critical packages but there's a chance that a package with a slower build time might be more at risk of going over the limit.
I'll try to get typical build times for packages with docs.
FYI, I've chosen ~release as the name in case we want to at some point manage refs to the latest docs for any of the other significant versions as well: ~release, ~preRelease (~pre-release), ~defaultBranch (~default-branch).
The ~, however, poses a problem when pushing the files to S3. We'll either need to figure out how to properly encode it or choose some other way to avoid branch name collisions:
New run timed out, this is going to be a problem: https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6100021479
We could of course increase the timeout, but the problem with that is that it'll then cause more trouble when we hit a slow build and make the delays worse. However, it should be possible to set the timeout dynamically based on package details such that we could give only the packages that are generating docs more time.
That in combination with the new machines should prevent us from running into timeout problems here.
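A minimal sketch of what a per-package timeout could look like. The 10 minute base is the current budget mentioned above; the extra allowance, the BuildJob type and the function name are hypothetical placeholders, not values anyone has agreed on.

```python
from dataclasses import dataclass

BASE_TIMEOUT_MIN = 10   # current build time budget, from this thread
DOCS_EXTRA_MIN = 5      # hypothetical extra allowance for doc-generating builds


@dataclass
class BuildJob:
    package: str
    generates_docs: bool


def timeout_minutes(job: BuildJob) -> int:
    """Give only doc-generating builds more time; everything else keeps the base budget."""
    return BASE_TIMEOUT_MIN + (DOCS_EXTRA_MIN if job.generates_docs else 0)


print(timeout_minutes(BuildJob("swift-syntax", generates_docs=True)))
print(timeout_minutes(BuildJob("some-package", generates_docs=False)))
```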
Looking at the slowest doc builds, swift-syntax isn't actually in the top 10:
build_duration | platform | swift version | runner_id | builder_version | package_name | reference | latest | job_url |
---|---|---|---|---|---|---|---|---|
360.6515439748764 | macos-spm | 5.9 | J1XnyXFH | 4.28.8 | AppStoreConnect | 0.4.1 | release | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6036436810 |
282.830687046051 | macos-spm | 5.9 | TDmZkXJm | 4.28.8 | AppStoreConnect | main | default_branch | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6036436784 |
197.28208303451538 | macos-spm | 5.9 | J1XnyXFH | 4.28.9 | Vercel | main | default_branch | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6039196540 |
158.17895805835724 | macos-spm | 5.9 | J1XnyXFH | 4.28.8 | MetaCodable | main | default_branch | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6036285445 |
148.69907307624817 | macos-spm | 5.9 | TDmZkXJm | 4.28.7 | Verge | main | default_branch | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/5970370087 |
133.36819994449615 | ios | 5.9 | J1XnyXFH | 4.28.9 | Sublimation | 2.0.0-alpha.1 | pre_release | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6092051981 |
131.16940808296204 | macos-spm | 5.9 | J1XnyXFH | 4.28.9 | swift-composable-architecture | 1.7.3 | release | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6100964488 |
128.8557449579239 | macos-spm | 5.9 | J1XnyXFH | 4.28.9 | swift-openapi-request-dl | 1.0.0 | release | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6069665455 |
128.55133306980133 | macos-spm | 5.9 | TDmZkXJm | 4.28.9 | swift-composable-architecture | main | default_branch | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6090102780 |
120.5449548959732 | macos-spm | 5.9 | J1XnyXFH | 4.28.9 | swift-otel | main | default_branch | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/6112471909 |
The slowest at 6min is actually already sitting at 7.5min job duration (due to cloning, reporting etc overhead), so we're awfully close or over if we duplicate doc generation.
FYI, I've had to change the url fragment to _release in order to work around the S3 upload issues.
I've looked into the documentation routing issue we discussed on Monday and unless I'm mistaken (which I hope 😅), the suggested solution outlined in the Custom Routing docs and in David's WWDC video won't work for us.
The problem is that the example deals with routing to a single doc archive on a site. For example, translated to our site for the package SemanticVersion, we have the following incoming request:
[ INFO ] GET /SwiftPackageIndex/SemanticVersion/0.4.0/documentation/semanticversion [component: server, request-id: D637AAA3-284B-46D5-BDBF-AEC5C03F842D]
If I route this to a doc archive without a base path (i.e. generated simply via xcodebuild docbuild), the webapp tries to make subsequent requests from /js, /css etc:
[ INFO ] GET /js/chunk-vendors.bdb7cbba.js [component: server, request-id: 35F8B45D-EF47-45E8-B8CB-44F930DCB576]
Now in the case of a single doc site, the custom routing docs simply route all /js, etc requests to the doc archive:
# Route files within the documentation archive.
RewriteRule ^(css|js|data|images|downloads|favicon\.ico|favicon\.svg|img|theme-settings\.json|videos)\/.*$ SlothCreator.doccarchive/$0 [L]
However, we can't do that, because we're hosting hundreds of archives, and different versions, and so we need the base path to know which doc archive to route to.
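To illustrate why the base path matters, here is a small sketch of a request parser. With the base path in the URL we can recover owner, repo and reference; a bare /js/... request carries nothing we could route on. The DocRoute type and function name are hypothetical and this is not how our actual routes are written.

```python
from typing import NamedTuple, Optional


class DocRoute(NamedTuple):
    owner: str
    repo: str
    reference: str
    doc_path: str


def parse_doc_request(path: str) -> Optional[DocRoute]:
    """Split /{owner}/{repo}/{reference}/{rest...}; None if the path is too short
    to identify a doc archive."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 4:
        return None
    owner, repo, reference = parts[:3]
    return DocRoute(owner, repo, reference, "/".join(parts[3:]))


print(parse_doc_request(
    "/SwiftPackageIndex/SemanticVersion/0.4.0/documentation/semanticversion"))
print(parse_doc_request("/js/chunk-vendors.bdb7cbba.js"))
```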
I've cross-posted this to the DocWG's slack here: https://swift-open-source.slack.com/archives/C04PCMXMBD0/p1707998191445999
I've created a branch no-redirect based on the doc re-writing changes in rewrite-doc-index-html and stable-url-doc-hosting that eliminates the redirects off the "canonical url", i.e. http://localhost:8080/SwiftPackageIndex/SemanticVersion/documentation is not a redirect anymore.
The rewriting seems to kick in ok (looking at the source), however the Vue app ends up in an error state for some reason:
Not sure what's going on there. There are no errors in the console (unless I'm looking in the wrong place) and there are no 404s or anything in the server logs either. Needs more investigation.
What's interesting is that
curl -s http://localhost:8080/SwiftPackageIndex/SemanticVersion/0.4.0/documentation/semanticversion
and
curl -s http://localhost:8080/SwiftPackageIndex/SemanticVersion/documentation/semanticversion
return the same html except for
<link rel="canonical" href="/SwiftPackageIndex/SemanticVersion/0.4.0/documentation/semanticversion" />
and the former is displaying correctly.
I have a working doc hosting setup now from a stable URL via rewrites that doesn't require us to regenerate docs nor redirect. The one downside is that we need an additional "anchor" in the doc url in order to distinguish doc routes and make them routable in our DocC proxy.
Doc urls with references are unchanged:
✅ http://localhost:8080/SwiftPackageIndex/SemanticVersion/0.4.0/documentation/semanticversion
Doc index.html snippet:
var baseUrl = "/swiftpackageindex/semanticversion/0.4.0/"
</script>
<link href="/swiftpackageindex/semanticversion/0.4.0/css/chunk-c0335d80.10a2f091.css" rel="prefetch"/>
Default docs could be hosted as
✅ http://localhost:8080/SwiftPackageIndex/SemanticVersion/current/documentation/semanticversion
Doc index.html snippet:
var baseUrl = "/swiftpackageindex/semanticversion/current/"
</script>
<link href="/swiftpackageindex/semanticversion/current/css/chunk-c0335d80.10a2f091.css" rel="prefetch"/>
I was hoping to get
❌ http://localhost:8080/SwiftPackageIndex/SemanticVersion/documentation/semanticversion
to work, but it doesn't. The problem here is that documentation cannot be part of the base path as it's part of the docc url. For example, there's also tutorial/... . So if we wanted documentation to be the "anchor" we'd actually have urls like
✅ http://localhost:8080/SwiftPackageIndex/SemanticVersion/documentation/documentation/semanticversion
This does work but feels like an odd url.
In general, any path element of our choosing will do. For instance _ would also work
✅ http://localhost:8080/SwiftPackageIndex/SemanticVersion/_/documentation/semanticversion
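The rewrite itself boils down to swapping the reference in the base path for the chosen anchor. A minimal sketch, assuming a plain string replacement over the index.html content is enough (the function name is hypothetical and this is not the actual implementation in the branch):

```python
def rewrite_base_path(html: str, owner: str, repo: str, reference: str, anchor: str) -> str:
    """Replace /{owner}/{repo}/{reference}/ with /{owner}/{repo}/{anchor}/ in the page.
    Owner and repo are lowercased, matching the base paths in the snippets above."""
    old = f"/{owner.lower()}/{repo.lower()}/{reference}/"
    new = f"/{owner.lower()}/{repo.lower()}/{anchor}/"
    return html.replace(old, new)


snippet = (
    'var baseUrl = "/swiftpackageindex/semanticversion/0.4.0/"\n'
    '<link href="/swiftpackageindex/semanticversion/0.4.0/css/chunk-c0335d80.10a2f091.css" rel="prefetch"/>'
)

print(rewrite_base_path(snippet, "SwiftPackageIndex", "SemanticVersion", "0.4.0", "current"))
print(rewrite_base_path(snippet, "SwiftPackageIndex", "SemanticVersion", "0.4.0", "_"))
```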
The reason we can't really drop the "anchor" is that it would overlay resource paths with our existing resources. For example, the list of DocC resource paths is
/documentation/**
/tutorial/**
index.html
/css/**
/data/**
/images/**
/img/**
/js/**
Some of these collide with our static resources. I have not tried but I could imagine we might be able to make this work if we either moved our resources to another path or in our routes checked for resources among both docc and our static resources when for example serving css/**.
If we did the latter, we'd be mapping {owner}/{package}/index.html to be the doc page, which should work because we don't really reference package pages via index.html.
We'd also have to ensure that the actual resources have different file names (likely but hard to control since we don't control DocC resource file names).
However, we'd now be mixing the rather messy DocC proxy routes with our existing routes, creating a bigger mess. Figuring out if a js/** 404 is due to a missing static resource, a missing docc resource, or a messed up route suddenly becomes much more difficult, or rather impossible without debugging into it. It just doesn't feel like a good solution.
Unless I'm missing another option I think we'd have to move our static resources to another base path if we wanted to avoid an additional anchor in our doc urls. Given that, I think I'd opt for
✅ http://localhost:8080/SwiftPackageIndex/SemanticVersion/_/documentation/semanticversion
as the canonical doc url. _ is not too bad and is also even less likely to collide with a branch name than current, latest, release, or something similar.
Finally, there is some value in being able to tell from the url which part of the routing handles it. I.e. we'd know that any _ route comes from our DocC proxy. That's a separation we'd lose even if we avoided the messier resource overlay by moving our static resources out of the way.
PR #2961 is in preparation for this change. I've run the following additional manual tests to ensure all doc urls keep working:
- Run rester-sitemap.swift https://swiftpackageindex.com/{owner}/{repo}/sitemap.xml (i.e. from PROD) using Rester-sitemap - this generates a restfile for each url listed in the sitemap and should give us full coverage of all doc urls
- Add a ${base_url} parameter and update the tested return codes (200 instead of 301 where applicable etc)
These files (attached below) can then be run against DEV via
env base_url=https://staging.swiftpackageindex.com \
rester doctest-SemanticVersion-partial.restfile
env base_url=https://staging.swiftpackageindex.com/SwiftPackageIndex/SemanticVersion/0.4.0 \
rester doctest-SemanticVersion.restfile
env base_url=https://staging.swiftpackageindex.com/SwiftPackageIndex/SemanticVersion/~ \
rester doctest-SemanticVersion.restfile
env base_url=https://staging.swiftpackageindex.com \
rester doctest-HandySwift-partial.restfile
env base_url=https://staging.swiftpackageindex.com/FlineDev/HandySwift/4.0.1 \
rester doctest-HandySwift.restfile
env base_url=https://staging.swiftpackageindex.com/FlineDev/HandySwift/~ \
rester doctest-HandySwift.restfile
As a package maintainer, I just want to say I am very excited for this change.
Currently, if I want to link to SPI docs from GitHub, I would have to include the specific version in the URL, which means I would have to constantly keep the link up-to-date with every release.
Instead, I just link to the docs for main to avoid having to do that, but it's not ideal. So a single canonical URL that never changes and always points to the latest release would be awesome. 😊
It's very nearly there, which is great. It'll be a good step forward.
However even now, those links work. They are currently a redirect and they will become the canonical location, but you can update your links now. So for Foil, for example, you'd link to:
https://swiftpackageindex.com/jessesquires/Foil/documentation/foil
Currently, that's a redirect to the latest stable docs, with a fallback to the latest pre-release or the latest on the default branch if the others don't exist. Those links have existed since launch, but it's hard to let people know they exist 😬
(You can even drop the target name at the end: https://swiftpackageindex.com/jessesquires/Foil/documentation)
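The fallback order can be sketched as follows. The kind names match the values in the latest column of the build table earlier in the thread; the tuple layout and function name are just for illustration.

```python
def default_doc_reference(versions):
    """versions: list of (reference, kind, has_docs) tuples.
    Prefer the latest stable release, then the pre-release, then the default branch."""
    for kind in ("release", "pre_release", "default_branch"):
        for reference, version_kind, has_docs in versions:
            if version_kind == kind and has_docs:
                return reference
    return None


versions = [
    ("main", "default_branch", True),
    ("2.0.0-alpha.1", "pre_release", True),
    ("1.7.3", "release", True),
]
print(default_doc_reference(versions))      # stable release wins
print(default_doc_reference(versions[:2]))  # no release docs: pre-release
print(default_doc_reference(versions[:1]))  # only default branch docs
```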
@daveverwer @finestructure ohhhh I see. (I misread the initial issue description.)
I had no idea these URLs were available. This is great! 😄
We currently host documentation on a variety of URLs:
[owner]/[repo]/main/documentation/package
[owner]/[repo]/0.1.0/documentation/package
[owner]/[repo]/0.1.0-pre1/documentation/package
and every time there is a new release, instead of redirecting to /0.1.0/documentation/package we now start redirecting to /0.2.0/documentation/package. We always set the canonical URLs to the new version and update sitemaps to point at the new version, but this is causing a huge amount of churn in what we are asking Google to index and is contributing to our ongoing search index issues.
We need to host documentation like this:
[owner]/[repo]/documentation/package
This is the canonical URL for a package's documentation, and is not a redirect. This should have a canonical URL meta tag and header pointing to itself and be marked for indexing.
[owner]/[repo]/[reference]/documentation/package
This is a canonical path to a reference specific documentation set, but should not be marked as canonical and should be excluded from Google indexing with a noindex tag and header. These pages should all point their canonical URL to the page above.
Notes:
Steps
- [owner]/[repo]/documentation/package URL
- [owner]/[repo]/documentation/package URL
- noindex via a meta tag and HTTP header on every documentation page apart from the canonical page
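A sketch of the indexing rules described above, assuming the standard X-Robots-Tag and Link rel="canonical" response headers plus their meta tag equivalents. The function name is hypothetical and the canonical href is kept relative as in the curl output earlier in the thread; a real implementation may well want an absolute URL.

```python
def indexing_directives(owner, repo, reference, doc_path):
    """Return (headers, meta_tags) for a documentation page.
    reference is None for the canonical, reference-less URL."""
    canonical = f"/{owner}/{repo}/{doc_path}"
    headers = {"Link": f'<{canonical}>; rel="canonical"'}
    meta_tags = [f'<link rel="canonical" href="{canonical}" />']
    if reference is not None:
        # Reference-specific pages are excluded from indexing and point
        # their canonical URL at the reference-less page.
        headers["X-Robots-Tag"] = "noindex"
        meta_tags.append('<meta name="robots" content="noindex" />')
    return headers, meta_tags


print(indexing_directives("SwiftPackageIndex", "SemanticVersion", None,
                          "documentation/semanticversion"))
print(indexing_directives("SwiftPackageIndex", "SemanticVersion", "0.4.0",
                          "documentation/semanticversion"))
```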