Open JCGoran opened 1 year ago
I absolutely agree that this is a useful feature, but I don't necessarily think it's something that should be natively supported by pdoc. Here's what I would do instead:
1. Upload your docs to S3-compatible object storage with a directory structure like this:
   1.0.0/...
   2.0.0/...
   latest/...
   Future uploads can be done as part of CI.
2. Write a small JavaScript snippet that enumerates the bucket for all versions (see here for some similar code).
3. Add some HTML/JS to pdoc's template which adds a version selector based on the JavaScript snippet.
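As a rough illustration of the version-enumeration step, here is a minimal Python sketch. Note that it swaps in a different technique from the one suggested above: instead of enumerating the bucket from JavaScript, it writes a manifest at upload time that a selector could fetch. The file name `versions.json` and the helpers `list_versions` and `write_manifest` are hypothetical, and it assumes the rendered docs sit in a local directory with one sub-directory per version before being uploaded.

```python
import json
import re
from pathlib import Path


def list_versions(docs_root: Path) -> list[str]:
    """Return the version directory names: named ones first, then newest number first."""
    names = [p.name for p in docs_root.iterdir() if p.is_dir()]
    # Purely numeric names such as "1.0.0" are sorted by their numeric parts.
    numeric = [n for n in names if re.fullmatch(r"\d+(\.\d+)*", n)]
    # Everything else (for example "latest") is sorted alphabetically.
    other = sorted(n for n in names if n not in numeric)
    numeric.sort(key=lambda n: tuple(int(x) for x in n.split(".")), reverse=True)
    return other + numeric


def write_manifest(docs_root: Path) -> Path:
    """Write the version list next to the version directories (hypothetical file name)."""
    manifest = docs_root / "versions.json"
    manifest.write_text(json.dumps(list_versions(docs_root)))
    return manifest
```

Calling `write_manifest(Path("docs"))` in CI before the upload would keep the list in sync with whatever directories exist at that point.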
It'd be awesome to have a custom template in examples/ with a snippet that does 2)! 😃
Hello 👋 (TL;DR below)
While I agree that this functionality is not necessarily in the scope of pdoc, I'm afraid there's a bit more to it than meets the eye. The problem with your suggestion is that I'd never be able to change the design and layout of the docs.
I am actually already publishing docs for multiple versions of https://gitlab.com/dAnjou/fs-code here: https://danjou.gitlab.io/fs-code (well, just two versions for now). Much like you're describing, I have a CI job triggering the whole thing via a Python script and I have a bit of JavaScript in a customized pdoc template, all the code is here: https://gitlab.com/dAnjou/fs-code/-/blob/main/docs/.
What I do is iterate over the version tags, check them out, and regenerate the docs each time, so that the pages keep looking the same. It works okay so far, but there's a rather unpleasant hack in there 😞
In the Python script that's generating the docs, I happen to use my very same library for which I'm generating the docs, which means it needs to be installed.
Now it seems that the installed module always takes precedence over a local package directory.
When I say codefs, I get this:
/Users/dAnjou/Projects/fs-code/.venv/lib/python3.10/site-packages/pdoc/extract.py:123: RuntimeWarning: 'codefs' may refer to either the installed Python module or the local file/directory with the same name. pdoc will document the installed module, prepend './' to force documentation of the local file/directory.
- Module location: /Users/dAnjou/Projects/fs-code/src/codefs/__init__.py
- Local file/directory: /private/var/folders/68/qf6k_9s17vd3yyh97mmy_mz40000gr/T/tmporitzu14HEAD/codefs
And when I then say ./codefs, I get this:
/Users/dAnjou/Projects/fs-code/.venv/lib/python3.10/site-packages/pdoc/extract.py:145: RuntimeWarning: pdoc cannot load 'codefs' because a module with the same name is already imported in pdoc's Python process. pdoc will document the loaded module from /Users/dAnjou/Projects/fs-code/src/codefs/__init__.py instead.
So, to hack my way around this, I'm currently doing a poetry install in my CI job, which includes an editable installation of my library. In the Python script, for each checked-out version I'm replacing the current src directory. Only then do I run pdoc, because then it can use the installed module.
(Don't mind the subprocess stuff. That's still from when I was using pdoc 9 and using it as a library had some rough edges, which are all fixed now, so I'm in the process of migrating - also why I'm writing this comment here now.)
So, from what I can see, the core issue is that you basically cannot use pdoc as a library in a Python script that's supposed to generate docs for multiple versions of your code, if that code is also "imported in pdoc's Python process".
Imagine someone wanted to write a Python library that does exactly what OP is asking, as a pdoc add-on, for example. Ironically, you could not use pdoc to document multiple versions for this library, because it would always prefer the installed module.
@dAnjou: For your specific use case, it's probably easier to do the "check out another version" part in a bash script and then invoke your make-docs.py script from there. Having multiple versions of the same module in a Python process is usually a recipe for disaster. :)
> For your specific use case, it's probably easier to do the "check out another version" part in a bash script and then invoke your make-docs.py script from there.
I don't think it matters whether I check out the version in a bash script or in the Python script.
> Having multiple versions of the same module in a Python process is usually a recipe for disaster.
Yes, that's the core problem, and I acknowledge that my situation might be an edge case, because I happen to use my library for checking out the version. But any situation where you want to use the latest version of your own code in the docs script would fail.
And it still makes me wonder whether there's a way for pdoc to extract code structure and doc strings without importing the module 🤔
> I don't think it matters whether I check out the version in a bash script or in the Python script.
Well the idea would be that you have a fresh Python interpreter for each version.
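To make the "fresh interpreter per version" idea concrete, here is a minimal Python sketch that launches pdoc as a separate process for each version, so no two versions of the module ever share one interpreter. The helper names `pdoc_command` and `build_all`, the `src` directory, and the `./src/codefs` module path are assumptions for illustration; only the `pdoc <module> -o <directory>` command line is taken from pdoc itself.

```python
import subprocess
import sys
from pathlib import Path


def pdoc_command(module_path: str, version: str, out_root: Path) -> list[str]:
    """Build the pdoc CLI call that renders one version into out_root/<version>."""
    return [sys.executable, "-m", "pdoc", module_path, "-o", str(out_root / version)]


def build_all(versions: list[str], module_path: str, out_root: Path) -> None:
    """Render every version in its own Python process."""
    for version in versions:
        # Assumption: the package lives under src/. Only that directory is
        # switched to the given tag, so this build script stays untouched.
        # Files that exist only in other versions are not removed by this call.
        subprocess.run(["git", "checkout", version, "--", "src"], check=True)
        # Each run starts a new interpreter, so nothing is left imported
        # from the previously rendered version.
        subprocess.run(pdoc_command(module_path, version, out_root), check=True)


if __name__ == "__main__":
    build_all(["1.0.0", "2.0.0"], "./src/codefs", Path("docs"))
```

The leading `./` in the module path follows the warning quoted earlier in the thread, which says it forces pdoc to document the local directory rather than an installed module of the same name.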
> But it still makes me wonder whether there's a way for pdoc to extract code structure and doc strings without importing the module 🤔
pdoc heavily relies on dynamic analysis (as opposed to static analysis), so the answer is a resounding no unfortunately.
FWIW pdoc.extract.invalidate_caches may be an alternative here. In either case, my recommendation would be not to overcomplicate things. Render once into an S3 bucket and then be done with it. :)
> Well the idea would be that you have a fresh Python interpreter for each version.
I can still do it in a Python script if I don't run it in an env that has my library installed, which I also cannot use then of course, but I can use Dulwich directly, for example.
> pdoc heavily relies on dynamic analysis (as opposed to static analysis), so the answer is a resounding no unfortunately.
Totally understand that 👍
> In either case, my recommendation would be not to overcomplicate things. Render once into an S3 bucket and then be done with it. :)
That's not an option for me. I want to be able to change the design and layout, and it should be applied to all versions already published.
After some messing around with Bash scripting, I managed to get multiple versions to work by using the following steps, which I'm writing down in case it helps anyone:
[...] master branch, and stored this in an env variable VERSIONS (space separated)
[...] redirectURL and match have a package variable (also set from the env variable PACKAGE) since the docs will always be generated for [USERNAME].github.io/[PACKAGE]/, so those parts are GH-specific, and should be modified appropriately for other platforms (I am also assuming that the full path starts with /[PACKAGE]/[VERSION]/, so that the drop-down reflects the currently selected version regardless of which sub-page the user is at):
{% set versions = env.get("VERSIONS", "").split() %}
{% set package = env.get("PACKAGE", "test") %}
{% block nav_footer %}
<footer>
    <label for="page-select">Version:</label>
    <select id="page-select" onchange="redirectToPage()">
        <option value="">Select an option</option>
        {% for item in versions %}
        <option value="{{ item }}">{{ item }}</option>
        {% endfor %}
    </select>
</footer>
<script>
    // Jump to the index page of the version picked in the drop-down
    function redirectToPage() {
        var select = document.getElementById("page-select");
        var selectedOption = select.options[select.selectedIndex].value;
        if (selectedOption !== "") {
            var redirectURL = "/{{ package }}/" + selectedOption + "/index.html";
            window.location.href = redirectURL;
        }
    }
    // Set the default value of the dropdown to the selected option
    window.onload = function() {
        var select = document.getElementById("page-select");
        var currentURL = window.location.pathname;
        var match = currentURL.match(/^\/{{ package }}\/([^/]+)\/.+$/);
        if (match && match[1]) {
            select.value = match[1];
        }
    };
</script>
{% endblock %}
[...] git reset && git checkout only the source files, since this way, the script for building the docs is not affected, while it can still remain part of the repo
[...]
- uses: actions/checkout@v2
  with:
    fetch-depth: 0
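The template above reads two environment variables, VERSIONS and PACKAGE. As a minimal sketch of how they might be prepared, the Python below collects the git tags, appends the development branch, and builds an environment to pass to the pdoc process. The helper names `git_tags`, `versions_value`, and `pdoc_env` are hypothetical, and putting the tags before `master` is an assumption about the order.

```python
import os
import subprocess


def git_tags() -> list[str]:
    """Return all tag names of the repository in the current directory."""
    result = subprocess.run(
        ["git", "tag", "--list"], check=True, capture_output=True, text=True
    )
    return result.stdout.split()


def versions_value(tags: list[str], branch: str = "master") -> str:
    """Join the tags and the development branch into one space-separated string."""
    return " ".join([*tags, branch])


def pdoc_env(tags: list[str], package: str) -> dict[str, str]:
    """Copy the current environment and add the two variables the template reads."""
    return {**os.environ, "VERSIONS": versions_value(tags), "PACKAGE": package}
```

The returned dictionary can be handed to `subprocess.run(..., env=pdoc_env(git_tags(), "mypackage"))` when invoking pdoc, where `mypackage` stands in for the real package name.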
@JCGoran Thanks for the informative write-up on how to achieve this. Can you clarify what the directory structure needs to look like when building for multiple versions? I'm a bit unclear about what the output directory should be for pdoc after I check out a particular tag in the CI script.
> Can you clarify what the directory structure needs to look like when building for multiple versions?
Unfortunately I haven't touched the project using this particular setup in a while, but the general setup is described in this file, which I just run as bash generate_docs.sh -t, which generates the docs for all of the git tags.
The rough idea is as follows:
[...] master) to the final list
[...] git checkout it, and build the docs under something like docs/[VERSION] (note that this part is very destructive so I only run it fully in the CI where I can't nuke anything important)
[...] docs/index.html which redirects to the latest stable version (so usually not master) when loading the docs
[...] -t [DIRECTORY] flag for this, where [DIRECTORY] is the dir where the templates are located (don't remember if the name of the template file has to be exactly module.html.jinja2)
[...] docs dir (since it has all of the versions)
You can see the final result here (the most notable difference w.r.t. the default pdoc template is the "Version" drop-down on the bottom of the left sidebar, which is clickable and redirects to the right version properly).
pdoc's handling of images was (is?) a bit cumbersome, and as a result the generate_docs.sh script is a bit convoluted, but I am content with the results. Note that I haven't implemented caching in the CI, which means if you have many versions, it may be a bit slow to build.
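The docs/index.html redirect step mentioned above could look roughly like the following Python sketch. It is not taken from generate_docs.sh; the helpers `latest_stable`, `redirect_page`, and `write_redirect` are hypothetical, and it assumes that "stable" means a purely numeric tag such as 1.2.0 or v1.2.0, so branches like master and pre-release tags are ignored.

```python
import re
from pathlib import Path


def latest_stable(versions: list[str]) -> str:
    """Pick the highest purely numeric version, ignoring branches such as master."""
    numeric = [v for v in versions if re.fullmatch(r"v?\d+(\.\d+)*", v)]
    if not numeric:
        raise ValueError("no numeric version tags found")
    # Compare numerically so that 1.10.0 ranks above 1.2.0.
    return max(numeric, key=lambda v: tuple(int(p) for p in v.lstrip("v").split(".")))


def redirect_page(target: str) -> str:
    """Return a minimal HTML page that immediately redirects to target/index.html."""
    return (
        "<!doctype html>\n"
        f'<meta http-equiv="refresh" content="0; url=./{target}/index.html">\n'
        f'<a href="./{target}/index.html">Redirecting to {target}</a>\n'
    )


def write_redirect(docs_root: Path, versions: list[str]) -> Path:
    """Write docs_root/index.html pointing at the latest stable version."""
    index = docs_root / "index.html"
    index.write_text(redirect_page(latest_stable(versions)))
    return index
```

A relative URL is used so the page works regardless of the /[PACKAGE]/ prefix the hosting platform adds.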
Problem Description
Currently, pdoc only supports one version of a given Python package.
Proposal
I would like to see (optional) support for multiple versions of a package in pdoc. It's possible that some users of a package do not use the version whose current documentation is available, and having the option to switch versions would greatly aid in usability, especially if some objects were removed/renamed between different package versions.
Alternatives
I think Sphinx has support for this feature, but I have not tested it.
Additional context
None.