Open mgeier opened 3 months ago
@chrisjsewell You introduced the typed interface for that one so perhaps you could take that one?
you introduced the typed interface for that one so perhaps you could take that one?
Well, that's one of the reasons I introduced it; because for a long time I had no idea env_version
was even an option 😅
But indeed thank you for detailing this @mgeier, it would be good to make the documentation around it clearer
alternatively, a Domain could be used to store data in the environment.
I would note here, that I feel the purpose of a domain is more around handling referencing, and then the storage of data here is more of a "side-effect" of that; to store pointers to targets, etc I don't believe you should use it to store arbitrary data.
I don't believe you should use it to store arbitrary data.
This kinda says the opposite: https://github.com/sphinx-doc/sphinx/issues/9003#issuecomment-799084236
Would you suggest adding arbitrary attributes to app.env
instead?
Or do you have something even better?
And it's not only about storing data, it's also about making it work with parallel builds, which is actually hard, see e.g. https://github.com/sphinx-doc/sphinx/pull/12251.
And to be fancy, it's about removing the appropriate data if a source file is removed (which the "todo" extension does but the "recipe" extension doesn't, AFAICT).
For a concrete example, see https://github.com/mgeier/sphinx-last-updated-by-git/pull/66. I'm not sure if that's better or worse, BTW!
This kinda says the opposite
well that is not consistent with what the actual source says: https://github.com/sphinx-doc/sphinx/blob/6fd8b3004315f56b4464d9426829f7643e7b7bc1/sphinx/domains/__init__.py#L159-L180
And it's not only about storing data, it's also about making it work with parallel builds, which is actually hard
We deal with this in sphinx-needs with parallel builds for large documentation sets, and it works quite well: https://github.com/useblocks/sphinx-needs/blob/0f71ecde47a3f7c3245c5bcf545ef5281376768c/sphinx_needs/needs.py#L266-L270
But I agree that it would be nice to have a "consistent" workflow for storing data (outside of references)
that is not consistent with what the actual source says
Well, the docstring you are quoting is nearly 15 years old, so maybe the usage recommendations of this have changed slightly in the meantime?
Maybe the docstring should be updated? I have added this to the task list above.
Is there anything except that docstring that makes you think that it is not appropriate to store arbitrary data in a Domain
(as compared to adding arbitrary attributes to app.env
)?
We deal with this in sphinx-needs with parallel builds for large documentation sets, and it works quite well: https://github.com/useblocks/sphinx-needs/blob/0f71ecde47a3f7c3245c5bcf545ef5281376768c/sphinx_needs/needs.py#L266-L270
Thanks for this example, this is really interesting, since it seems to be a perfect fit for a Domain
!
The SphinxNeedsData
class looks like it could be perfectly derived from Domain
:
https://github.com/useblocks/sphinx-needs/blob/0f71ecde47a3f7c3245c5bcf545ef5281376768c/sphinx_needs/data.py#L399-L597
Now the merging and invalidation of data is scattered over different places, a Domain
would allow everything to be implemented in one place.
As an added benefit, this domain could then be used to namespace the (many) roles/directives to get something like needs:table
instead of needtable
, which would arguably a bit cleaner.
I'm of course not suggesting to do that now, because it would be too disruptive, but for a new project I would definitely consider it.
Ironically, the extension is not returning env_version
, which makes Sphinx believe that nothing is stored in app.env
, AFAIU: https://github.com/useblocks/sphinx-needs/blob/0f71ecde47a3f7c3245c5bcf545ef5281376768c/sphinx_needs/needs.py#L302-L306
Adding env_version
has already been suggested in https://github.com/useblocks/sphinx-needs/issues/852#issuecomment-1934707627
Oh I forgot there is something more in-line with what you want: https://github.com/sphinx-doc/sphinx/blob/5562e2b032820043ce75ee8645fe0d8e2bbf4d11/sphinx/environment/collectors/__init__.py#L14
and can be added with https://github.com/sphinx-doc/sphinx/blob/5562e2b032820043ce75ee8645fe0d8e2bbf4d11/sphinx/application.py#L1221
you see it just links methods to event hooks though, ao nothing you couldn't implement on your own
so maybe the usage recommendations of this have changed slightly in the meantime? Is there anything except that docstring that makes you think that it is not appropriate to store arbitrary data in a
Domain
The whole of the Domain
API is designed around storing and resolving references, this is the only thing it is used for in core sphinx, and there will likely be even more logic being added in this direction in the future: #8929
In addition, all roles are expected to be references to domain objects, and all directives are expected to create targets.
As I mentioned already, storing data is just a "side-effect" of reference resolution, not a primary goal of the Domain
.
If all you want to do is store data, then you are bringing in a whole lot of things you probably don't need;
for example you literally have to implement a concrete resolve_any_xref
method just to get it to work.
Adding env_version has already been suggested in https://github.com/useblocks/sphinx-needs/issues/852#issuecomment-1934707627
yep by me 😄
Just my 2 cents:
Would you suggest adding arbitrary attributes to app.env instead?
In my own extension, I have a single dictionary which is storing every information I need. It works a bit like the registry on MS-DOS systems where you have namespaces + key/values and any other "component" can put whatever they want in that registry using their own registry key (a bit like the Stash in pytest actually).
I create that dictionary as a single attribute in app.env
upon 'builder-inited'
and merge information in 'env-merge-info'
. I can easily access whatever information is being stored using something like app.env.registry[namespace][key]
.
does not set parallel_read_safe but I guess there is no reason for this extension to disable parallelization, right
Probably because we didn't want the tutorial to add some unknown logic.
... but it gives no indication how/why to do that.
There is actually a docstring about what the environment version is meant for:
# This is increased every time an environment attribute is added
# or changed to properly invalidate pickle files.
ENV_VERSION = 61
Describe the bug
https://www.sphinx-doc.org/en/master/extdev/index.html#extension-metadata documents
env_version
.[ ] It says:
... but it gives no indication how/why to do that. It could e.g. link to https://www.sphinx-doc.org/en/master/development/tutorials/todo.html
[ ] It says:
... but it doesn't say if that includes a
Domain
(https://www.sphinx-doc.org/en/master/extdev/domainapi.html) defined by the extension. I assume that the domain is somehow stored in the environment, but I guess it doesn't need to be reflected inenv_version
because it tracks its own structure with thedata_version
attribute?[x] https://www.sphinx-doc.org/en/master/development/tutorials/todo.html shows an extension that stores data in the environment, but it does not return
env_version
in itssetup()
function![ ] https://www.sphinx-doc.org/en/master/development/tutorials/todo.html could mention that alternatively, a
Domain
could be used to store data in the environment. It could link to https://www.sphinx-doc.org/en/master/development/tutorials/recipe.html, which provides an example for that.[x] https://www.sphinx-doc.org/en/master/development/tutorials/recipe.html does not mention
data_version
, which is of course not strictly necessary, but I think it might be a good idea to recommend specifying it, to make it easier to not forget bumping it once there are changes to the data at a later time.[x] This has nothing to do with
env_version
, but looking at the tutorials, I have seen that https://www.sphinx-doc.org/en/master/development/tutorials/autodoc_ext.html does not setparallel_read_safe
but I guess there is no reason for this extension to disable parallelization, right?[ ] The docstring of
Domain
doesn't give any information whether it is recommended to use it to store arbitrary data (see https://github.com/sphinx-doc/sphinx/issues/12237#issuecomment-2047118111)How to Reproduce
Look at https://www.sphinx-doc.org/en/master/extdev/index.html#extension-metadata