sphinx-doc / sphinx

The Sphinx documentation generator
https://www.sphinx-doc.org/

Add static configuration (``Sphinx.toml``) #9040

Open choldgraf opened 3 years ago

choldgraf commented 3 years ago

Background

One of the challenges in getting started with Sphinx is the conf.py file, for a few reasons:

  1. It is written in Python, and so it is Python-specific, even if the person writing the documentation is using a different language.
  2. It is a fully-flexible Python script, which can be overwhelming for users not accustomed to it.

Over the years, many other configuration formats have arisen; the two best known are probably YAML and TOML. For example, Jupyter Book provides a layer of YAML configuration on top of Sphinx. Users have responded that this is a really friendly pattern for beginners and experts alike. I wonder if Sphinx would be interested in allowing for YAML or TOML configuration as well.

Describe the solution you'd like

In addition to the current config option of conf.py, add another option:

Allow config with YAML. I think it would be useful if Sphinx allowed for:

This file would be read in and converted to Python variables directly, as if it was written in Python (conf.py). So for example:

# In conf.yml
key: value
mylist:
  - item1
  - item2
mydict:
  dk1: one
  dk2: two

would map onto

# In conf.py
key = "value"
mylist = ["item1", "item2"]
mydict = {"dk1": "one", "dk2": "two"}

Allow conf.py to be provided simultaneously. Some Sphinx builds will still need to run custom Python code (e.g., to set up some extensions etc). In this case, authors may wish to keep their "simple config" in the YAML file, and the complex config in pure Python.

If conf.py is supplied as well as conf.yml, then the environment defined in conf.py will over-rule anything in conf.yml.

So the order of operations would be:

  1. (if it exists) Read in variables from conf.yml
  2. Update with variables from conf.py if it exists, overwriting variables created in 1
  3. Everything else is the same...
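The order of operations above can be sketched in Python. This is only an illustration of the proposed precedence, not how Sphinx loads its configuration: `merge_config` is a hypothetical helper, and the static values are passed in as an already-parsed dict so that the sketch does not depend on a YAML library.

```python
from pathlib import Path


def merge_config(static_values, conf_py_path=None):
    """Merge static config with conf.py, letting conf.py win (hypothetical helper)."""
    # Step 1: start from the values read from the static file (e.g. conf.yml).
    config = dict(static_values)

    # Step 2: if conf.py exists, execute it and overlay its public names.
    if conf_py_path is not None and Path(conf_py_path).exists():
        source = Path(conf_py_path).read_text()
        namespace = {}
        exec(compile(source, str(conf_py_path), "exec"), namespace)
        # Skip names such as __builtins__ that exec() adds to the namespace.
        overrides = {k: v for k, v in namespace.items() if not k.startswith("_")}
        config.update(overrides)

    # Step 3: the merged dict is what the rest of the build would consume.
    return config
```

With `{"key": "value"}` from the static file and `key = "from_py"` in conf.py, the merged result holds `"from_py"`, which matches the rule that conf.py over-rules conf.yml.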

Describe alternatives you've considered

I've tried creating a lightweight extension that allows this but didn't have success because of the way that extensions are activated.

I have also considered other documentation engines like mkdocs, which use YAML, but I'd love for this to be in the Sphinx ecosystem!

cc some others who have discussed this in the executablebooks/ repo: @pradyunsg @ericholscher @chrisjsewell

EDIT: I've updated the above description to remove mention of TOML, as I don't want that to derail conversation here!

choldgraf commented 3 years ago

Towards @jpmckinney's point about needing evidence of demand, I think there are several examples of this already:

  1. this issue is already the 4th highest 👍 issue in Sphinx
  2. The perspective we are bringing from Jupyter Book is that users have found config.yml particularly useful and more accessible. We don't have quantitative survey data but this is certainly a signal we've heard. This is why I opened up this issue about upstreaming the feature in the first place.
  3. The readthedocs team is probably the single organization with the most exposure to "users of Sphinx" that I know of, so I tend to trust their instincts when it comes to what would provide good UX for Sphinx users.

More generally, I want to echo another point that @chrisjsewell has made - if you simply survey pre-existing Sphinx repositories, you are going to get a very biased sample. For example, it will not include the people who have chosen not to use Sphinx because it felt too Python-specific, or complex, or developer-focused. This issue is about reaching those people and I think it's worth disentangling this from "pre-existing Sphinx repositories". For those repositories, conf.py works fine I assume (they are using it after all). I want to know what would be useful for people that find Sphinx inaccessible (for example, why would someone choose MkDocs over Sphinx?), and my intuition is that the lack of YAML-based configuration is a big blocker for folks.

jpmckinney commented 3 years ago

To clarify, I'm suggesting that we find evidence (in two steps above) that we will not create pain for users - specifically, the pain of starting with a static file and then realizing that they need to use a Python file. This is a feasible research task.

We already know there is evidence of demand. But demand is not the single deciding factor. I do not know a popular open source project that operates in that way. That is why we are also discussing these other issues.

I'd be interested to hear more from the RTD team in terms of what they hear from users. So far, their comments have been mainly that a static file will make some of their backend wizardry easier (e.g. not having to inject code into conf.py files) https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-820569013. This is a developer need, not a user need.

I think we are all on the same page about reaching new users, but let's not forget (as @astrojuanlu put it) that "we need to do this without alienating or confusing existing and new users".

choldgraf commented 3 years ago

specifically, the pain of starting with a static file and then realizing that they need to use a Python file

Could I ask for clarification on what you think the choices are here?

Say that we find that most repositories that use Sphinx involve some Python-specific code in conf.py. Is this a sign that sphinx.yaml shouldn't exist at all? Or a sign that sphinx.yaml should co-exist with conf.py?

I am trying to understand if we are still debating whether YAML configuration should be allowed at all, or whether we are debating if it should co-exist with conf.py or not.

jpmckinney commented 3 years ago

Could I ask for clarification on what you think the choices are here?

I think coexistence has a lot of problems, as outlined earlier. I don't think any current participants in this issue are deeply invested in coexistence per https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-835856482.

I think non-coexistence could be an option, depending on what we discover. From my perspective, there is no value in pure "debate" without more facts. I suggested some simple fact-finding activities.

I personally think most Python-specific code could be made into configuration variables, thus allowing YAML, but I also personally don't feel comfortable advocating a change to Sphinx without more facts.

If, counter expectations, we find that there is a great need for Python-specific code, then we do have grounds for debate, since neither option (YAML and no-YAML) will be without issues. But we are not there yet, and it is fruitless to start there, as you will just create two sides whose foundations aren't evidence but conviction.

choldgraf commented 3 years ago

So is this a correct summary of your thinking?

jpmckinney commented 3 years ago

So is this a correct summary of your thinking?

Yes. With two clarifications:

  1. If we find that Python-specific config is necessary, we can hopefully add ways to perform that config statically (for example, path_additions as above), in which case the YAML option is still open.
  2. The thinking does not need to be boolean. For example, the magnitude of the demand and the magnitude of the Python-specific configurations can be relevant. We have at least a rough sense of the former, so let's work on the latter.

astrojuanlu commented 3 years ago

Measuring "how much dynamic Python config is used" is trivial:

In [1]: import ast

In [2]: from collections import Counter

In [3]: with open("examples/poliastro/conf.py") as fp:
   ...:     contents_poliastro = fp.read()
   ...: 

In [4]: tree1 = ast.parse(contents_poliastro)

In [5]: Counter([node.__class__.__name__ for node in tree1.body])
Out[5]: Counter({'Import': 2, 'Try': 1, 'Assign': 38, 'If': 1, 'Expr': 3})

If we are going this route though, I very much think we should define the goals and thresholds upfront. Otherwise, how much is "too much" or "too little"? 80 %? 90 %? Will we weigh the projects according to popularity? Is a totally random sample good enough? What kind of biases would invalidate the analysis?

I'm not asking these questions to derail the conversation, quite the opposite: I don't want to spend ~1 week analyzing a random sample of Sphinx projects, and then have people saying "but they are mainly scientific!", "but we don't know how large or small or new or old they are!", "but RTD projects are biased!", etc.

And finally, while I stand 100 % by this (that, if we want to seek facts, we need to set expectations upfront), I think it's way more work to agree on a way to do this analysis in a scientifically correct way, than to agree on whether we want coexistence or not, implement the feature accordingly, and pay attention to the bug reports in this issue tracker.

I'd be interested to hear more from the RTD team in terms of what they hear from users.

The users we have talked to so far mainly complain that they prefer Markdown over reStructuredText. If we are sincere about listening to users, we might achieve "uncomfortable conclusions" 😄 (this is a light-hearted comment, I'm well aware of the tradeoffs of each tool and I am not interested in flame wars)

astrojuanlu commented 3 years ago

(Sorry if my last comment reads overly pedantic or frustrated - I should have re-read it before posting it. There are many options and possible paths open and, while I wanted to be constructive, I added lots of unhelpful context. In summary: +1 to do user research in a proper way, +2 on being pragmatic and "just implement the feature" in a way that it helps non-Python users be more confident with Sphinx, -1 on letting this conversation stagnate as a result of raising the bar too much for such an understaffed project, because I think this feature would be valuable for lots of people)

jpmckinney commented 3 years ago

I understand the concern that some people might start arguing in all the ways you describe. For me, even just a quick analysis in which the limits are described would advance the discussion. Like:

I took a sample of ### projects that:

  • do not use recommonmark (since that often requires Python to call add_config_value and add_transform(AutoStructify), and recommonmark is now deprecated)
  • use a recent version of Sphinx / sphinx-rtd-theme (since otherwise Python is required to call sphinx_rtd_theme.get_html_theme_path - add_html_theme was added in Sphinx 1.6, and it took time for themes and users to adopt it)

In that sample, x% were plain assignments (i.e. no Python-specific features). A quick look at the others suggests that we should add a way to configure XYZ (e.g. the Python path).

That's just an example. The limitations and depth of analysis are up to whoever does the work. Earlier I suggested limiting to projects classified as "Programming Language: Only Words", for instance.

jpmckinney commented 3 years ago

Anyhow, if we skip ahead and just implement a static file, then we should:

Between TOML and YAML, TOML is a little nicer for the ecosystem, because pretty much ALL themes and extensions say something like "Add html_theme = 'mytheme' to conf.py", and a simple copy-paste will still work in TOML, but it will not work in YAML, because the user has to know to change = to :.

Update: To clarify, I suspect that these changes would be made over multiple versions, e.g. introduce sphinx.yaml in one version, and then make it the default in a later version (updating the corresponding documentation, etc.).

chrisjsewell commented 3 years ago

Anyhow, if we skip ahead and just implement a static file, then we should

IMO none of this will ever happen, because it breaks with the status quo too much, and there will always be someone to object to one aspect or another, especially since there is no strong advocate from the maintainers (which is absolutely their prerogative). Happy to be proved wrong, but I'm deeming this issue DOA.

ericholscher commented 3 years ago

I'd like to argue that we should merge the existing PR and get this feature into sphinx in the next release.

I appreciate your feedback @jpmckinney, but it feels like you're mostly just blocking progress here without suggesting realistic goals. You're asking folks to do huge amounts of research, along with changing fundamental parts of Sphinx with this change. This isn't a realistic view of the time available to do development on this project -- we should remove the inability to use a YAML file for config, and we can discuss making it a default after we have more data.

Similarly, doing all this research is mostly noise. The core reason to add a static config is to expand the userbase of Sphinx. Doing research on existing users is not meaningful. The users who are primarily addressed with this change by definition didn't use Sphinx.

I'd like to appeal to @tk0miya for a 👍 or 👎 on this issue, by either closing it or merging. We have discussed it at length, and major contributors in the Sphinx ecosystem are for it (The RTD team, the MyST & Jupyterbook team, and other contributors). I feel that by extending this discussion, we are effectively blocking with extended discussion instead of actually providing a better feature for users, so I'd suggest that this is the point where we should make a decision.

jpmckinney commented 3 years ago

I'm happy with whatever decision the maintainers make.

If the goal is to expand the userbase, I don't see how that goal is achieved if we leave the default as conf.py, if the static file is an essentially undocumented feature (the PR just says that sphinx.yaml is possible without even an example), and if sphinx-quickstart isn't aware of sphinx.yaml as an option. Can anyone explain? To be clear, we don't need to do all these in the PR (e.g. changing the default might wait for a later version).

@ericholscher: I don't see much repetition in this thread. As far as my position goes, I started off suggesting @tk0miya's idea in https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-820500990 was the way forward. I then suggested that non-coexistence could also work but that we should (1) add something like the path_additions and version_object that you suggested in https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-815105505 and (2) check whether we might be creating problems for users. And I most recently suggested that we could skip (2), but that we might want to consider a few other things (some might go in the PR, and some might wait for a later version, like changing the default). Some of those would address the concerns that one maintainer (@jakobandersen) expressed in https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-835461828.

ericholscher commented 3 years ago

If the goal is to expand the userbase, I don't see how that goal is achieved if we leave the default as conf.py, if the static file is an essentially undocumented feature (the PR just says that sphinx.yaml is possible without even an example), and if sphinx-quickstart isn't aware of sphinx.yaml as an option. Can anyone explain? To be clear, we don't need to do all these in the PR (e.g. changing the default might wait for a later version).

This is mostly the point. We don't want to invest a lot of work into supporting sphinx.yaml until we know that it's been accepted. We started and then halted support for it in RTD because of the indecision in this thread. Once the community knows it's possible, we will work to improve the docs, support it on RTD, write it into tutorials, etc. Until the inability to use it has been removed from the codebase, all that work can't start. We're arguing for a :+1: on the vision of this feature, adding the basic ability to the codebase, and then the downstream work can begin.

jpmckinney commented 3 years ago

Sounds fine to me!

The last things that @tk0miya wrote (https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-835426614) were:

Indeed, I and shimizukawa vote to disallow the co-existence. So there is no additional step. But it seems everyone says "v1", or "this is the first step of static configuration". Does it mean we have the next step? I think it's an important point.

The steps in my comment at https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-858896396 list some possible next steps, along with the RTD support that @ericholscher mentioned.

Additionally, @jakobandersen, a maintainer of Sphinx, votes that they can co-exist. As I saw his reaction in #9170, I guess he thinks it should not be merged if we choose the "not co-existent" configuration (Is this correct? Please let me know your opinion). I think the discussion is not agreed yet. So we have to discuss more.

The comment in #9170 is https://github.com/sphinx-doc/sphinx/pull/9170#discussion_r627647344. So, it seems the maintainers must discuss.

LecrisUT commented 1 year ago

So what's the status on this? What's blocking it?

astrojuanlu commented 1 year ago

What's blocking it is reaching consensus about a way forward.

To repeat what was summarized in the last comment a few pixels above, some Sphinx maintainers think having 2 methods is a bad idea and adds complexity, so there should be a hard transition. Some others think having 2 methods could pave a way to a smoother transition. These two visions cannot coexist, and until consensus is reached, the status quo will be maintained.

LecrisUT commented 1 year ago

I mean it's been 2 years and a few major releases in between. There is a simple solution: prioritize the YAML file and, if an include item is passed, include it as either additional YAML or conf.py format.

Much of the functionality that needed to be dynamic seems to be handled on the plugin side. Otherwise, it can be handled by Jinja templating (maybe with values from pyproject.toml or importlib.metadata) or by adding a few extra plugins, e.g. for the dynamic version.
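As a sketch of the "dynamic version" case, a plugin could look the version up from installed package metadata instead of relying on code in conf.py. `resolve_version` is a hypothetical helper name; only `importlib.metadata.version` and `PackageNotFoundError` are real standard-library APIs.

```python
from importlib.metadata import PackageNotFoundError, version


def resolve_version(distribution, default="0.0.0"):
    """Return the installed version of a distribution, or a fallback value."""
    try:
        # Reads the version from the distribution's installed metadata.
        return version(distribution)
    except PackageNotFoundError:
        # The project is not installed in the build environment.
        return default
```

A static file would then only need to name the distribution, leaving the lookup itself to the extension.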

chrisjsewell commented 1 year ago

Well that and yaml vs toml. Probably at this point, with toml in core python, it might make more sense

astrojuanlu commented 1 year ago

I mean, I don't think there's a "simple" solution (otherwise we probably wouldn't be having this conversation).

If you ask me, I do agree with @tk0miya and others that we should obliterate dynamic config completely, make it static (in whatever format we decide, please don't bring YAML vs TOML debates again), and any dynamic config should be handled by plugins/extensions/hooks/entry_points/whatever. The whole Python ecosystem is solidly moving in that direction.

LecrisUT commented 1 year ago

"Simple" I mean it is straightforward to implement on top of #9170. I don't think either camp would be objecting to this, and it is just 5-ish extra lines of code. But also the ecosystem and people's experience change, so it is worth testing the waters again every now-and-then.

About YAML vs TOML: most tools use a .sphinx.yaml-style format, which is common outside Python environments. But it's not like both cannot be included.

abhiaagarwal commented 1 year ago

FYI, for anyone who's looking to at least have DRY for sphinx as a temporary solution, take a look at sphinx-toolbox/sphinx-pyproject by @domdfcoding so you can just make the conf.py a wrapper around pyproject.toml

AA-Turner commented 1 year ago

Static configuration is a good idea. I think we can move incrementally: a first version to be committed would be that if Sphinx.toml exists in confdir, it is used as the only source. People needing more complex configuration (dynamic X/Y/Z) can still use conf.py, which isn't going away. In time, we might see how settings from Sphinx.toml and conf.py can be mixed, but for now I think let's use the exclusivity approach.

A
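The exclusivity approach described above could be sketched as a single lookup. This is an assumption about how such a check might be written, not code from Sphinx; `select_config_source` is a hypothetical name.

```python
from pathlib import Path


def select_config_source(confdir):
    """Pick exactly one configuration file from confdir (hypothetical helper)."""
    confdir = Path(confdir)

    # If the static file is present, it is the only source that is consulted.
    static_file = confdir / "Sphinx.toml"
    if static_file.is_file():
        return static_file

    # Otherwise fall back to the existing dynamic configuration.
    dynamic_file = confdir / "conf.py"
    if dynamic_file.is_file():
        return dynamic_file

    raise FileNotFoundError(f"no Sphinx.toml or conf.py found in {confdir}")
```

Because only one path is ever returned, there is no merge order to define, which is the main simplification compared with coexistence.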

jeanas commented 1 year ago

I was pretty enthusiastic for TOML, especially for pyproject.toml integration (e.g., the version could by default be read from the standard packaging metadata) and started writing a patch... but I hit a snag: TOML doesn't support an equivalent of None. There are some configuration variables that use None in their format. If Sphinx had started with TOML config from day 1, all existing confvars would be designed with this in mind (e.g., not (x, y) but {"x": x, "y": y} so that (None, y) can be expressed as {"y": y}), but it takes some work to migrate existing confvals...

So, with disappointment, I have to say that this pretty much kills TOML in favor of YAML.

electric-coder commented 1 year ago

Static configuration is a good idea.

I don't think this is a priority, a lot of users are going to want conf.py if only for simple things like pulling their library version dynamically from pkg_resources for example, or having a custom extension to pprint a collection literal.

I think it'd be better for contributors to be focused on existing bugs that likely affect the majority of current Sphinx users than wasting energy on new features like .toml that won't solve anything for the userbase that's already relying on a dynamic conf.py.

Viicos commented 7 months ago

Considering the drawbacks of using static configuration and the lack of null values in TOML, would you consider an option to define a helper for static typing? Something like:

conf.py

from sphinx.somewhere import SphinxConf

conf = SphinxConf(
    project=...,
    ...
)

As an optional alternative of course.
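One way such a helper could look is a plain dataclass. Everything here is a sketch under assumptions: no `SphinxConf` class is known to exist in Sphinx, the field list is a tiny subset chosen for illustration, and exporting the values with `globals().update(...)` is just one possible way to turn them back into module-level names in conf.py.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SphinxConf:
    """Hypothetical typed container for a few Sphinx settings."""

    project: str
    author: str = ""
    extensions: list[str] = field(default_factory=list)
    html_theme: str = "alabaster"


conf = SphinxConf(
    project="demo",
    extensions=["sphinx.ext.autodoc"],
)

# In a real conf.py, the values could be exported as module-level variables:
# globals().update(asdict(conf))
settings = asdict(conf)
```

A type checker would then flag a misspelled or wrongly typed setting before the build runs, while conf.py itself stays a regular Python file.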