sphinx-doc / sphinx

The Sphinx documentation generator
https://www.sphinx-doc.org/
Other
6.36k stars 2.08k forks source link

Add static configuration (``Sphinx.toml``) #9040

Open choldgraf opened 3 years ago

choldgraf commented 3 years ago

Background

One of the challenges in getting started with Sphinx is the conf.py file, for a few reasons:

  1. It is written in Python, and so it is Python-specific, even if the person writing the documentation is using a different language.
  2. It is a fully-flexible Python script, which can be overwhelming for users not accustomed to it.

Over the years, many other configuration formats have arisen, probably the two most well-known are YAML and TOML. For example. Jupyter Book provides a layer of YAML configuration on top of Sphinx. Users have responded that this is a really friendly pattern for beginners and experts alike. I wonder if Sphinx would be interested in allowing for YAML or TOML configuration as well.

Describe the solution you'd like

In addition to the current config option of conf.py, add another option:

Allow config with YAML. I think it would be useful if Sphinx allowed for:

This file would be read in and converted to Python variables directly, as if it was written in Python (conf.py). So for example:

# In conf.yml
key: value
mylist:
  - item1
  - item2
mydict:
  dk1: one
  dk2: two

would map onto

# In conf.py
key = "value"
mylist = ["item1", "item2"]
mydict = {"dk1": "one", "dk2": "two"}

Allow conf.py to be provided simultaneously. Some Sphinx builds will still need to run custom Python code (e.g., to set up some extensions etc). In this case, authors may wish to keep their "simple config" in the YAML file, and the complex config in pure Python.

If conf.py is supplied as well as conf.yml, then the environment defined in conf.py will over-rule anything in conf.yml.

So the order of operations would be:

  1. (if it exists) Read in variables from conf.yaml
  2. Update with variables from conf.py if it exists, overwriting variables created in 1
  3. Everything else is the same...

Describe alternatives you've considered

I've tried creating a lightweight extension that allows this but didn't have success because of the way that extensions are activated.

I have also considered other documentation engines like mkdocs, which use YAML, but I'd for this to be in the Sphinx ecosystem!

cc some others who have discussed this in the executablebooks/ repo: @pradyunsg @ericholscher @chrisjsewell

EDIT: I've updated the above description to remove mention of TOML, as I don't want that to derail conversation here!

pradyunsg commented 3 years ago

using pyproject.toml does not prevent to use of multiple configurations. If they want to do that, they can use multiple configuration files like sphinx.yaml (or sphinx.toml) instead. Using pyproject.toml is not required. Additionally, almost users uses only one configurations, I guess (not researched actually).

Fair enough. Consider me convinced that this is worthwhile. :)

FWIW, if we do go the route of having Sphinx configuration in pyproject.toml, having a sphinx.toml also avoids having multiple formats for the same thing.

chrisjsewell commented 3 years ago

oi @pradyunsg

Anyway, the reason I kept my thoughts on file format choices in hidden-unless-you're-curious is because I didn't want push this issue toward that way. I should've just omitted that whole thing.

but then goes on to mention it in every comment. We get it, you are for TOML 😜 πŸ˜† but we've agreed to put a pin in that discussion for now

pradyunsg commented 3 years ago

Hehe, I thought it was only pinned until folks agreed that static metadata is useful. :P


I do like what https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-818890554 suggests. It avoids the need for adding a (new?) extension.py file, by just reusing conf.py. The only thing I'd say here is that we should be strict by default, since both the files are provided by the user, and providing an informative error message is straightforward here.

Also, +1 for (very limited) dynamic behaviour in the static metadata file. The exact syntax of it would likely depend on whether we pick TOML/YAML here. That said, I'd suggest designing the dynamic behaviour as:

# static value
version = "1.0.0"
# dynamic value
version = {
  "file": "../__init__.py",
  "regex": '__version__ ="(.+)"'
}

Using a dictionary to substitute the original string will:

chrisjsewell commented 3 years ago

I do like what https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-818890554 suggests. It avoids the need for adding a (new?) extension.py file, by just reusing conf.py.

yes but @tk0miya has repeatedly said that he does not want sphinx.yaml and conf.py to co-exist, and he's the "boss". So I'm not sure how helpful it is to keep ignoring this fact and (re-)suggesting it

chrisjsewell commented 3 years ago

I'd suggest designing the dynamic behaviour as... a dictionary to substitute the original string

yeh this makes. How would we envisage processing this? Would it be limited to certain configuration keys? or for any config with a str default? Perhaps for add_config_value you could add a keyword like allow_regex or something

pradyunsg commented 3 years ago

Hmm... While I agree that @tk0miya will make the final calls here, they did say about the 2-files-together that:

I don't understand the worth of it.

I feel like the linked comment goes into explaining how it'd work and why that's useful (as does my comment from ~12 days ago), so I wanted to +1 that as well.

(I just re-read the whole thread, and they haven't explicitly provided a line of reasoning beyond that for their preference as far as I can't tell)

That said, I don't feel strongly about this and, honestly, either way works for me. We all agree that it's useful to have per-project Python code to be able to do custom things.

pradyunsg commented 3 years ago

Would it be limited to certain configuration keys?

Yeap.

Perhaps for add_config_value you could add a keyword like allow_regex or something

Ooo. Nice question! I'm imagining a new "dynamic_processors" argument, that takes a Dict[str, Callable[ [Dict[str, str]], str]. If the value is a dictionary instead of whatever type it is supposed to be, then the keys of the user-specified dictionary would determine which the dynamic processor would get called:

def regex_handler(value: Dict[str, str]) -> str:
     assert "regex" in value  # value is the whole mapping
     # check that the user provided file, error out with a useful message otherwise
     # look for the value in the file, i.e. the dynamic behaviour which can fail
     return "1.0.0"  # assume that this got the values by reading things

def setup(app):
    app.add_config_value("version", dynamic_processors={"regex": regex_handler})

This does need some additional complexity in the configuration loading logic, but I think that'd be equivalent to what's needed for parsing special string values as well. Anyway, I guess that's what I'd imagine it looks like to the extensions and code outside of the Config/Sphinx classes. :)

PS: yea, I don't know what might be a better name than dynamic_processors but that's the best that I could come up with.

domdfcoding commented 3 years ago

I also tried converting a Jupyter Book configuration from YAML into TOML, and found that it is not possible because TOML cannot have a dictionary inside a list. (at least, in the Python toml package, this fails for me: dumps(["one", {"two": "three"}]))

Reading the TOML specs (v0.5.0 and v1.0.0) it looks like the ability to have mixed arrays was added in v1. The Python toml package only implements v0.5.0

ericholscher commented 3 years ago

I'm πŸ‘ on holding off the YAML/TOML discussion. I think the more important discussion is what to include in the files and the architecture of the implementation.

As we talk more and more about building Python-specific stuff into the config. I'd like to step back and see if we all agree on the best path forward here. I thought @chrisjsewell made a great point above:

A key point to me, is that the basic configuration is essentially in no way associated with Python.

I'd strongly support this vision. Lots of people use Sphinx and don't want to know or care about Python more than needed. This leads to a few outcomes for me:

I think @jakobandersen's comment above shows good path forward here. Basically, sphinx.yml sits "on top of" conf.py, and instead of adding more complexity of another python integration point (extension.py, magic static variables), we just fallback to conf.pyfor dynamic configuration. This feels right to me, and the easiest to understand/explain/implement. I already can imagine the confusion with users around "why would I use extension.py instead of conf.py, etc.".

I think the "v1 of static config" we ship should just do static configuration as an option for Sphinx. Once we start using it more in practice, I think we can add additional dynamic elements to it if needed, but as we've discussed it, they all sound like magic to me (even the ones I suggested above :D). Why implement more magic when we can just migrate static configuration into the sphinx.yml, and keep the dynamic and magical things in the conf.py as a first step?

Once we have more real-world feedback and experience with this system, we will be able to create a much better design than trying to figure it in this GitHub issue with 0 production experience with an actual implementation.

tk0miya commented 3 years ago

yes but @tk0miya has repeatedly said that he does not want sphinx.yaml and conf.py to co-exist, and he's the "boss". So I'm not sure how helpful it is to keep ignoring this fact and (re-)suggesting it

The most important thing is I'm not the "boss" :-p I am one of the maintainers of Sphinx, and don't have special roles. Indeed, I must admit I'm the most active maintainer. But there is no special role in the Sphinx team, I think.

Hmm... While I agree that @tk0miya will make the final calls here, they did say about the 2-files-together that:

I don't understand the worth of it.

Yes. I've surely objected to supporting two files at the same time. But, it's one of the opinions in this discussion. Not determined yet. And I'll change my mind if I'll understand the merit of the combination use.

In my thought, the new configuration format mainly helps users who are not familiar with Python to using Sphinx. So far, I've seen questions about the syntax of Python for conf.py sometimes. On the other hand, I don't think Python is not good for configuration. For the basic usage, users will use the basic syntax of Python; assignment, literals (string, boolean, number, list and dict). I believe it's not difficult for the programmers because other languages also have similar things. From this perspective, there is no advantage to use a static one for the python developers. So I still don't understand why you'd like to migrate to static file. Anything I'm missing?

Additonally, I think we can add the hybrid configurations (static and dynamic) at the later step. But we'll not be able to remove it after the release even if something bad. So I think it's better to implement single static file at first.

As a just idea: Is it helpful to add a hook (or replace) configuration loading phase to implement this as an example extension?

chrisjsewell commented 3 years ago

I am one of the maintainers of Sphinx, and don't have special roles

Well thats what I jovially meant by the "boss" πŸ˜‰; you have and the other maintainers are the only ones can approve a PR. But sure, if you are undecided, then I will duck out if the discussion for now and let others fight it out, because I'm not really bothered either way, but think it needs to be decided before we can move forward with a solution.

Is it helpful to add a hook (or replace) configuration loading phase

In principle yes then everyone can do what they want. Although (a) how would one inject this hook, given extensions are loaded by the configuration, (b) this could lead to fractured implementations, that I imagine would also be difficult for e.g. RTD

ericholscher commented 3 years ago

In my thought, the new configuration format mainly helps users who are not familiar with Python to using Sphinx. So far, I've seen questions about the syntax of Python for conf.py sometimes. On the other hand, I don't think Python is not good for configuration. For the basic usage, users will use the basic syntax of Python; assignment, literals (string, boolean, number, list and dict). I believe it's not difficult for the programmers because other languages also have similar things. From this perspective, there is no advantage to use a static one for the python developers. So I still don't understand why you'd like to migrate to static file. Anything I'm missing?

@tk0miya The other major benefit to using something that isn't executable is that it's parsable by computers, safely.

In RTD, you can see the outcomes of this:

It's more complicated for us, and other people who want to read values from the config in any way (even in Python!) To summarize:

Overall, having a configuration that isn't dynamically executed code is a major benefit to many users of Sphinx, especially people who are extending or building on top of the toolchain.

Additonally, I think we can add the hybrid configurations (static and dynamic) at the later step. But we'll not be able to remove it after the release even if something bad. So I think it's better to implement single static file at first.

I'm fine with this to start πŸ‘ I think a v1 can be "either a static or a dynamic config". Users can always add extensions: my_extension to do many dynamic things in the static config.

tk0miya commented 3 years ago

@ericholscher Thank you for the explanation. Indeed, it must be an advantage. But the advantage will be vanished if users can use both static and dynamic configurations, right? The value of static file might not be used because it can be overridden in the dynamic one:

# config.yaml
language: ja
# conf.py
language = 'de'  # Sphinx adopts this value (if we support the hybrid configuration)
tk0miya commented 3 years ago

I noticed we need to consider the way to handle container configuration values:

# config.yaml
extensions:
- sphinx.ext.autodoc
- sphinx.ext.autosummary
# conf.py
extensions = [
  'sphinx.ext.todo',
  'sphinx.ext.graphviz',
]

The extensions will be overridden, not merged. Is this an expected result?

choldgraf commented 3 years ago

Unless anyone sees a major reason to do more complex "merging" of the two types of configuration, I think that for a v1, we should just say "anything in conf.py will clobber anything in config.yaml". As a first step, cleanly separating the functionality seems reasonable.

Once we learn more about how these configurations are used, what people want of them, etc, I think it'll be easier to take baby steps towards loosening the constraint that they are strictly separated. Does that make sense to others?

One idea - let's say somebody wanted to use static config, but do some extra python stuff as part of their build. Could we document the way to really quickly write a simple extension that would be added in extensions = [ and that would do that execution? That should be a way to effectively do what many do in conf.py right now.

Or put another way - I am fine with telling users 'If you just wanna configure Sphinx, you shouldn't ever have to see Python, but if you want to do "advanced things" with Sphinx, you'll need to be comfortable writing a small Sphinx extension in Python'

pradyunsg commented 3 years ago

I actually advocate for erroring out in that case. It's not that much complexity to detect these cases, and error out with a helpful message:

config_from_static_file = {...: ...}
config_from_conf_file = {...: ...}

overlapping_keys = set(config_from_static_file) & set(config_from_conf_file)
if overlapping_keys:
    raise ConfigError(
        f"You declared {overlapping_keys} in both files. Don't do that."
    )

IMO, this does tie a clean knot on the whole "what do we do in this case" issue. We can relax this later, so we'd also leave the door open for allowing cascading or clobbering behaviors in the future.

pradyunsg commented 3 years ago

Part of why is that these users are adopting a new $thing, so having the new $thing enforce additional constraints is not a backward compatibility concern. We should make it possible for us to change our minds based on feedback; and a good way to do that is to avoid adding functionality that we might want to remove later.

I think being exceedingly strict in "v1" for overlapping configurations is much better than being lenient, since it's easier to allow previously invalid configurations (strict -> lenient) instead breaking configuration that "works" for a user (lenient -> strict).

chrisjsewell commented 3 years ago

Ok I've implemented "v1" in #9170: the addition of sphinx.yaml, that is not allowed to co-exist with conf.py, and is read statically (i.e. with yaml.safe_read) . Show of hands, to get this merged πŸ™‹β€β™‚οΈ

bjones1 commented 3 years ago

@chrisjsewell, would you consider parsing with strictyaml instead? Especially for new users, the ability to parse with a schema helps avoid a lot of the confusion / frustration from YAML's edge cases.

chrisjsewell commented 3 years ago

Err, I've never used it before, so I really don't know, and obviously it is introducing new dependencies to sphinx, which isn't ideal (pyyaml was already an indirect dependency). I'm not necessarily against it, but I guess I would defer to @tk0miya

@pradyunsg I'm sure you will love this read πŸ˜† https://hitchdev.com/strictyaml/why-not/toml/

pradyunsg commented 3 years ago

Yea, I've read that before. The author is basically trying to bash TOML in the documentation to argue that strictyaml is the best and has no flaws at all. While also using TOML for something it's not designed for, to show strictyaml in the best light? And there's random references to general programming practices too. πŸ€·πŸ»β€β™‚οΈ

I assume that tone is because the author felt like TOML is "threatening" in some form. ;)

would you consider parsing with strictyaml instead

Unless we're using a file extension other than .yaml, I'm a -1 to this personally.

bjones1 commented 3 years ago

Thanks for your thoughts. My interest in strictyaml comes mainly from a desire to make whatever format is chosen as user-friendly as possible, especially for new users / adopters. Providing not simply a format (TOML / YAML / conf.py / etc.) but a schema helps do this. For example, a schema can:

Here's a short example of a YAML file with these problems:

# Oops, this will be read as a float instead of a string.
# Using "version: 0.1.0" makes it a string again -- a bit confusing.
version: 0.1
# Pilot error: this should be a list of strings.
exclude_patterns: library/xml
# Still accept this as a bool, even though it's presented as a string.
nitpicky: "False"

Here's the results:

After making exclude_patterns a list, strictyaml produces {'version': '0.1', 'exclude_patterns': ['library/xml'], 'nitpicky': False} -- correct types everywhere.

Test code

import yaml
import strictyaml
from strictyaml import Map, Seq, Str, Bool

yaml_string = """
# Oops, this will be read as a float instead of a string.
# Using "version: 0.1.0" makes it a string again -- a bit confusing.
version: 0.1
# Pilot error: this should be a list of strings.
exclude_patterns: library/xml
# Still accept this as a bool, even though it's presented as a string.
nitpicky: "False"
"""

schema = Map(dict(
    version=Str(),
    exclude_patterns=Seq(Str()),
    nitpicky=Bool(),
))
print(yaml.safe_load(yaml_string))
print(strictyaml.load(yaml_string, schema).data)
pradyunsg commented 3 years ago

I'm sorry, but are you advocating for schemas or for strictyaml?

Schemas aren't special to strictyaml in any way. There's lots of tooling (like glom, jsonschema etc) if you want to provide user-friendly messaging on validation errors. Sphinx already stores an in-memory schema to validate user inputs.

Sphinx extensions contribute configuration and the only way to account for that is to have that be generated dynamically, at which point you lose the biggest class of benefits for schemas: tooling integration (such as autocomplete). And the only thing it leaves you with is automated validation. That's already available to Sphinx's configuration loader and validated by it today.

bjones1 commented 3 years ago

That's a good point; thanks. As long as we're doing good validation / type conversion, that should help.

tk0miya commented 3 years ago

We need to load extensions to know what configuration variables are defined. But the list of extensions is defined inside the configuration file. It's like a chicken-egg question.

hukkin commented 3 years ago

This thread is rather long so I thought I'd try to do my best at wrapping things up :smile:

@chrisjsewell made an excellent PR that I'd love to see merged. Apparently merging is blocked by the lack of consensus in this thread. So I thought I'd walk us all through the controversial parts:

  1. The Great Format War: YAML vs TOML\ I guarantee that even if this thread stays open for 15 years and 2051 more comments there will not be a consensus on this. :smile: IMO the maintainers will have to pick a format at this point. All necessary info, pros and cons of each format are already somewhere in this thread. Honestly this is bikeshedding, and even if the "inferior" format (namely YAML, lol) is chosen, it's outshadowd by the fact that we finally have a standardized static configuration format.

  2. Co-existence of dynamic and static conf\ Maintainers @tk0miya and @shimizukawa have expressed that they would not want to allow co-existence of dynamic and static conf at this point. @chrisjsewell's PR follows their advice. I think that this is the obvious way to begin with: transition to allowing co-existence and value overriding later is a non-breaking change, and can be discussed later to keep the scope of this issue and PR manageable.

  3. Whether or not to have dynamic behavior in static configuration file\ @chrisjsewell's PR doesn't introduce any dynamic behavior. This IMO is great at this point. Adding this later if necessary is a lot less breaking than removing it. Another decision that we can scope out to keep this issue manageable.

So IMO, @chrisjsewell's PR makes uncontoversial choices when it comes to points 2. and 3. deferring the additional features until later while leaving the migration path non-breaking.

The only true PR blocking action item in this thread to me seems to be point 1. i.e. the format war, where I feel all sides have been heard, and maintainers will have to pick a format.

bjones1 commented 3 years ago

+1 for merging @chrisjsewell's PR.

pradyunsg commented 3 years ago

I think I'll note that:

Beyond that, I don't have more energy to spend on this discussion and hence am unsubscribing. Thanks for the discussion folks! :)

hukkin commented 3 years ago

I dislike the choice of not allowing both sphinx.toml and conf.py files to co-exist without overlap

@pradyunsg I agree with you. Maybe you can also agree that @chrisjsewell's PR doesn't rule out adding this later, and that the addition would be non-breaking. And that it's nice to get the first iteration merged. And that it's painful to have a discussion where every comment consists of three bullet points :smile:

ericholscher commented 3 years ago

I'm πŸ‘ on merging @chrisjsewell's PR as a starting point, once it has docs :)

Maybe you can also agree that @chrisjsewell's PR doesn't rule out adding this later

This is part of my view. Let's get something working and build on it in a way that doesn't foreclose further options.

Especially if users can define extensions in the sphinx.yaml, that should at least allow people to hack things together (eg. exec(conf.py) in the crudest case).

choldgraf commented 3 years ago

I'm also +1 on merging the PR

On YAML vs TOML I feel like ubiquity of format weighed against "better format for config" kinda evens out, and YAML feels like it'll be the "least surprising" option for users, so i think it's a reasonable choice w/ pros and cons on both sides.

I want both config files to coexist as well but don't believe that we should block iterative progress on that sticking point. Better to take steps forward and reflect+iterate once we've used the new config option. I don't believe that @chrisjsewell's PR closes any important doors so I am πŸ‘

mmcky commented 3 years ago

@hukkinj1 what a great concluding summary!

While mainly an observer of this thread, I am also +1 on merging @chrisjsewell PR.

chrisjsewell commented 3 years ago

I'm πŸ‘ on merging @chrisjsewell's PR as a starting point, once it has docs :)

@ericholscher I was sneakily planning to do this after merging, in case I was wasting time on a rejected PR πŸ˜„

Whats the minimal you would like adding?

Maybe at the end of https://www.sphinx-doc.org/en/master/usage/quickstart.html#basic-configuration, something like.

.. versionadded:: 4.0.0

   It is now possible to use a ``sphinx.yaml``,
   in-place of the ``conf.py`` (the two may not co-exist).
   This should contain a valid `YAML <https://yaml.org>`__
   dictionary of the configuration variables.
ericholscher commented 3 years ago

I think mostly just something that mentions it exists and explains the semantics (eg both can’t exist at the same time). We can build on it later and make other decisions around where to include it.

tk0miya commented 3 years ago
  1. The Great Format War: YAML vs TOML

My opinion is not changed since I commented at https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-814961565:

IMO, YAML is widely used than TOML. The one of the goals of this issue is supporting commonly used file format as configuration of Sphinx. So I'd like to vote to YAML.

If my understanding correct, almost people in this issues voted to YAML. So I'd like to choose it.

  1. Co-existence of dynamic and static conf Maintainers @tk0miya and @shimizukawa have expressed that they would not want to allow co-existence of dynamic and static conf at this point.

Indeed, I and shimizukawa vote to disallow the co-existence. So there is no additional step. But it seems everyone says "v1", or "this is the first step of static configuration". Does it mean we have the next step? I think it's an important point.

Additonally, @jakobandersen; a maintainer of Sphinx votes they can co-exist. As I saw his reaction in https://github.com/sphinx-doc/sphinx/pull/9170, I guess he thinks it should not be merged if we choose the "not co-existent" configuration (Is this correct? Please let me know your opinion). I think the discussion is not agreed yet. So we have to discuss more.

  1. Whether or not to have dynamic behavior in static configuration file @chrisjsewell's PR doesn't introduce any dynamic behavior. This IMO is great at this point. Adding this later if necessary is a lot less breaking than removing it. Another decision that we can scope out to keep this issue manageable.

I think nobody requests that feature. So no more discussion will be needed, I think.

BTW, does somebody uses the sphinx.yaml in the real project? I'd like to see sphinx.yaml can represent the configuration for the real project. It's important before merging it.

choldgraf commented 3 years ago

a few quick thoughts:

But it seems everyone says "v1", or "this is the first step of static configuration". Does it mean we have the next step?

I think the group has rough consensus that sphinx.yaml would be useful and in-scope for Sphinx, and does not have rough consensus that sphinx.yaml should co-exist with conf.py. However, the question of "can they co-exist" does not have to be made right now (since it could theoretically be decided in the future after more discussion). Because of this, I believe it is reasonable to make an incremental improvement (make sphinx.yaml possible) and, if others wish, have focused discussion around "co-existence" in the future and in a different issue.

does somebody uses the sphinx.yaml in the real project

Well no, because it's not currently possible. However a similar example is Jupyter Book, which currently uses a _config.yml file that it translates to Sphinx configuration at build time (though, it must pass each configuration value as command-line options with -D key=value because there's no other way to do this). For some examples, browse the Jupyter Book gallery as all of those books have YAML configuration.

jakobandersen commented 3 years ago
  1. The Great Format War: YAML vs TOML If my understanding correct, almost people in this issues voted to YAML. So I'd like to choose it.

Personally I don't really care about YAML vs. TOML, whichever works and is intuitive. Though, https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-832655361 and earlier comments indicated some typing and value coercion issues/gotchas that I don't know the details of. There should be tests for such cases, and enough warnings in the documentation for this. #9170 does have a few basic tests, but I think the edge cases should be there as well so we can catch if the behaviour unintentionally changes.

  1. Co-existence of dynamic and static conf

Additonally, @jakobandersen; a maintainer of Sphinx votes they can co-exist. As I saw his reaction in #9170, I guess he thinks it should not be merged if we choose the "not co-existent" configuration (Is this correct? Please let me know your opinion). I think the discussion is not agreed yet. So we have to discuss more.

Yes, I think it is a mistake to merge without an incremental path for the user. This is both for the "old" user who would like to clean up conf.py and transition as much as possible to sphinx.yaml (or whatever name), but also for the new user who find the static file neat, and suddenly need a dynamic value. Without allowing co-existence the answers I see would either be 1) convert everything to conf.py in a non-incremental manner, or 2) create an extension that injects the config value. (1) I would find really annoying and deem a bad user experience. For (2), I guess that is technically possible, but at this moment I don't know how to do it without digging into the documentation or perhaps even the source code. For such a fundamental thing as for example loading the version variable dynamically I think it is simply a bad interface if you started out with sphinx.yaml.

You could argue that a different easy extension would be nice instead of conf.py, but for compatibility I don't see conf.py going away for a very long time, so why not allow conf.py as that easy extension as it is today?

While co-existence can be implemented later, I think the above is enough to argue that users might perceive the current PR as only half a feature. @tk0miya is working hard on getting 4.0 out, so there is no rush to merge anything right now, even if 4.1 is the target.

BTW, does somebody uses the sphinx.yaml in the real project? I'd like to see sphinx.yaml can represent the configuration for the real project. It's important before merging it.

I think this a good point. You could hax this feature in by having a minimal conf.py that uses a YAML parser to load sphinx.yaml and defines the contained config. Some cases could even be converted to tests in Sphinx.

choldgraf commented 3 years ago

In case it is helpful, I tried to make an extension that would allow you to use YAML configuration a while back. It would just look for a YAML file and try to update the configuration environment. Here's what it does:

https://github.com/executablebooks/sphinx-yaml-config/blob/main/sphinx_yaml_config/__init__.py

The idea was that you could then define a minimal conf.py file and then have the rest of your configuration defined in YAML. For example, here's the conf.py for that site

https://github.com/executablebooks/sphinx-yaml-config/blob/main/docs/conf.py

However I ran into several issues:

If anybody has thoughts on that extension (e.g., a better way to initialize or implement it) let me know. Though it still feels quite hacky :-/

hukkin commented 3 years ago

If my understanding correct, almost people in this issues voted to YAML. So I'd like to choose it.

Cool, so the format war has come to an end, it seems, with YAML as the winner :peace_symbol:

The only remaining PR blocker seems to be that @jakobandersen wants conf.py and sphinx.yaml co-existence from the get-go and @tk0miya and @shimizukawa voted against it earlier. How strongly do you @tk0miya and @shimizukawa hold your opinion about this matter? Would you be willing to merge a PR with conf co-existence?

Yes, I think it is a mistake to merge without an incremental path for the user. ... While co-existence can be implemented later, I think the above is enough to argue that users might perceive the current PR as only half a feature.

@jakobandersen I personally don't see how this is mistake nor a half feature β€” it's static configuration for those who only need static configuration. Adding the possibility to have a sphinx.yaml takes nothing away from those who need dynamic conf. They can stick to their conf.py. I think the core audience of this feature (people with super simple docs, perhaps non-Python users), couldn't care less about conf co-existence because they just want the sphinx.yaml anyways. They don't want to program Sphinx.

There's a chance that you, as an advanced user, overestimate the need for dynamic configuration and underestimate the need for simplicity and beginner friendliness. MkDocs is unaware of the concept of dynamic configuration, yet is still super popular and serves most use cases perfectly.

BTW, does somebody uses the sphinx.yaml in the real project? I'd like to see sphinx.yaml can represent the configuration for the real project. It's important before merging it.

@tk0miya Here's a conf.py that I personally would want to convert to a shpinx.yaml. It contains zero dynamic configuration, and I already use a version bump tool to update the version, so I don't care about dynamism for reading the version either.

chrisjsewell commented 3 years ago

But it seems everyone says "v1", or "this is the first step of static configuration". Does it mean we have the next step? I think it's an important point.

The key point is that #9170 is a change that realises the core goal of the feature request, and "locks-in" some key design decisions: the file format, the file name, the file parser. There should be no later changes to these 3 points, since that would be backward incompatible. At the same time, it also allows the relationship between sphinx.yaml and conf.py to be discussed/altered later in a backward compatible manner (since no one will already have both present). Such a change though is not required to achieve the core goal of the feature, and need never be implemented if we cannot get consensus, i.e. #9170 provides a fully-formed, standalone feature on its own. (On the flip-side, if you allow for co-existence now, you can never then change the relationship between sphinx.yaml and conf.py without breaking some existing documentation.)

So to take stock:

Firstly, in addition to the arguments made by others above, there is currently 27 πŸ‘ on the initial comment. So IMO this gives a clear indication that there are people that would like to see this feature.

With relation to #9170, correct me if I'm wrong, but from the current comments I conclude the following stances:

Given that it feels to me that it would be unlikely for any amount of testimony to cause @jakobandersen to change he's stance, then I am confused about what is necessary to move this forward: does this mean then that we can never have this feature, if there is never 100% agreement?

ericholscher commented 3 years ago

Thanks for the summary Chris. I think my view has currently shifted to "Let's implement this feature without co-existence and see what happens". So I'm -0 on co-existence in the first version, and want to see how real users make use of the feature.

I agree that the YAML file as currently implemented in #9170 is a full feature and valuable. I imagine the community will quickly implement an extension that allows dynamic config on top of the YAML file, and advanced users will be able to have this functionality while still relying on a mostly static config file, and average users will use a fully static config, which is my preferred outcome.

choldgraf commented 3 years ago

Gentle nudge on this one - is there a path forward here? It looks to me like these three items are the main bullet points:

  1. We have an open PR to add support for sphinx.yaml. This does not allow conf.py and sphinx.yaml to co-exist.
  2. Lots of people think this would be a useful feature (at least, the 28 πŸ‘ at the top comment)
  3. Several people think that conf.py could also co-exist with sphinx.yaml but that it shouldn't block @chrisjsewell's PR
  4. Several people think that conf.py should not co-exist with sphinx.yaml AKA, that we should implement https://github.com/sphinx-doc/sphinx/pull/9170 and nothing else.
  5. One person thinks that conf.py must co-exist with sphinx.yaml and (I think?) believes the feature shouldn't exist unless this is possible.
jpmckinney commented 3 years ago

As a just idea: Is it helpful to add a hook (or replace) configuration loading phase to implement this as an example extension?

I believe this is the best way forward for this issue: that is, for Sphinx to add a way for an extension to take responsibility for loading the configuration. That way, ExecutableBooks can write an extension that reads sphinx.yaml, @pradyunsg can write an extension that reads sphinx.toml, etc.

Obviously, there are questions to sort out (how does this extension get declared? what if multiple extensions take responsibility? what is the order of precedence? etc.), but I don't think a static configuration file is good for this project, discussed below.


I don't understand the problem. The conf.py file that is generated by running sphinx-quickstart -p Project -a Author and pressing Enter repeatedly is valid TOML. If you replace = with :, it is valid YAML.

A user does not need to learn Python or programming to edit conf.py. For a user to edit YAML or TOML files, they need to learn a few things like which values need to be quoted or bracketed, how variable names are separated from literal values, how to indicate lists and mappings, etc. The same effort is required to edit conf.py in simple ways.

If a user has no need for dynamic configuration, then there is really no difference between Python literals, TOML and YAML. Is changing the extension from .py to .toml or .yaml and using : instead of = of such great benefit that it is worth the downsides (below)?

It looks like most contributors to this issue (outside the Sphinx maintainers) are maintainers/users of Executable Books and maintainers of ReadTheDocs. While these are important projects, they are not the universe of Sphinx users. I would not rush to merge the PR without more discussion.

I think not enough attention is being given to the kinds of issues raised by @jakobandersen in https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-835461828 (discussed more below) and @tk0miya in https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-820500990 (similar to my comments above).

With respect to the various proposals:

One file in one syntax (i.e. no change) avoids these three types of user challenges that arise from co-existence, which I can summarize as:

I can come up with more classes of user challenges that come with having two configuration files, but, really, I don't see what is wrong with conf.py, since in its basic form it is valid TOML and (after sed 's/ = /: /' conf.py) is valid YAML. If folks really want alternative formats, @tk0miya suggested an option (just as an idea) in the quoted text above.

FWIW, the comparison to setup.py and setup.cfg is not relevant, since it is generally programmers who write Python packages (and who are therefore capable of understanding the difference between the file formats, etc.), whereas this issue is intended to help non-programmers.

chrisjsewell commented 3 years ago

Oh well I guess that kills any slim chance of having this issue resolved

whereas this issue is intended to help non-programmers.

Its funny, I was looking for any other points that had not been previously been mentioned, on why it is desirable to have a non-Python configuration (outside the fact that literally every other tool uses such a config), and a top answer was: https://softwareengineering.stackexchange.com/a/351128: "So using a script for configuration is likely OK if the audience of your tool is developers, e.g. Sphinx config". That sums it up for me; most casual users see sphinx as a developer only tool and will just opt for something more broadly appealing like mkdocs (before even getting to any of the points you suggest). That's fine, I guess it is the focus of sphinx, I'd just hoped for something more.

jpmckinney commented 3 years ago

My sense is that the Sphinx maintainers would also have preferred a simple config file (INI was mentioned), with anything more complex being implemented either as extensions, or as new configuration variables - but unfortunately Sphinx needs to contend with its legacy.

conf.py gives users a lot of power, and there are several examples in this issue of simple and complex configurations in real projects that involve executing code. Removing power from users is rarely a popular change.

I think it’s possible for Sphinx to support static configuration files, but I’m not sure the current PR solves more problems than it creates.

To make things simpler for users, it should at least be possible to choose to generate a static configuration file when using sphinx-quickstart (see also https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-812637341).

astrojuanlu commented 3 years ago

Just a quick comment on one of the points:

It looks like most contributors to this issue (outside the Sphinx maintainers) are maintainers/users of Executable Books and maintainers of ReadTheDocs. While these are important projects, they are not the universe of Sphinx users.

Unfortunately, nobody knows "the universe of Sphinx users", since as far as I know the Sphinx project does not conduct user surveys of any sort. The people here proposing a static configuration have collected a handful of opinions though, which is better than nothing.

And, more importantly, as @chrisjsewell points out we are trying to expand the universe of possible Sphinx users. If we only consider the ones already on it, we risk making suboptimal decisions (mandatory reference to the "Survivorship bias" airplane picture).

Of course, we need to do this without alienating or confusing existing and new users. But I think there are ways to move forward and offer something better or simpler without getting stuck on "Sphinx legacy".

jakobandersen commented 3 years ago

Hmm, @jpmckinney quite effectively lays out (https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-855178662) the problems with both coexistence and non-coexistence. No matter what it can be quite harmful for people getting into using Sphinx.

One way forward would be to get some experience with static configuration without direct support, to see which problems users run into. I think this may already be doable with very little "magic" to be copy-pasted: In conf.py put

import myFavouriteStaticSphinxConfFormatLoader
for k, v in myFavouriteStaticSphinxConfFormatLoader.load(__file__):
   globals()[k] = v

where myFavouriteStaticSphinxConfFormatLoader is a third-party module (not an extension, no Sphinx entities are needed). If it is no be distributed with the project, then a line with sys.path.append() could be added first.

hukkin commented 3 years ago

Hmm, @jpmckinney quite effectively lays out (#9040 (comment)) the problems with .... non-coexistence.

I honestly fail to see the point regarding non-coexistence. The argument seems to be that it's somehow overly laborious to migrate from YAML/TOML to Python but

chrisjsewell commented 3 years ago

To make things simpler for users, it should at least be possible to choose to generate a static configuration file when using sphinx-quickstart

Oh indeed πŸ‘. But, in terms of my PR, it was a chicken and egg situation, whereby I didn't want to spend a long time writing the "complete solution", went it was very possible that it would be rejected

Hmm, @jpmckinney quite effectively lays out (#9040 (comment)) the problems with both coexistence and non-coexistence. No matter what it can be quite harmful for people getting into using Sphinx.

again, respectfully, I very much disagree; in that the status-quo is already harmful for people getting into sphinx. I don't want to sound too negative on sphinx, on the contrary, I think its very powerful but with a lot of untapped potential.

One way forward would be...

This feels to me like a self-fulfilling prophecy; you're going to ask users to install another package on top of sphinx, then they have to still create a conf.py, plus another configuration file. It's not going to be surprising when users report this as a poor experience.

jpmckinney commented 3 years ago

I think non-coexistence could be an option if:

  1. New configuration options are added to Sphinx for the common uses of Python-only features. Some examples already: https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-815105505 https://github.com/sphinx-doc/sphinx/issues/9040#issuecomment-818873536
  2. A bit of research shows that very few popular extensions, configurations, etc. need/use Python-only features in conf.py, outside those added through the first bullet.

If that's the case, then users will indeed rarely have to switch format, in which case the downsides to non-coexistence are appropriately mitigated. The static file could then be the default format, since leaving conf.py as the default format doesn't serve the broader goal of expanding the universe of possible Sphinx users.

Without step 2, we're just theorizing about what users are likely to do, and making a decision purely on strength of belief.

Update: To clarify, if a project has a hugely complicated conf.py, we can assume that they are not the kind of user who would have started with a simple static file. Hence the word "popular" in step 2.

In the same post an argument is made about how a YAML/TOML is almost valid Python already, and how replacing one character does the trick

It's the other way around - a Python file like the default conf.py that only does assignments of literal values is valid TOML. The following YAML (YAML has many syntax options, but the following is most common) is obviously not valid Python, and converting it would require learning Python's syntax:

mylist:
  - item1
  - item2
mydict:
  dk1: one
  dk2: two

I'll start a bit of my own research here. Any help will advance this issue: