Cantera / enhancements

Repository for proposed and ongoing enhancements to Cantera
11 stars 5 forks source link

Version-specific cantera.org release cycle #150

Closed ischoegl closed 2 years ago

ischoegl commented 2 years ago

Abstract

The upcoming removal of CTI/XML will remove substantial amounts of legacy code, which affects both Cantera/cantera and Cantera/cantera-website. To date, the repositories follow separate philosophies:

Cantera/cantera:
 ├─ `main` ... development branch
 ├─ `2.x` ... stable branches

and

Cantera/cantera-website:
 ├─ `main` ... stable branch (with *new in upcoming* features) --> pushes to `cantera.org`
 ├─ `testing` ... development branch --> pushes to `testing.cantera.org`

This feature request aims for a more consistent approach that prevents bottle-necks and merge conflicts prior to releases while resolving Cantera/cantera-website#88. Code updates for Cantera 3.0 will be voluminous, so a potential change of approach is worth discussing.

Motivation

Describe the need for the proposed change:

Possible Solutions

Use the same approach for both main repository and website, i.e.

Cantera/cantera-website:
 ├─ `main` ... pushes to `dev.cantera.org` (name tbd)
 ├─ `2.6` ... pushes/links to `cantera.org` (stable/released version)

where new releases follow the same procedure as on the main repository. There won't be pushes to stable except for bug fixes or updates that cannot wait for the next release cycle.

Some necessary tweaks in CSS for the unstable version:

Some disadvantages:

Some advantages:

Other thoughts:

References

speth commented 2 years ago

We used to have the website built entirely from content stored in the Cantera/cantera repository, which gave us the feature that's desired here of having a clear place to stage changes that are applicable only to the unstable version / next release. However, one of the major reasons for separating the website into its own repository (a project spearheaded by @bryanwweber) was the fact that this didn't provide a good way of making updates that weren't version specific.

I took a look at the PRs that have been merged to the website repository since we released version 2.5.1, and found the following.

So I think the current organization of having the default branch (which is the default target for PRs) be the branch that generate sthe main cantera.org site is a sensible choice, as merging into that branch is what should happen for most updates made outside a short period preceding a release, where it's easier to remember to be careful about selecting the target branch for a PR.

For documenting new features, I think having information available on cantera.org (rather than dev.cantera.org) is completely fine, as long as those features are marked as "New in Cantera X.Y". And though we haven't been great about doing so before the current release, I think using those markings and maintaining them even after release would be a good practice to help people see how Cantera is evolving.

You're right that there will be a substantial update to the documentation related to the removal of the CTI/XML formats. However, I think the only way to do this without creating a maintenance burden is to wait to do that until we're close enough to the 3.0 release that we actually want to start maintaining a branch for updates that should go live only with the code release. The earlier we start such a branch, the more likely it is that we won't be able to merge/cherry-pick some of the more common version-independent changes.

I appreciate the thinking on this issue. Clearly, whatever we decide to do, just documenting the workflow and keeping it in mind when we're preparing and reviewing PRs will be an improvement over the current ad hoc process.

ischoegl commented 2 years ago

@speth ... thank you for the historical context - this is certainly something that I appreciate. Also, thank you for dissecting website updates between 2.5.1 and 2.6. At the same time, the points I brought up were based on my experiences and observations during this release cycle, which left me wondering whether the work flow couldn't be improved.

Separate repositories: I think that is generally the right call. Cantera/cantera is large even without the website. But I'm not sure that there is a need for a website that covers more than the current stable release. Ultimately, this is what most of the users will refer to, so there is little benefit of adding documentation of new features ahead of time. The website as it stands is pretty good, and there definitely are drawbacks of the current situation.

The current structure is error-prone. Take for example recent changes to Science (Cantera/cantera-website#179), which broke section links that were inadvertently (and independently) discovered and fixed in Cantera/cantera#1249. Links to new features in the dev version are awkward (see issues with broken links in Blowers-Masel, and the requirement to add TODO: Update links once version x.y is released).

I don't think that stats on PR's to the current website are entirely meaningful as they are a function of the current work flow. The review cycle isn't smooth: updates are (by necessity) deferred until a later point, and reviews are likewise not a priority. The ensuing lags between preparation/review of features and the necessary documentation is frustrating, and may lead to less-than-ideal results. It's imho better to write documentation when memory of a feature is fresh, and having results appear on dev.cantera.org immediately may be a useful motivation.

[...] there will be a substantial update to the documentation related to the removal of the CTI/XML formats. However, I think the only way to do this without creating a maintenance burden is to wait to do that until we're close enough to the 3.0 release that we actually want to start maintaining a branch for updates that should go live only with the code release. The earlier we start such a branch, the more likely it is that we won't be able to merge/cherry-pick some of the more common version-independent changes.

This is a somewhat scary prospect and was part of the motivation for putting this enhancement request together. Please note that my suggestion avoids merge/cherry-pick altogether, as source code for cantera.org is released, and won't require a merge back into main.

Overall, from my side the positive aspects of updating the work flow as outlined at the top of this issue report outweigh the status quo. Ultimately, it is my interest to make for a smoother experience: the hurdle before a release appears to be unnecessarily large.

bryanwweber commented 2 years ago

I'm going to reply to a few specific quotes first and in the next comment consider the actual proposal from the OP and how we might move forward.

  • As long as indexing is prevented, there is no issue posting links to the development version in more prominent places (which is currently avoided for testing.cantera.org)

As just a tiny bit more historical context, we avoid links to testing.cantera.org because it tended to represent experiments, especially with fixing dependencies, using new HTML templates/themes, etc. As such, the state of the testing.cantera.org was never intended to be a consistent version or read by users. It was never previously intended to be a place to collect commits related to an upcoming/unreleased version, and to the extent that we did that immediately prior to this release I see as more of an accident and needing something quickly that just worked.

That said, on a separate but related note (as you suggested in the OP), we should probably do a better job of noindexing content that we don't intend to be found by search engines.

But I'm not sure that there is a need for a website that covers more than the current stable release. Ultimately, this is what most of the users will refer to, so there is little benefit of adding documentation of new features ahead of time.

I disagree with this position. I think that since we only release once-a-year (and until the last three releases, somewhat less frequently), it's important to have this documentation publicly available for people who want to use features only on the main branch. This is especially true because main tends to be fairly stable, so I think you could use main to do "real work" in general.

The current structure is error-prone. Take for example, ... which broke section links... Links to new features in the dev version are awkward...

I definitely agree that there is a good possibility of errors with links 😄 However, I would argue that the specific cases you mentioned here would have happened even if the entire website was hosted from the Cantera/cantera repo... It's just hard to manage all the links that we have all over the place. This is exacerbated by having separate repos, but especially because we're mixing Nikola and Sphinx and Doxygen. To me, the actual problem to solve here is to have a consistent way to link across documents using identifiers that are less likely to change and have our tools manage writing the actual HTML links. We can also consider introducing some linting that checks that links don't break.

The review cycle isn't smooth: ... reviews are likewise not a priority. ... having results appear on dev.cantera.org immediately may be a useful motivation.

I agree we can do a better job with timely reviews; it's basically a bandwidth and interest issue (writing new features in Cantera/cantera is more fun, etc.). I think we can also require docs for new features here before merging PRs in Cantera/cantera; this is a loose requirement that we haven't always enforced but which we could do going forward. This means we could make new docs available on cantera.org immediately, which would put it in front of more people and possibly get more people excited to use the next version.

However, I think the only way to do this without creating a maintenance burden is to wait to do that until we're close enough to the 3.0 release

Regarding the removal of CTI/XML, YAML has been available for a while now and seems quite stable. In the docs hosted in Cantera/cantera-website, most of the CTI/XML content is just links from the "science" type content to the relevant options sections (as far as I recall). In my opinion, we can remove that content immediately (or, at least, when all of that is ripped out of Cantera/cantera) with essentially no loss... we're already strongly encouraging people to use YAML, and people shouldn't be writing CTI/XML from scratch at this point, which I think is the main use of those links. The actual documentation of the options will still be available under the cantera.org/documentation link, so it'll be a little harder to find, but again, I don't see that as a huge problem.

bryanwweber commented 2 years ago

Ok, to address the actual proposal for changes here:

Use the same approach for both main repository and website, i.e.

Cantera/cantera-website:
 ├─ `main` ... pushes to `dev.cantera.org` (name tbd)
 ├─ `2.6` ... pushes/links to `cantera.org` (stable/released version)

where new releases follow the same procedure as on the main repository. There won't be pushes to stable except for bug fixes or updates that cannot wait for the next release cycle.

I think this kind of structure would work better once we actually have complete documentation for more of the features in the Cantera/cantera. I think we have pretty good docstrings, and the content in the C++ documentation is reasonable, but high-level content is largely missing.

As such, I anticipate that the majority of the updates that we'll make until we have more complete documentation are changes that would be relevant to the current release. For example, I'm thinking of adding Peng-Robinson documentation, or lifting things out of the C++ docstrings to the "Science" section. As these would apply to the stable version, we'd end up having to backport it from main to (say) a stable branch.

As far as improvements to the process, I think the hardest thing (aside from cross links) that we have to do at release time is make sure that the api-docs are updated (@speth please correct me if that's wrong, since you did all the stuff for this release). With how this repository is currently structured, I'm not sure how to improve this step... but I can imagine that we can rearrange this repository to make things better. I'll consider that and try to come up with a concrete proposal.

To fix the cross links, I'm considering switching the website back to Sphinx. It seems like combining Sphinx-generated sites will be easier than mixing generators, and I'd like to be able to use MyST Markdown, which is easy to add to Sphinx. This will also allow us to delete several of the plugins I wrote for the Nikola site that accomplish features which are built-in to Sphinx.

As @speth said, thanks for starting this discussion @ischoegl. I think it's important that we make some improvements here. So, items that I'm taking away from the discussion so far:

  1. We should document the workflow we use on the website and how that ties in to Cantera/cantera development
  2. We should enforce adding documentation for new features in Cantera/cantera as appropriate for a given pull request
  3. We should make it easier to build this site (for example, by switching to Sphinx) to make it easier to enforce 2.
  4. We should improve how the api-docs are provided to the website and stored in this repo
ischoegl commented 2 years ago

@speth and @bryanwweber ... thank you both for your thoughts.

I think the discussion was fruitful, even if my original suggestion may not have support. I know that there have been long-standing goals to improve the documentation (#6, dating back to 2019, although duplication was pointed out in Cantera/cantera-website#20 and efforts to catalog and verify classes go back to at least Cantera/cantera#267). Cantera/cantera-website#146 is a small initial step, so I hope that this will pick up some - arguably the science documents qualify as "reusable and open educational materials".

I am personally not too unhappy with the Nikola approach (it works), but can see that writing documentation could be easier / more approachable in MyST markdown. Going back to Sphinx (or even further to jupyter-book, which is just 'opinionated Sphinx') is likely a major effort; I am not keen on volunteering myself, so I don't want to force that on anyone.

The one thing I am hoping for is to be more proactive on the website: documentation of new features should be more consistently posted immediately and appropriately marked (new in Cantera X.Y), perhaps with the exception of experimental features. Likewise, we should consistently mark deprecated code immediately (removed after Cantera X.Y) - as is currently already done for CTI/XML. Transitions between versions will be arguably harder to administer, but there appears to be broad support for this work flow.

One thing that we could perhaps improve in the API docs is the formatting for the 'deprecated' admonition, and to potentially define a 'new in' admonition for features that are added in the development version. That way, there may be a way to always link to the most current API docs as they would cover both the latest stable version as well as the upcoming development version. This would also avoid the noindex situation.

PS: Also linking #115.

bryanwweber commented 2 years ago

but can see that writing documentation could be easier / more approachable in MyST markdown. Going back to Sphinx (or even further to jupyter-book, which is just 'opinionated Sphinx') is likely a major effort; I am not keen on volunteering myself, so I don't want to force that on anyone.

The bigger thing to me is being able to delete the plugin code ☺️

documentation of new features should be more consistently posted immediately and appropriately marked

I agree, but I don't think this was a huge problem for this release. Are there examples you can think of, aside from Peng-Robinson?

improve in the API docs is the formatting for the 'deprecated' admonition, and to potentially define a 'new in' admonition for features that are added in the development version.

Agreed, this would be nice

ischoegl commented 2 years ago

I agree, but I don't think this was a huge problem for this release. Are there examples you can think of, aside from Peng-Robinson?

On the Kinetics side, the infrastructure is thankfully well developed, so these updates mostly worked (although for some of my own updates it made sense to wait until the last possible moment). The ThermoPhase side is by far the largest problem area (as for Peng-Robinson, at this point the science documentation is so sparse that there isn't even a good spot to put it). As for other examples, ExtensibleReactor may deserve a mention here; other than that, I probably should have written up kinetics derivatives, but convinced myself that it was still 'experimental' until the preconditioned solver is ready.

On a separate, - albeit related, - note: I just discovered another broken link :stuck_out_tongue_winking_eye:

If we find consensus for admonitions, this should be relatively easy ... especially if we figure out a way to link to the current API docs, it would be a huge step forward. Based on where this discussion is going, it would be great if we could eliminate testing.cantera.org.

ischoegl commented 2 years ago

Update: I looked over the Sphinx documentation this morning, and suitable directives already exist (which is not entirely surprising):

.. versionadded:: 2.6
   The *spam* parameter.

and

.. versionchanged:: 2.6
   The *eggs* parameter.

We are obviously already consistently using

.. deprecated:: 3.0
   Use :func:`spam` instead.

Fwiw, the seealso directive may be useful as well.

I tested the versionadded feature in Cantera/cantera#1274, where output is currently visible in the artifacts for the GH action: https://github.com/Cantera/cantera/actions/runs/2275446481

From my perspective, a consistent usage of these features would be useful in general, and potentially even allow us to link to the current API docs (rather than stable API docs), as everything is appropriately marked.

speth commented 2 years ago

Doxygen also apparently has the @since directive for this purpose, though I haven't looked to see if it gets formatted in a useful way.

ischoegl commented 2 years ago

@speth ... I really like the @since correspondence to versionadded. There's another directive (@version) that could be used in parallel to versionchanged.

Fwiw, I added some instances of @since to Cantera/cantera#1274. There aren't any surprises in the rendered output (other than finding a pre-existing glitch that I reported in Cantera/cantera#1275).

ischoegl commented 2 years ago

@speth / @bryanwweber .... Ok. After running some tests, I believe the following modified workflow would be workable:

New Features: Use versionadded (Sphinx) / @since (doxygen) directives when features are first introduced; directives can be removed in the version after a feature becomes permanent.

Changing Features: Mark feature as deprecated before feature is changed (same as current approach), but modify to versionchanged/@version once change is applied (new behavior). Remove directives in the subsequent release.

Deprecated/removed Features: Similar as before: mark as deprecated / @deprecated before feature is slated for removal, raise error in subsequent release while updating the deprecation message, and completely remove thereafter.

I believe that using these markings, API documentation could be valid for both previous stable and current development versions. The added effort is modest, while going a long way towards having consistent and up-to-date documentation on the website - as such, it would resolve Cantera/cantera-website#88.

PS: For the CTI/XML removal, it is probably sufficient to leave a small number of strategically selected functions that raise the necessary errors while maintaining adequate documentation during the transition.

ischoegl commented 2 years ago

Closing, as there is consensus that the originally proposed idea should not be pursued. I think that using additional directives (versionadded/@since) would still make sense, but it is an afterthought in this specific feature request.