econ-ark / HARK

Heterogeneous Agents Resources & toolKit
Apache License 2.0

document jupytext dependency, add configuration file #976

Open sbenthall opened 3 years ago

sbenthall commented 3 years ago

The HARK library's dependence on jupytext for development of the examples is not documented anywhere.

It also looks like there's a way to support it through a configuration file, though we have not included anything like this in the library. https://github.com/mwouts/jupytext/blob/master/docs/config.md
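As a rough illustration of what such a configuration file could contain, the sketch below writes a `jupytext.toml` at the repository root that pairs every notebook with a `py:percent` script. The `formats` key comes from the jupytext configuration docs linked above; the helper name `write_jupytext_config` and the choice of `ipynb,py:percent` as the pairing are assumptions made for illustration, not something HARK has adopted.

```python
from pathlib import Path

# Assumed pairing: every .ipynb notebook is paired with a py:percent script.
JUPYTEXT_TOML = 'formats = "ipynb,py:percent"\n'


def write_jupytext_config(repo_root: str) -> Path:
    """Write a minimal jupytext.toml into repo_root unless one already exists.

    Hypothetical helper; returns the path of the configuration file.
    """
    config_path = Path(repo_root) / "jupytext.toml"
    if not config_path.exists():
        # Never overwrite a configuration that someone has already committed.
        config_path.write_text(JUPYTEXT_TOML, encoding="utf-8")
    return config_path
```

Committing the resulting file would put the pairing rule under version control, so contributors would not each need to set it up by hand.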

This is a serious omission if we're going to start testing the examples/ as part of the library, because tacit differences in how people are using jupytext could lead to inconsistencies.

llorracc commented 3 years ago

The HARK library's dependence on jupytext for development of the examples is not documented anywhere.

This is a serious omission if we're going to start testing the examples/ as part of the library, because tacit differences in how people are using jupytext could lead to inconsistencies.

Agreed.

QuantEcon spent a lot of effort struggling with the issues that jupytext tries to solve (basically, meaningless changes in notebooks, like timestamps, constantly demanding attention), and ended up adopting reStructuredText as their "ur-language" from which everything is translated.

We should not do that. But we should come up with better ground rules so we don't keep having these problems.

I don't have a strong opinion about the solution. But I can sketch what a solution might look like.

  1. What we post to DemARK, and examples, will only be *.py files (enforced by .gitignore).
  2. When persons not very familiar with econ-ark explore our content, what they see -- via nbviewer, not via the flaky GitHub -- are autogenerated *.ipynb files stored somewhere else (so they won't confuse US). Maybe Mridul can write a GitHub action that will look for new merges into master on DemARK or HARK/examples and, when they occur, run jupytext on the py:percent files to create the updated notebooks.
  3. Part of being sufficiently competent to make contributions is to learn enough to install JupyterLab with the jupytext extension so that you can open and view .py files as notebooks, but convert them back to .py files for actual posting.
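The rendering step in item 2 could be sketched roughly as follows. This is only an outline under stated assumptions: the function name `render_commands` and both directory arguments are hypothetical, and it assumes every `.py` file under the source directory is a py:percent script. It only builds the `jupytext --to ipynb --output ...` command lines; it does not run them.

```python
from pathlib import Path
from typing import List


def render_commands(source_dir: str, output_dir: str) -> List[List[str]]:
    """Build one `jupytext --to ipynb` command per .py file under source_dir.

    Hypothetical helper: each notebook is written under output_dir with the
    same relative path as its source script.
    """
    source = Path(source_dir)
    out = Path(output_dir)
    commands = []
    for script in sorted(source.rglob("*.py")):
        # Mirror the script's relative location, swapping .py for .ipynb.
        target = out / script.relative_to(source).with_suffix(".ipynb")
        commands.append(
            ["jupytext", "--to", "ipynb", "--output", str(target), str(script)]
        )
    return commands
```

A CI job could pass each command to `subprocess.run(cmd, check=True)` after creating the target directories, which keeps the .py files as the only committed source.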

This is cumbersome. But less cumbersome than having to struggle with which one is more up to date -- the notebook or the py file every time we do an update -- especially if some of the contributors (like my teaching assistants or grad students who contribute material) may not be deeply knowledgeable about the ways of git, jupyter, jupytext, etc.

Am open to alternative suggestions that balance:

  1. There is an unambiguous Single Source of Truth
  2. It is easy for newbies to make new contributions (like, as class assignments)
  3. It is easy for us to handle the workflow
  4. It is easy to view the resulting contributions.

sbenthall commented 3 years ago

It looks like there is also a way to do syncing with a pre-commit hook:

https://jupytext.readthedocs.io/en/latest/using-pre-commit.html

I think it's better to do configuration and enforcement in a way that can be stored in version control and executed in the development environment. I don't think automated code changes are a good way to go.

I think that if we continue to require use of jupytext in any way, we need to be very specific about how we intend people to use it. I'll admit I did not know about the --sync feature until Mridul mentioned it just the other day; I had been using --to. Jupytext integrates differently with Jupyter Notebook and JupyterLab, so there's lots of room for user error. I can't say with confidence that I have the necessary competence to use jupytext and JupyterLab, and I've been managing with the two together now for a year.
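To make the difference between the two usages concrete, here is a small sketch of the command lines involved. The function names are hypothetical and the `ipynb,py:percent` pairing is an assumption; the flags themselves (`--to`, `--set-formats`, `--sync`) are jupytext CLI options.

```python
from typing import List


def to_command(notebook: str) -> List[str]:
    """One-way conversion: regenerate the py:percent script from the notebook."""
    return ["jupytext", "--to", "py:percent", notebook]


def pair_command(notebook: str) -> List[str]:
    """Record the pairing in the notebook metadata (assumed ipynb + py:percent)."""
    return ["jupytext", "--set-formats", "ipynb,py:percent", notebook]


def sync_command(notebook: str) -> List[str]:
    """Two-way update: refresh whichever paired file jupytext considers outdated."""
    return ["jupytext", "--sync", notebook]
```

The practical difference is that `--to` always writes in one direction, while `--sync` only makes sense once the files are paired and then chooses the direction itself.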

I don't like the idea of storing .py files in the repository and rendering them elsewhere. I think it's much more common to have Notebooks in repositories without a .py file or jupytext complicating things. But I don't feel very strongly about that. I am generally skeptical about workflows that involve rendering from one repository to another. I haven't seen much compelling success with that approach.

sbenthall commented 3 years ago

Another thing to consider: the examples notebooks in the repository currently get picked up by Sphinx and rendered into ReadTheDocs.

It is possible that with yet another layer of automation, we could get ReadTheDocs to render .py files into notebooks and then those notebooks into RTD. But I don't at this time know how to do that and it would be an additional possible point of failure.

The current automated test for the DemARKs tests the notebooks, not the .py files. A different way to do automated testing of the HARK examples, which is what I assumed #944 was doing, was to whip through the notebooks to make sure they don't fail to execute.

llorracc commented 3 years ago


Another thing to consider: the examples notebooks in the repository currently get picked up by Sphinx and rendered into ReadTheDocs. It is possible that with yet another layer of automation, we could get ReadTheDocs to render .py files into notebooks and then those notebooks into RTD. But I don't at this time know how to do that and it would be an additional possible point of failure.

Ah, I hadn't realized that. Am sympathetic to not wanting to create another point of failure. But am tired of dealing with the complexities of jupytext sync. (Esp. last two days; it uses timestamps to decide whether .py and .ipynb files are in sync, but very frequently the timestamps are off even though the two files ARE in sync).
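The timestamp problem can be illustrated with a toy check that looks only at file modification times. This is not jupytext's implementation; the function name `newer_of_pair` and the `tolerance_seconds` parameter are hypothetical. It shows how two files with matching content can still look out of sync, for example after a git checkout rewrites one file's modification time.

```python
import os


def newer_of_pair(py_path: str, ipynb_path: str, tolerance_seconds: float = 1.0) -> str:
    """Report which file of a pair looks newer, judging by mtime alone.

    Returns "in sync" when the two modification times are within the
    tolerance, otherwise "py" or "ipynb". File contents are never compared,
    which is exactly why this kind of check can be misleading.
    """
    py_mtime = os.path.getmtime(py_path)
    ipynb_mtime = os.path.getmtime(ipynb_path)
    if abs(py_mtime - ipynb_mtime) <= tolerance_seconds:
        return "in sync"
    return "py" if py_mtime > ipynb_mtime else "ipynb"
```

A check based on content (converting one file and comparing it with the other) would avoid this failure mode, at the cost of more work per file.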

The current automated test for the DemARKs tests the notebooks, not the .py files. A different way to do automated testing of the HARK examples, which is what I assumed #944 was doing, was to whip through the notebooks to make sure they don't fail to execute.

I think that's what we ARE doing -- are we not using pytest --nbval-lax for the notebooks in examples, as we are for the notebooks in DemARK? What are we doing instead for the notebooks in examples?
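For reference, the kind of notebook test run being discussed could be assembled along these lines. This is a sketch under assumptions: the function name `nbval_command` is hypothetical, it presumes the nbval pytest plugin is installed, and it does not describe what the HARK CI actually runs.

```python
from pathlib import Path
from typing import List


def nbval_command(examples_dir: str) -> List[str]:
    """Build a `pytest --nbval-lax` command covering every notebook found.

    Hypothetical helper: Jupyter checkpoint copies are skipped so each
    notebook is executed only once.
    """
    notebooks = sorted(
        str(path)
        for path in Path(examples_dir).rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )
    return ["pytest", "--nbval-lax", *notebooks]
```

With `--nbval-lax`, nbval executes every cell and fails on errors, but only compares outputs for cells explicitly marked for checking, which matches the "make sure they don't fail to execute" goal.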