Closed jared321 closed 2 months ago
Yes, I have seen certain repositories that have test_requirements.txt
files which specify the dependencies that need to be installed to run test. Is that the kind of solution you envision here?
That might be one way to go. To be clear, I created this Issue to make sure we don't forget about this. I doubt that this is high-priority. Do you agree?
We can think about this in three ways: 1) What is needed for GitHub actions? 2) What is needed for developers? 3) What is needed for users?
In our next meeting, we can go over what I presently have done, which provides a single solution for (1) and (2) and potentially some users.
I grepped each dependence in pyproject.toml
in the package's *.py source files and removed all those that don't appear anywhere.
I suspect that some of the original dependencies were included originally so that each Taweret wheel had all dependencies needed for all tasks (e.g., building sphinx docs, running code coverage). However, users wouldn't generally run many of those and therefore, it makes sense to move them out. This is acceptable in terms of developers because the tox tasks know what extra packages are needed and automatically install those.
In the case that some removed packages were included as dependencies of Taweret's actual dependencies, I propose that we assume that each Taweret dependence correctly specifies its dependences so that pip should automatically install the full stack. If the assumption is valid, then maintaining the list of actual dependencies should be easier this way.
@asemposki I've advance the work for this issue and have some questions.
The notebook-only dependency, seaborn
, is now specified in the book tox tasks so that the GH jupyterbook actions have everything they need. Should the need to manually install seaborn
be noted somewhere in the examples portion of the documentation? Or are the examples now really only meant for people to read in the online book?
bilby
has many layers of requirements. At the moment, I am just assuming that tox, the actions, and users will use its mandatory dependencies
https://git.ligo.org/lscsoft/bilby/-/blob/master/requirements.txt?ref_type=heads
which includes some packages used by the notebooks (e.g., corner
and pandas
).
I didn't find ptemcee
being imported anywhere in this repo. However, it appears to be needed by the examples through bilby
. It is only one of many optional sampler dependencies
https://git.ligo.org/lscsoft/bilby/-/blob/master/sampler_requirements.txt?ref_type=heads
How is Bilby
integrated into Taweret? Is the use of ptemcee
hardcoded or do users have the ability to specify a particular Bilby
-supported sampler?
Based on your responses, how should we manage these sampler dependences? Should we include some or all of the samplers as mandatory dependences in pyproject.toml
so that all users install them even if they don't use them? Should the docs tell users to manually install their samplers of choice or to just install all samplers as shown here?
https://lscsoft.docs.ligo.org/bilby/samplers.html#installing-samplers
In such a case, I would move ptemcee
to the book tox tasks.
We'll have to think about this more after the review in terms of ensuring that we catch when a sampler you use is no longer offered in Bilby
.
@jared321 I envisioned the Jupyter Book as a place to link from so users could run those notebooks in their own code space or locally if they wish, so I would think that seaborn
is then a local dependence. It could also be completely removed since it is only for colour palettes and plot styling, but I vote we leave it and make it a dependence when running notebooks locally (so we can tell users to install it for local work).
bilby
is used by Taweret as the official sampler wrapper, so when you call it, you can specify the sampler type. Dan's notebooks show the various arguments the user can specify externally so that a different sampler, number of walkers, steps, output printing, etc., are chosen. So yes, the idea would be to allow users to use any of the samplers available through bilby
, hence this is where ptemcee
is included.
I'm not 100% sure which option is best---including all samplers from the start or downloading them as needed---but given that the bilby
documentation claims that if a user tries to use one that is not installed, a "helpful message will be printed", that may be enough, meaning that we only use the default dependences of bilby
and then users can install more if they want a different sampler. Probably for cleanliness and keeping things as light as possible, this would be the best option.
@asemposki It sounds reasonable to me that we ask users to manage the optional samplers in their installation since this gives them more freedom and power to manage their needs. It also seems acceptable since using a different sampler requires that the user manually specify the use of that sampler - there should be no surprises.
If I understood correctly, then
seaborn
& ptemcee
are to be installed manually by users who want to run the example notebooks locally and we will note in the docs that this user responsibility existsseaborn
& ptemcee
will not be declared as dependences in pyproject.toml
seaborn
& ptemcee
will be declared as dependences in tox.ini
for the book taskspyproject.toml
or tox.ini
. We will note in the docs that users are responsible for manually installing all Bilby-supported samplers that they wish to use.Correct?
Yes, and I will include in those notebooks a commented out !pip install seaborn
and !pip install ptemcee
so that they can see to install those there to run the examples. It seems reasonable to me to keep dependences to a minimum.
This is now in main in preparation for the next release. Closed.
The package currently declares many external Python package dependencies and indeed many packages are installed with Taweret. However, some of these are likely not true dependencies (i.e, not needed for a minimal, functional, correct installation).
For example, GitHub actions need to test coverage, but users don't. So, no need to install
pytest-cov
. Similarly, GH actions need to generate the sphinx documentation, but most users won't. If they do, I think that it's reasonable that they install all the sphinx packages manually. Need to figure out what to do for notebook-based examples.