Hi @teddygroves just updating you that we have found a guest editor that is a good fit for your submission; we are onboarding them and expect to start the review late this week or early next week.
In the meantime I will get the EiC checks done.
Thank you for your patience.
Hi again @teddygroves, overall I can tick off most of the boxes for editor checks. There is one issue we do need you to address before we can proceed: the package needs API documentation. I'd also suggest defining `__all__` in `__init__.py` so that it's clear what "belongs" to bibat vs other things you import, e.g. `importlib`. A bit confusing to do `>>> dir(bibat)` and see `importlib`.

I also ran into a couple of minor issues that I describe in comments below, but those are just there for feedback.

Please get those API docs added so we can go ahead with the review! Thank you.
Hi there! Thank you for submitting your package for pyOpenSci review. Below are the basic checks that your package needs to pass to begin our review. If some of these are missing, we will ask you to work on them before the review process begins.
Please check our Python packaging guide for more information on the elements below.
- `import package-name`
- `README.md` file with clear explanation of what the package does, instructions on how to install it, and a link to development instructions.
- `CONTRIBUTING.md` file that details how to install and contribute to the package.
- Code of Conduct file.
- `YAML` header of the issue (located at the top of the issue template).

In a Python 3.7 venv I got `ModuleNotFoundError: No module named 'importlib.metadata'`, which of course makes sense since this was added after Python 3.7, but I would expect `pip install bibat` to fail because of the `requires_python` option in `setup.cfg`. It shows up properly as >=3.9 on PyPI. Not sure if there's a way to fix this so `pip install bibat`
fails in a Python 3.7 venv.

Relying on `make` means that Windows users will have a harder time using bibat.

`pip` installing `requirements.txt` is likely to lead to issues on some platforms. E.g. for me I could not run the first `make analysis` to test that setup worked (as in the vignette) on Mac OS, because of a conflict with SciPy:

```
$ make analysis
. .venv/bin/activate && (\
	python -m pip install --upgrade pip; \
	python -m pip install -r requirements.txt --quiet; \
	install_cmdstan ; \
	)
Requirement already satisfied: pip in /Users/davidnicholson/opt/anaconda3/lib/python3.7/site-packages (21.3.1)
Collecting pip
  Using cached pip-23.0.1-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 21.3.1
    Uninstalling pip-21.3.1:
      Successfully uninstalled pip-21.3.1
Successfully installed pip-23.0.1
ERROR: Cannot install -r requirements.txt (line 1) and scipy because these package versions have conflicting dependencies.
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
/bin/sh: install_cmdstan: command not found
```
You might consider `nox` with `conda` environments, which might be more platform agnostic, and/or a workflow tool like `snakemake`. Nox can have its own bootstrapping issues (see comment near end of https://github.com/pyOpenSci/software-submission/issues/68#issuecomment-1426357532) but it is mostly platform-agnostic and lets one write pure Python.

Thanks a lot for the checks and suggestions and for finding a reviewer!
I will add API documentation and follow the `__init__.py` convention that you suggest as soon as possible, and also check if there is a way to fail sooner for unsupported Python versions.
The documentation page is hosted on readthedocs. I'll have a look and see if I've done anything that makes it slow to load.
I'll have to think a bit about your points re make and pip/requirements.txt, thanks for the pointers to alternatives.
Perfect, thank you @teddygroves
I've now fixed `__init__.py` as suggested above, added automatic documentation for the CLI and all exposed Python functions, and added some extra code to `setup.py` in order to prevent installation on unsupported Python versions.
I checked for possible causes of slow loading documentation but couldn't find much - just a largish (1.3M) html file. I also don't experience slow loading myself. I'll continue to keep an eye on this.
I think the main issue (api documentation) is now addressed. The other suggestions were to look at possible alternatives to make and requirements.txt - I'll look at those as soon as possible!
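For reference, the sort of fail-fast guard added to `setup.py` might look like this (a sketch of the general pattern, not bibat's exact code; the error message text is illustrative):

```python
# Sketch of a guard near the top of setup.py that aborts installation early
# on unsupported interpreters. The 3.9 bound matches the requires_python
# metadata discussed above.
import sys

MINIMUM_PYTHON = (3, 9)

if sys.version_info < MINIMUM_PYTHON:
    sys.exit(
        "bibat requires Python >= "
        + ".".join(map(str, MINIMUM_PYTHON))
        + ", but this is Python "
        + ".".join(map(str, sys.version_info[:3]))
    )
```

This only helps for source installs that actually execute `setup.py`; the `requires_python` metadata remains the mechanism pip itself checks.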
Perfect, thank you @teddygroves.
We are very fortunate that @mjhajharia has generously agreed to act as guest editor for this review. They will follow up here next, tagging in reviewers to start the review process.
update: @alizma and @oriolabril have agreed to be reviewers for the package, things will be started soon!
Hi! Reviews are due in 2 weeks, here's the template that you can post as a comment to this issue. Feel free to ask anything that might be ambiguous.
The package includes all the following forms of documentation:
`pyproject.toml` file or elsewhere.
Readme file requirements The package meets the readme requirements below:
The README should include, from top to bottom:
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a `paper.md` matching JOSS's requirements with:
Estimated hours spent reviewing: 26/04 -> 1h 40' (significant time spent going over reviewing guidelines too)
EDIT (May 15th): 2h more going over the code and paper, only thing left is reproducing the vignette
EDIT (May 19th): 1:30h setting up everything again from scratch and running the vignette close to completion
EDIT (May 24th): ~1h running the vignette from scratch successfully after the fixes and improvements to it
Just recording here that in the latest version of bibat (0.1.4) some of the comments above are now addressed.
Specifically:
Regarding dev installation instructions: there are instructions for pip installing the latest commit from the main branch, but there is no separate dev branch so that isn't linked. I'm not sure what is the best practice here but I'm very open to recommendations! Edit: I misunderstood, instructions for installing with development dependencies now added.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
`pyproject.toml` file or elsewhere.

Readme file requirements: The package meets the readme requirements below:
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a `paper.md` matching JOSS's requirements with:
Estimated hours spent reviewing: ~4
Overall, this package shows a lot of promise as a foundation for statistical analysis codes and provides a reusable, extendable base for the Bayesian workflow. Moreover, the utilities for generating reports are very nice. Here are a few general comments and things which could be improved in addition to other issue-specific comments above:
Adding a `pre-commit` configuration to be used in a developer environment would be useful. This requires the addition of a single file which sets up local checks prior to a commit and ensures that everything locally matches with CI requirements.

The projects bibat generates will be very useful for users, and you give a great example of transforming the set-up code to a custom scenario. However, in `investigate.ipynb` it doesn't seem that `arviz` is being shipped as a requirement, meaning the default notebook cannot be run on first set-up.

The current documentation is what seems to require the most attention. Specifically, some sections are entirely incomplete (as mentioned earlier) and generic documentation for the generated environment is not provided. It would suffice to automatically generate this for the default environment and include it on the overall documentation page. Additionally, the documentation should be split up into separate pages: for cleanliness, a separate page for more details on the CLI, one for the default provided functions, one for the editing example you provided, and so on. Also, given the tools for automatically generating reports, it would be nice to have a generic example of doing this, especially for people who might not be familiar with using these tools.
The comments above with regards to default function documentation on the main documentation page apply equally well to the default documentation. Since projects based on this set-up will likely have problem-specific commentaries on each function, this is an additional important feature.
You also seem to be missing a linter check in your tests (use black or something similar). Also I'm not too sure how to assess the test coverage in your CI, but maybe you could check a few other things besides running the commands for the default example?
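For illustration, a minimal `.pre-commit-config.yaml` along the lines suggested above might look like this (the hooks and pinned versions here are examples, not a prescription for bibat):

```yaml
# Example .pre-commit-config.yaml -- hook choices and revs are illustrative.
repos:
  - repo: https://github.com/psf/black
    rev: 23.3.0
    hooks:
      - id: black
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: end-of-file-fixer
      - id: trailing-whitespace
```

After `pip install pre-commit` and `pre-commit install`, these checks run automatically before each local commit, so what passes locally should also pass the corresponding CI checks.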
Thanks very much for the comments @alizma! I think they are now just about all addressed, and the package is overall a lot better as a result.
Pre-commit is a much nicer solution for linting than what I had previously.
Test coverage is now displayed in a readme badge.
Information about installing Cmdstan, as well as some lists of other dependencies, has been added to the readme and documentation.
The documentation has been split into multiple pages and generally improved. In particular, it is now explained at greater length how to document a bibat project, and the example Sphinx documentation includes automatic documentation generation for the `data_preparation` module.
A general problem that I don't really know what to do about is documenting and testing the example project code. Due to use of templates it generally isn't valid code, which makes automation difficult. What I've done is check that the example project runs from end to end without any errors, but there is definitely more that could be done.
Please let me know what you think, especially about the last point!
Hi @teddygroves, these all look like great improvements, nice work! I'll update my initial review to reflect the changes.
With regards to additional testing, since sections of the provided code are templated, maybe you could write tests with common combinations of argument types, or instead implement functional tests (or behaviour-based testing) in addition to your integration tests. For example, you could have tests that ensure bibat generates code which guarantees certain outputs when using different configuration files. I imagine that giving the user freedom to choose which configuration file to use for initialization may lead to some simple-to-catch errors in the resulting code. Overall, since the goal is to provide boilerplate code with properties you guarantee to be true, writing tests that ensure these properties in different use cases is crucial. Practically, this could include simple tests which check that classes have correct attributes, that functions have the right signatures, and that they return the expected output for particular inputs, etc. But it's probably fine as is, since your integration tests check the basic properties already. In any case, the more testing you have, the easier it will be to automatically check that bibat preserves features between versions, so it's really up to your discretion where you think more testing will be most useful.
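As a sketch of the kind of functional test suggested above (the CLI invocation, flags and file names here are hypothetical, not bibat's actual interface):

```python
# Hypothetical functional test: generate a project from a configuration file
# and verify properties the template is meant to guarantee. The "--config" /
# "--out" flags and the expected file names are illustrative assumptions.
import subprocess
import sys
from pathlib import Path


def check_generated_project(config: Path, outdir: Path) -> bool:
    """Run the (hypothetical) generator non-interactively and verify output."""
    result = subprocess.run(
        [sys.executable, "-m", "bibat", "--config", str(config), "--out", str(outdir)],
        capture_output=True,
    )
    if result.returncode != 0:
        return False
    # Properties every generated project should satisfy, per the review comment:
    expected = ["Makefile", "README.md"]
    return all((outdir / name).exists() for name in expected)
```

A pytest suite could call a helper like this once per supported configuration file, catching the "simple-to-catch" breakages mentioned above.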
All in all, I think you've done great work and I'm excited to see the projects that people build on top of bibat! Thanks for your time and please let me know if you have any additional questions.
Checked everything on my list after managing to run the vignette successfully. I recommend approving the package (starting from version >0.1.5, that is, from the next version based on the current status of the `main` branch).
As has already been commented, testing can be extended, adding a second end-to-end workflow case for example, but I think it is good enough already. Given the inherent interactivity of the library and workflow, my recommendation would go more along the lines of aiming to rerun one of the vignettes manually before each minor release. (Note I work mainly with visualization, so I am probably biased towards keeping a human in the loop and not relying 100% on automated checks.)
Hi! Firstly, thanks to @teddygroves for the great addition to software reviewed by PyOpenSci. Also, thanks a ton @alizma and @OriolAbril for taking the time to review this and suggest helpful improvements. Here are my informal reviewer 3 notes, most of these are optional, I hope you find them helpful, but they're mostly not big enough to be strict hindrances.
I would recommend mentioning the requirements.txt, preferred Python environment and similar things for contributing in the contributing guide. It's not a major thing, but it is good practice. It's not super relevant in this case, as I imagine right now most people trying to contribute would have some experience, this being a niche package, but it makes things easier for newcomers as a standard practice. The tutorial on that page seems slightly confounding, as it is a mix of using bibat for your own project versus actually contributing to core bibat code. I would also recommend conda, because cmdstanpy can sometimes be a nuisance, but generally be ambivalent to provide free choice to users, of course!
this looks very nice!
Code of conduct link is broken.
I recommend a separate page for CLI documentation instead of putting it on the Reference page.
There's a broken link to this URL on the vignettes page, similar link here
Generally, I think the documentation still feels somewhat unclear for somebody who isn't completely invested in what the package does already: for example, the `prior` and `posterior` modes. I understand the purpose, but even looking at the code it's slightly unclear what exactly they do within the scripts. The example report is very nice; I just think some more details about the API would help a lot!
The 'Documenting your analysis' page has a typo in the line "Note that, unlike, bibat does not attempt to install or set up quarto: see quarto's 'getting started' page for official installation instructions."
There are a couple of proofreading issues in the documentation, which isn't a big deal, but in case you want to submit it to other places as well, it might be nice to have a quick run through the writing.
Accessibility

I recommend using an accessibility checker on the website and looking through some issues from this report for hints:

- Color Contrast
- Improper use of preformatted text element
- Line Height (the line-height for paragraphs should be at least 1.5 times the font size)
- URL colours
Thanks @mjhajharia for the latest comments. This review process has been a really fantastic learning experience for me!
The 'fix broken links' PR above addresses some of the typos and broken links.
Regarding the modes, I think this part of bibat is hard to understand at least partly because the underlying code isn't the best. I've started to address this in the 'Refactor to add...' PR. The change requires updating the documentation and vignette so it will take a bit longer to arrive but should definitely be worth it.
Thanks for suggesting using an accessibility checker - I hadn't thought of doing this. Could you possibly tell me which one you used to generate the image above (or which one you recommend in general)?
I've now done most of the things recommended, including:
That leaves improving the documentation page's accessibility with a checker. I've made an issue for this and will start working on it as soon as possible.
I've done a few things to improve the accessibility of bibat's documentation:
I've closed issue #60 from above as I think there has been an improvement, but I'd still love to know the checker that @mjhajharia used, as I think it noticed some things that the ones I added missed.
Just checking here if there is something that someone is waiting for me to do.
I think almost all of the reviewers' major concerns have now been addressed. There are still some live issues that arose from the reviews but I think they are relatively minor. If any of these are particularly pressing I'd appreciate a pointer so I can prioritise fixing them!
hey @teddygroves 👋 ... pinging @mjhajharia here on this so they can check in on where the review is and provide a bit of guidance. thank you for all of the work that you've done on this package as a part of the review!!
Hi @lwasser, thanks for the ping, and apologies for the delay. Hi @teddygroves, thank you for the PRs; I went through them and everything looks great. This is really fantastic and I really appreciate you taking the time to incorporate the optional feedback!
Thanks for suggesting using an accessibility checker - I hadn't thought of doing this. Could you possibly tell me which one you used to generate the image above (or which one you recommend in general)?
I think you're good to go in terms of accessibility now! For that report I used SiteImprove Accessibility checker extension, I'm not sure it's the best out there but it's easy to use and get a quick overview and points out major things in case this is helpful for you later!
In general though, if you follow standard accessibility guidelines that is a good place to be in for documentation, so thanks a ton for doing that!
Again sorry for the delay, I had some personal emergencies, but this looks good to go as far as I am concerned: all the necessary things are resolved and the documentation looks great. I am very happy to see this page; I've been meaning to make a PR in the pyOpenSci docs about this as well, and I think it would be super useful.
So this is good to go as far as I'm concerned, thank you for the fantastic project @teddygroves!
Great, thanks @mjhajharia for the checker advice (Siteimprove looks really nice) and for the review generally!
I guess it's time for someone with the right privileges to change the label?
hey there @teddygroves !! 👋 congratulations on bibat being accepted into the pyOpenSci ecosystem! I'm going to help wrap things up here. there are a few last things that need to happen! And now for the formal acceptance proceedings
🎉 bibat has been approved by pyOpenSci! Thank you @teddygroves for submitting bibat and many thanks to @OriolAbril and @alizma for reviewing this package! 😸
There are a few things left to do to wrap up this submission:
[![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-review/issues/issue-number)
Finally - Teddy, if you are interested - we welcome a blog post that showcases bibat's functionality. This helps us promote your great work supporting the scientific Python ecosystem. This is of course optional, but if you wish to do this you can check out a few of our other blog posts that have been published, including pandera and moving pandas. These blogs are some of our most viewed content on the website!
Please complete the final steps to wrap up this review. Editor, please do the following:
`6/pyOS-approved` 🚀🚀🚀

If you have any feedback for us about the review process please feel free to share it here. We are always looking to improve our process and documentation in the peer-review-guide.
Hi @lwasser, that's great news!
I think I've now done all the steps. The Zenodo link for bibat version 0.1.10 is https://zenodo.org/record/8124312. I've added the badge and filled in the survey. The PR for updating packages.yml and contributors.yml is here: https://github.com/pyOpenSci/pyopensci.github.io/pull/216. I didn't know how to fill in some of the fields, e.g. `github_image_id`; I guess we can discuss that under the PR if necessary.
I'd love to contribute a blog post - for that do I just submit a pr for a new file here?
Thanks a lot @lwasser, @mjhajharia @OriolAbril and @alizma - I really appreciated your time and attention and enjoyed the review process!
@teddygroves thank you! and welcome to pyopensci!! yes please submit your pr / blog post to that link - you can look at the moving pandas and pandera blogs to get a sense of what others have done. but there are no requirements! please just showcase what your awesome tool provides!
before this is closed i'll go through all of the checks above to ensure it's all done!! i hope to do that in the next 3 days - i'm traveling right now but will followup soon!
ok @teddygroves i'm going through this all today. your package will be live on our website by the end of the day tomorrow at the latest!
thank you everyone for supporting this review!! @mjhajharia for leading the review as editor! @OriolAbril @alizma for your reviews and @teddygroves for submitting to us and doing the work via peer review!
all - if anyone would like to join our slack group please either email me at leah at pyopensci.org OR respond here if i can grab your email from your github profile!! closing this issue now! ✨
reopening as this is going through JOSS!
this issue can be closed as JOSS did deem this review out of scope. Many thanks everyone for your work here on the pyOpenSci review side of things!! Everyone involved in this review is welcome to join our slack group. if you haven't received an invitation feel free to reach out to @kierisi!! closing this now!!
Submitting Author: Teddy Groves (@teddygroves)
All current maintainers: @teddygroves
Package Name: bibat
One-Line Description of Package: An interactive template for Bayesian statistical analysis projects
Repository Link: https://github.com/teddygroves/bibat/
Version submitted: 0.1.0
Editor: @mjhajharia
Reviewer 1: @OriolAbril
Reviewer 2: @alizma
Archive:
Version accepted: 0.1.10
Date accepted (month/day/year): 07/05/2023
JOSS DOI:
Code of Conduct & Commitment to Maintain Package
Description
Bibat is a Python package providing a flexible interactive template for Bayesian statistical analysis projects.
It aims to make it easier to create software projects that implement a Bayesian workflow that scales to arbitrarily many inter-related statistical models, data transformations, inferences and computations. Bibat also aims to promote software quality by providing a modular, automated and reproducible project that takes advantage of and integrates the most up-to-date statistical software.
Bibat comes with "batteries included" in the sense that it creates a working example project, which the user can adapt so that it implements their desired analysis. We believe this style of template makes for better usability and easier testing of Bayesian workflow projects compared with the alternative approach of providing an incomplete skeleton project.
Scope
Please indicate which category or categories. Check out our package scope page to learn more about our scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):
For all submissions, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
The target audience is people who want to write software implementing a Bayesian workflow as described in this paper and are willing to do so using Python scientific libraries, Stan, cmdstanpy, pydantic, pandera and make, as well as perhaps pytest, sphinx, quarto and github actions.
I am not aware of any interactive template that specifically targets a Python-based Bayesian workflow that scales to arbitrarily many models and data transformations.
In addition, bibat is unusual compared to other interactive templates because it is 'batteries-included', providing a full working example project rather than an incomplete skeleton.
@tag the editor you contacted: https://github.com/pyOpenSci/software-submission/issues/82

Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication Options
The package contains a `paper.md` matching JOSS's requirements with a high-level description in the package root or in `inst/`.

NB: I submitted an old version of this package to JOSS and it was judged to be out of scope in terms of substantial scholarly effort. See here for the relevant issue in the joss-reviews repository: https://github.com/openjournals/joss-reviews/issues/4760. The reviewers did not definitively assess whether the package was within JOSS's scope in terms of not being a "minor utility" but noted that they had a query about this.
I think that bibat in its current form is within JOSS's scope in both of these respects but I imagine they would want to assess this separately.
Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
Confirm each of the following by checking the box.
Please fill out our survey
P.S. Have feedback/comments about our review process? Leave a comment here.
Editor and Review Templates
The [editor template can be found here][Editor Template].
The [review template can be found here][Review Template].