Closed aspeake closed 6 months ago
@trynthink pulled in the latest master updates and confirmed that the full workflow for input file updates works on this branch (after a couple of fixes).
A couple broad questions:
1. Is there any value in renaming the test files to test_*.py instead of *_test.py? (It seems like it would just avoid the need for the `-p` flag with `python3 -m unittest discover`, but maybe there are other benefits I'm not aware of?)
2. Once revisions are complete, are all the commits in this PR needed? Do you want to rebase and squash/fixup some of them to group changes together logically, or collapse everything into a single commit?
3. In this PR or a separate set of changes, should we segregate the modules used to update input data (e.g., cambium_updater.py, com_mseg.py) from the modules used to do an analysis (ecm_prep.py and run.py)?
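As a rough illustration of question 1, the sketch below shows how `unittest` discovery treats the two naming schemes. The file name `demo_feature_test.py` and the helper `count_discovered` are made up for this example and are not part of the repo; the only behavior relied on is that discovery defaults to the pattern `test*.py`.

```python
import tempfile
import unittest
from pathlib import Path

# Source for a throwaway test module (hypothetical, not from the repo).
TEST_SOURCE = '''
import unittest

class DemoTest(unittest.TestCase):
    def test_truth(self):
        self.assertTrue(True)
'''


def count_discovered(start_dir, pattern=None):
    """Return how many test cases discovery finds in start_dir."""
    loader = unittest.TestLoader()
    if pattern is None:
        # Default discovery pattern is "test*.py".
        suite = loader.discover(str(start_dir))
    else:
        suite = loader.discover(str(start_dir), pattern=pattern)
    return suite.countTestCases()


with tempfile.TemporaryDirectory() as tmp:
    # A file following the *_test.py naming scheme.
    (Path(tmp) / "demo_feature_test.py").write_text(TEST_SOURCE)
    default_count = count_discovered(tmp)
    suffix_count = count_discovered(tmp, pattern="*_test.py")

print(default_count, suffix_count)
```

With the default pattern the `*_test.py` file is skipped, which is why the `-p` flag is needed on the command line today.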
We have discussed each of these. The only action planned at the moment is to implement submodules (3), for which I made an issue to be addressed in a future PR: https://github.com/trynthink/scout/issues/364
@trynthink FYI: I have switched to setuptools_scm (https://github.com/trynthink/scout/pull/349/commits/b11c27ebaf276b14cdf835d455f6bf80a550e0ca), which removes the need for MANIFEST.in and enables automatic versioning. Instead of requiring manual version updates, the version is extracted from GitHub tags and, if applicable, incremented to infer a new version.
For example, the current branch will have the following version:
Here, setuptools_scm inferred v0.10 by incrementing the most recent tag, and there have been 27 revisions since the v0.10 pre-release was started. If there had been no revisions or local changes, just the tag would be output.
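For anyone wanting to check the version at runtime, here is a minimal sketch using the standard library. It assumes the package has been installed (for example with `pip install -e .`) and that the distribution name is `scout`, which is an assumption on my part; `installed_version` is a hypothetical helper, not something in the repo.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str = "scout") -> str:
    """Return the installed distribution's version, or "unknown".

    The default name "scout" is an assumption about how the
    package is registered when installed.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Package is not installed in this environment.
        return "unknown"


print(installed_version())
```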
This PR reorganizes the repo to define a `scout` package. This will allow for better reproducibility across machines, better organization and therefore more efficient development, and potential for re-use outside of the repo. A summary of the new structure for the repo is as follows:
- The `./scout` package contains all modules (e.g., ecm_prep.py)
- The `./tests` directory contains all files related to tests
- Files in `./ecm_definitions`, `./results`, `./inputs`, and `./generated` are saved to the parent directory and not installed with the package
- `./generated` is created to store intermediate files (such as ecm_prep.json)
- `./inputs` directory or an appropriate folder within `./scout/supporting_data`
With changes to filepaths, steps were taken to generalize how paths are defined for reading and writing throughout the repo. Specifically, a new class `scout.constants.FilePaths` was created to house the most commonly used paths, and it is now incorporated in many of the modules using the `pathlib` package. This results in cleaner code in many instances and a more general way to make updates.
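To give a feel for the pattern, here is a simplified sketch of a central paths class built on `pathlib`. `ExamplePaths` is a hypothetical stand-in and does not reproduce the actual `scout.constants.FilePaths`; only the directory names are taken from the structure described above, and the explicit `root` argument is an assumption made to keep the example self-contained.

```python
from pathlib import Path


class ExamplePaths:
    """Hypothetical stand-in; not the actual scout.constants.FilePaths."""

    def __init__(self, root):
        # All commonly used locations are derived from one root,
        # so a layout change only needs to be made here.
        self.root = Path(root)
        self.inputs = self.root / "inputs"
        self.generated = self.root / "generated"
        self.results = self.root / "results"
        self.ecm_definitions = self.root / "ecm_definitions"
        self.supporting_data = self.root / "scout" / "supporting_data"


paths = ExamplePaths("repo_root")
ecm_prep_file = paths.generated / "ecm_prep.json"
print(ecm_prep_file.as_posix())  # repo_root/generated/ecm_prep.json
```

Modules can then build file locations from these attributes instead of hard-coding path strings in each place they read or write.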