Implemented a study loader that is the loader class equivalent of the old load_study.py, but it runs all the new loaders instead.
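For reviewers unfamiliar with the shape of the change, the core idea can be sketched roughly like this (all class and sheet names below are illustrative, not the actual ones in the PR): a single loader class that dispatches each sheet of the study workbook to the loader responsible for it.

```python
# Hypothetical sketch: a StudyLoader that runs all the new sheet loaders
# in turn, the way load_study.py used to drive the old scripts.
# Class names, sheet names, and the data shape are illustrative only.

class AnimalsLoader:
    def __init__(self, df):
        self.df = df

    def load_data(self):
        return f"loaded {len(self.df)} animal rows"


class SamplesLoader:
    def __init__(self, df):
        self.df = df

    def load_data(self):
        return f"loaded {len(self.df)} sample rows"


class StudyLoader:
    # Maps each sheet name to the loader class that handles it.
    LOADERS = {
        "Animals": AnimalsLoader,
        "Samples": SamplesLoader,
    }

    def __init__(self, sheets):
        # sheets: dict of sheet name -> tabular data (e.g. a DataFrame)
        self.sheets = sheets

    def load_data(self):
        # Run every loader whose sheet is present in the workbook.
        results = []
        for sheet_name, loader_cls in self.LOADERS.items():
            if sheet_name in self.sheets:
                results.append(loader_cls(self.sheets[sheet_name]).load_data())
        return results
```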
I haven't implemented a command line script yet, and I'm not sure I will. Right now, I'm focused on the user experience of the submission process, which involves everything in the study sheet and all the peak annotation files. It doesn't support mzXML files, nor should it: users cannot upload those. But the msruns loader can be run at any time to load them, so that's not a big deal. Once we're at the point where users can easily create their submissions, I can focus on the differences we want for ourselves during the load process (e.g. adding new compounds to consolidated files). So the next effort is to update the excel doc returned by the "build a submission" web interface to flesh out all the associated data that can be derived from a submitted peak annotations file.
I overloaded TableLoader here, which is why I removed the model requirement in the class attributes of TableLoader. Instead of "headers", it sets sheet names. I didn't go to the trouble of renaming the variables used by TableLoader, so just be aware during review that references to headers are actually sheets. I try to make that clearer with comments and wrapper methods, but the attributes will look confusing.
There was just too much benefit to using TableLoader to not use it in this loader.
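To make the headers-versus-sheets overloading concrete, here is a minimal sketch of the pattern (the attribute and class names are hypothetical stand-ins for the real `TableLoader` machinery): the base class slot that normally declares column headers is repurposed to hold sheet names, with a wrapper method giving it a clearer name.

```python
# Illustrative sketch only: the real TableLoader is more involved.
# The point is that the "headers" attribute is overloaded to carry
# sheet names, and a wrapper method makes the repurposing readable.

from collections import namedtuple


class TableLoader:
    # Base class normally declares the expected column headers here.
    DataHeaders = None

    def get_headers(self):
        return self.DataHeaders


class StudySheetLoader(TableLoader):
    # Overloaded: these "headers" are actually the workbook's sheet names.
    DataSheetTuple = namedtuple("DataSheetTuple", ["ANIMALS", "SAMPLES"])
    DataHeaders = DataSheetTuple(ANIMALS="Animals", SAMPLES="Samples")

    def get_sheet_names(self):
        # Wrapper so call sites read as sheets, not headers.
        return self.get_headers()
```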
One thing to note: load_study.py allowed MaintainedModel's coordinator to be overridden by intercepting the kwargs and looking for argument names that change the mode of the MaintainedModelCoordinator. For example, you don't want to run buffered autoupdates in dry run or validate mode, so it used a wrapper to inspect the incoming arguments and dynamically change the coordinator's mode based on them. That worked fine for decorating the handle method of a command line script, but the load_data method inside TableLoader doesn't have any arguments other than self, so I added a way for the decorator to check whether self has an attribute with a configurable name (supplied when the decorator is applied, using the same mechanism) and use its value to affect the coordinator mode (e.g. disable deferred updates in dry run or validate mode).
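A stripped-down sketch of that decorator mechanism, for reviewers who want the shape before reading the diff (the coordinator, attribute names, and mode strings here are placeholders, not the real API): since load_data takes only self, the decorator is configured with attribute names to probe on self, and flips the coordinator mode when any of them is truthy.

```python
# Hedged sketch: a decorator that, lacking kwargs to inspect, probes
# configurable attribute names on self (e.g. dry_run, validate) and
# disables buffered autoupdates when any is set. Names are illustrative.

import functools


class MaintainedModelCoordinator:
    mode = "buffered"  # default: buffered autoupdates

    @classmethod
    def set_mode(cls, mode):
        cls.mode = mode


def defer_autoupdates(disable_attrs=("dry_run", "validate")):
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            # Check self for the configured attribute names; if any is
            # truthy, turn off deferred updates for this run.
            if any(getattr(self, attr, False) for attr in disable_attrs):
                MaintainedModelCoordinator.set_mode("disabled")
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class Loader:
    dry_run = True

    @defer_autoupdates()
    def load_data(self):
        return MaintainedModelCoordinator.mode
```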
PS - There also needs to be a refactor that removes all the old loaders and scripts once the process migrates to the new pipeline.
Affected Issues/Pull Requests
Partially addresses #839
Merges into #1012
Next PR: #1019
Review Notes
See comments in-line.
Checklist
This pull request will be merged once the following requirements are met. The
author and/or reviewers should uncheck any unmet requirements:
Review requirements
Minimum approvals: 1
No changes requested
All blocking issues resolved by reviewers
Specific reviewers: @add_username_here
Review period: 2 days
Associated issue/pull request requirements:
[x] All requirements in affected issues marked "resolved" are satisfied
[x] All required pull requests are merged (or none)
[ ] changelog.md (or no change)