biocore / metagenomics_pooling_notebook

Jupyter notebooks to assist with sample processing
MIT License
8 stars 16 forks source link

refactored code into class objects #157

Closed charles-cowart closed 7 months ago

charles-cowart commented 7 months ago

Refactoring code into class objects so that each version of a sample-sheet can have its own list of requirements. This is based on master because the code is more familiar but it can be manually merged after into dev.

antgonza commented 7 months ago

There are CI errors here, could you pull from upstream to see if that fixes them?

charles-cowart commented 7 months ago

There are CI errors here, could you pull from upstream to see if that fixes them?

Sorry about the errors - I know what's causing them - a function that used to be global that I moved into KLSampleSheet as a method. There are tests that still test the global function that I need to modify. This is still a work in progress but I will try to keep the tests current with the code changes.

charles-cowart commented 7 months ago

@AmandaBirmingham @antgonza @wasade It's ready! Please hammer away at your convenience. There is one TODO I've marked as something I need to look into tomorrow. I also can stand to corroborate the existing sample-sheets against the examples provided in the spreadsheet. I believe the existing sample-sheets don't have SheetVersion and SheetType and that is the only glaring omission.

charles-cowart commented 7 months ago

Thanks, @charles-cowart. I think this would benefit a lot from placing string literals in variables. Many of the functions are also quite complex, and which look like they could be decomposed to discrete and easily testable units. There is what looks like violation of public/private encapsulation that I think suggests a reconsideration of the interaction with the sheets. And there were a few places, though I don't think I caught them all, where mutable types were being used when I believe immutable is the expectation. And it also looks like knowledge of subclasses is encoded in the baseclass. I tried to do what I could within suggested changes.

Thanks for the review @wasade! Many of these are actually legacy functions that were just moved into the KLSampleSheet() class scope. Now that all of the tests have been updated and are successful, I can refactor them for clarity. I agree it would be nice if the _add_metadata() and _add_data() methods were either moved back out into the package scope or modified to separate out functionality specific to amplicon.