biocore / metagenomics_pooling_notebook

Jupyter notebooks to assist with sample processing
MIT License

Added function to parse a sample-sheet into dataframes. #245

Closed: charles-cowart closed this 5 hours ago

charles-cowart commented 2 weeks ago

Extended _parse_header() to parse every section of a sample-sheet into a dataframe and return the dataframes to the caller. This makes it easier and more robust to modify values such as 'Lane' on demand, and it is part of an effort to potentially replace the third-party sample_sheet module and its parsing method. Lane is currently read-only once it has been read into the base SampleSheet() object, so we need a robust way to change the values of an existing column, or to add the column if it is not present.

This is a work-in-progress. I need a more robust method to overwrite the values for Lane in a sample-sheet to support the tellseq code and the plugin refactoring.
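To make the idea concrete, here is a minimal sketch of parsing sample-sheet sections into dataframes and then overwriting or adding Lane. It is not the code from this PR: parse_sections() and set_lane() are hypothetical names, the example sheet is made up, and the sketch assumes that only [Data] is tabular while every other section is treated as key/value rows.

```python
import csv
import io

import pandas as pd


def parse_sections(text, tabular=("Data",)):
    """Split sample-sheet text into {section_name: DataFrame}.

    Hypothetical helper: sections named in `tabular` use their first row
    as column names; all other sections become key/value frames.
    """
    raw = {}
    name = None
    for row in csv.reader(io.StringIO(text)):
        # Drop trailing empty cells left by comma-padded rows.
        while row and not row[-1].strip():
            row.pop()
        if not row:
            continue
        first = row[0].strip()
        if first.startswith("[") and first.endswith("]"):
            name = first[1:-1]
            raw[name] = []
        elif name is not None:
            raw[name].append(row)

    frames = {}
    for name, rows in raw.items():
        if name in tabular and rows:
            header, *body = rows
            width = len(header)
            # Pad short rows and trim long ones to the header width.
            body = [(r + [""] * width)[:width] for r in body]
            frames[name] = pd.DataFrame(body, columns=header)
        else:
            pairs = [(r[0], r[1] if len(r) > 1 else "") for r in rows]
            frames[name] = pd.DataFrame(pairs, columns=["key", "value"])
    return frames


def set_lane(frames, lane):
    """Overwrite Lane in the Data frame, or add it as the first column.

    Modifies frames["Data"] in place and returns `frames` for chaining.
    """
    data = frames["Data"]
    if "Lane" in data.columns:
        data["Lane"] = str(lane)
    else:
        data.insert(0, "Lane", str(lane))
    return frames


# Made-up example sheet with no Lane column.
SHEET = """[Header],,
IEMFileVersion,4,
Date,2024-01-01,
[Reads],,
151,,
151,,
[Data],,
Sample_ID,Sample_Name,index
s1,sample_1,ACGT
s2,sample_2,TGCA
"""

frames = parse_sections(SHEET)
set_lane(frames, 3)
print(frames["Data"])
```

Because each section is an ordinary dataframe, changing Lane is a plain column assignment and does not depend on the read-only behaviour of the base SampleSheet() object. Writing the frames back out as a sample-sheet is left out of this sketch.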

charles-cowart commented 2 weeks ago

Tests are still needed.

charles-cowart commented 4 days ago

extract_data_from_sheet() can be removed, as we don't need it anymore. I will wait until Amanda's PR is merged before resolving this PR, as some of the changes overlap, e.g. the TellSeq SampleSheet() implementations. I don't need to keep mine, as they are barebones. I will want to keep and merge the DFSheet() and set_lane_in_sample_sheet() code. Tests will be needed for this code.

charles-cowart commented 1 day ago

Amanda's latest has been merged into master, and the duplicate code in this PR has been removed. Flake8 is still needed.

charles-cowart commented 1 day ago

@AmandaBirmingham I merged your branch in and updated my own branch so that it passes all tests. I think Daniel's comments have merit, so it would be good to focus on whether the new functionality here should be extended or changed, which sheets we will need for end-to-end testing, and whatever else we need.

charles-cowart commented 5 hours ago

I'm closing this without merging. If we want to use DFSheet() in the future, it will be here.