updated load_sample_sheet().

biocore / metagenomics_pooling_notebook

Jupyter notebooks to assist with sample processing

MIT License

Updated load_sample_sheet() to determine the proper SampleSheet() child class to load a file into, based on the file's Assay type, SheetType, and SheetVersion. If no child class can be assigned, load_sample_sheet() will continue to raise the generic 'invalid-sample-sheet' message. Otherwise it returns a sheet object, and the user is responsible for running its validate_and_scrub() method and assessing any error messages.
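For reviewers who want a picture of the dispatch described above, here is a minimal sketch. It is not the code in this PR: the class names, the `_HEADER` attribute, the header values, and the use of `ValueError` are all assumptions made for illustration.

```python
class SampleSheet:
    """Hypothetical stand-in for the real base class."""

    _HEADER = {}

    def __init__(self, path=None):
        self.path = path


class MetagenomicSheetV100(SampleSheet):
    # Assumed header values; the real child classes may differ.
    _HEADER = {'Assay': 'Metagenomic',
               'SheetType': 'standard_metag',
               'SheetVersion': '100'}


class MetatranscriptomicSheetV0(SampleSheet):
    # Assumed header values; the real child classes may differ.
    _HEADER = {'Assay': 'Metatranscriptomic',
               'SheetType': 'standard_metat',
               'SheetVersion': '0'}


# Defined after the classes so that every name is already bound.
CANDIDATES = [MetagenomicSheetV100, MetatranscriptomicSheetV0]


def load_sample_sheet_sketch(path, header):
    """Return an instance of the first child class whose expected
    Assay, SheetType, and SheetVersion all match the parsed header."""
    for cls in CANDIDATES:
        if all(header.get(k) == v for k, v in cls._HEADER.items()):
            return cls(path)
    # No child class matched: fall back to the generic error.
    raise ValueError('invalid-sample-sheet')
```

The caller would still need to run validation on the returned object; this sketch only covers class selection.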

The legacy _parse() method for sample sheets is perhaps a little cryptic and relies on the csv package. I experimented with implementing a separate parse_header() function using pandas and its read_csv() method, since the lab is more familiar with it. I believe it works pretty well, and it doesn't rely on any legacy functionality in the third-party 'sample_sheet' package we appear to be using for _parse(). This might make it easier to move off that package in the future.
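As a rough illustration of the pandas approach, the sketch below pulls the key/value pairs out of a `[Header]` section with `read_csv`. It is not the function from this PR: the name `parse_header_sketch` is hypothetical, and it assumes the sheet is rectangular (every row has the same number of commas), as a spreadsheet export usually is.

```python
import io

import pandas as pd


def parse_header_sketch(text):
    """Return the [Header] section of a sample sheet as a dict."""
    # Read every cell as a string and keep empty cells as ''.
    df = pd.read_csv(io.StringIO(text), header=None, dtype=str,
                     keep_default_na=False)
    first = df[0].str.strip()

    # Locate the row that opens the [Header] section.
    starts = first[first == '[Header]'].index
    if len(starts) == 0:
        raise ValueError('no [Header] section found')
    start = starts[0] + 1

    # The section ends at the next '[Section]' row, or at end of file.
    later = first.iloc[start:]
    ends = later[later.str.startswith('[')].index
    stop = ends[0] if len(ends) else len(df)

    # Drop rows whose first cell is empty, then pair key with value.
    section = df.iloc[start:stop]
    section = section[section[0].str.strip() != '']
    return dict(zip(section[0].str.strip(), section[1].str.strip()))
```

The resulting dict could then feed whatever logic picks the child class from Assay, SheetType, and SheetVersion.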

The types list inside of _parse_header() seems like it would be better defined at the top of the file; however, Python executes module-level statements from top to bottom, so a list placed above the class definitions would refer to names that are not yet defined and would raise a NameError. Looking for input from reviewers.
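The ordering issue, and two possible ways around it, can be shown in a few lines. The class names here are placeholders, not the classes in this PR.

```python
# A module-level list is evaluated as soon as the line is reached,
# so referring to a class defined further down fails.
try:
    EARLY_TYPES = [LaterSheet]
except NameError as e:
    early_error = str(e)


def sheet_types():
    """Option 1: build the list inside a function. The names are
    looked up only when the function is called, by which time the
    module has finished loading."""
    return [LaterSheet, OtherSheet]


class LaterSheet:
    pass


class OtherSheet:
    pass


# Option 2: keep a module-level list, but place it after the classes.
SHEET_TYPES = [LaterSheet, OtherSheet]
```

Option 1 keeps the definition near the top of the file, while option 2 keeps a plain constant at the cost of it living below the classes.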

updated load_sample_sheet(). #243

@.**** commented on this pull request.

column and raise an Error if not. By convention it should be, once legacy comments and whitespace rows are removed.