Closed RodolfoSalido closed 5 months ago
I looked this over in detail and I should first clarify for the casual reader that the _validate_sample_sheet_metadata() method is an internal method called by make_sample_sheet() and not by the user themselves. Specifically, this method checks the metadata dictionary input passed by the user in make_sample_sheet(), usually from within a notebook, for completeness. This is so a bad input isn't directly passed to the third-party Sheet() class and raises a cryptic Error to the user. Although the naming is very generalized since we don't want to name functions '..._metadata_dict()' and the like, this is an apples vs oranges comparison.
Technically nothing has changed from previous versions. When we used to use KLSampleSheet(file_path) to open a file and load the contents into a KLSampleSheet() object, we needed to validate that it was correct as a second step by calling validate_and_scrub_sample_sheet(sheet). (Technically this has changed to sheet = load_sample_sheet(file_path) and sheet.validate_and_scrub_sample_sheet() but this is besides the point.)
The same is true of make_sample_sheet(). Technically, the user should have called validate_and_scrub_sample_sheet() afterward to ensure that it's correct and report any Errors if not.
However, the docstring for make_sample_sheet() clearly states that it, 'Write[s] a valid sample sheet'. I think the wetlab is very reasonable in expecting that any returned SampleSheet() object is valid. The simple fix is to simply have make_sample_sheet() call validate_and_scrub_sample_sheet() before returning. I will issue a PR for this.
https://github.com/biocore/metagenomics_pooling_notebook/blob/a2f5ba44463f5db87bab5d8a6d0f84c09b44e740/metapool/sample_sheet.py#L1041C16-L1041C16
_validate_sample_sheet_metadata()
doesn't catch errors thatvalidate_and_scrub_sample_sheet()
does.