ShawHahnLab / umbra

Python package and executable for Linux for managing Illumina sequencing runs
GNU Affero General Public License v3.0

Duplicate sample name case should be handled #50

Open ressy opened 5 years ago

ressy commented 5 years ago

The sequencers do allow a sample sheet to specify the same name for multiple samples, and currently all but the first sample with a given name are "lost" during processing. How should this be handled? Since Illumina's software permits it, it should probably be supported, weird as it is. Maybe we should track samples by sample number instead of by name.
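To illustrate the idea, here is a minimal sketch, not umbra's actual code. It assumes the loss comes from indexing samples in a mapping keyed by name, and it takes "sample number" to mean the 1-based position of the row in the sample sheet. The rows and the function names `index_by_name` and `index_by_number` are hypothetical.

```python
# Hypothetical sample sheet rows, in sheet order; the two "A" names collide.
rows = [
    {"Sample_ID": "x1", "Sample_Name": "A"},
    {"Sample_ID": "x2", "Sample_Name": "B"},
    {"Sample_ID": "x3", "Sample_Name": "A"},
]


def index_by_name(rows):
    """Key by name, keeping the first row seen; later duplicates are dropped."""
    samples = {}
    for row in rows:
        samples.setdefault(row["Sample_Name"], row)
    return samples


def index_by_number(rows):
    """Key by 1-based position in the sheet, so every row gets a unique key."""
    return {num: row for num, row in enumerate(rows, start=1)}


by_name = index_by_name(rows)
by_number = index_by_number(rows)
print(len(by_name), "samples tracked by name")
print(len(by_number), "samples tracked by number")
```

Keyed by name, the third row disappears; keyed by number, all three rows survive and the name becomes just another attribute of the sample.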

ressy commented 4 years ago

Still needs fixing. Using sample numbers would also sidestep the problem that the sequencer's onboard software and bcl2fastq have different naming conventions. Note that Sample_ID in the sample sheet is not the same thing as sample number, and the output fastq.gz files appear to always be numbered sequentially.
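A rough sketch of matching output files back to samples by number rather than by name. It assumes the files follow the common `<name>_S<number>_L<lane>_<read>_001.fastq.gz` pattern, which should be checked against real run output (for example, files written without lane splitting would not match). The regular expression and the helpers `sample_number` and `group_by_number` are illustrative, not part of umbra or bcl2fastq.

```python
import re

# Assumed filename pattern: <name>_S<number>_L<lane>_<R1|R2|I1|I2>_001.fastq.gz
FASTQ_RE = re.compile(
    r"^(?P<name>.+)_S(?P<num>\d+)_L(?P<lane>\d{3})"
    r"_(?P<read>[RI][12])_001\.fastq\.gz$"
)


def sample_number(filename):
    """Return the S-number embedded in a FASTQ filename as an int."""
    match = FASTQ_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized FASTQ filename: {filename}")
    return int(match.group("num"))


def group_by_number(filenames):
    """Group filenames by sample number, ignoring the name portion."""
    groups = {}
    for filename in filenames:
        groups.setdefault(sample_number(filename), []).append(filename)
    return groups


# Hypothetical output files: samples 1 and 3 share the name "A".
files = [
    "A_S1_L001_R1_001.fastq.gz",
    "A_S1_L001_R2_001.fastq.gz",
    "my_sample_S2_L001_R1_001.fastq.gz",
    "A_S3_L001_R1_001.fastq.gz",
]
groups = group_by_number(files)
```

Because the grouping key is the number, the two samples named "A" stay separate, and differences in how the name portion is written no longer matter for matching.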