Closed charles-cowart closed 7 months ago
Totals | |
---|---|
Change from base Build 7430826706: | 0.04% |
Covered Lines: | 2141 |
Relevant Lines: | 2455 |
@wasade I'm hesitant to change how we determine the type and ID, since everyone seems to remember this method and it does appear durable. However, I can start pulling examples of XML files from across the runs to see whether we can extract both reliably from there, if you guys think it's worth it. We don't really have a tool to parse these files, and that may be a good thing to have, yes?
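For discussion, here is a rough sketch of what such a parser could look like. It assumes the XML files in question follow the usual Illumina `RunInfo.xml` layout, with a `Run` element carrying an `Id` attribute and an `Instrument` child; that layout would need to be confirmed against real files pulled from the runs. The function name `read_run_info` and the sample document are made up for illustration and are not part of mg-scripts.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample. The run ID is borrowed from this thread;
# real files contain many more fields and may differ between instruments.
SAMPLE_RUN_INFO = """<RunInfo>
  <Run Id="231201_A01535_0431_BHVKWCDSX7" Number="431">
    <Instrument>A01535</Instrument>
  </Run>
</RunInfo>"""


def read_run_info(xml_text):
    """Return the run ID and instrument ID found in a RunInfo-style document."""
    root = ET.fromstring(xml_text)
    run = root.find('Run')  # assumption: <Run> is a direct child of the root
    if run is None:
        raise ValueError('no <Run> element found')
    return {
        'run_id': run.get('Id'),                   # None if the attribute is absent
        'instrument': run.findtext('Instrument'),  # None if the element is absent
    }


info = read_run_info(SAMPLE_RUN_INFO)
```

Missing fields come back as `None` rather than raising, so a survey across older runs could simply count how often each field is absent before deciding whether the XML is reliable enough to replace the current method.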
Some googling suggests the instrument lookup may not be universally correct, and presumably Illumina or IGM has authoritative information. For regex, doesn't this work?
>>> import re
>>> matcher = re.compile(r'(\d{6,8})_([A-Z0-9]+)_(\d+)_([A-Z0-9]+)')
>>> matcher.search('231201_A01535_0431_BHVKWCDSX7').groups()
('231201', 'A01535', '0431', 'BHVKWCDSX7')
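If the regex route is kept, the same pattern could be wrapped with named groups so callers do not depend on tuple positions. This is only a sketch: the names `parse_run_id` and `RunIdParts`, and the field labels (`date`, `instrument`, `run_number`, `flowcell`), are my reading of the four groups and are not taken from mg-scripts.

```python
import re
from typing import NamedTuple, Optional

# Same pattern as above, with named groups. The field names are assumptions.
RUN_ID_PATTERN = re.compile(
    r'(?P<date>\d{6,8})_(?P<instrument>[A-Z0-9]+)'
    r'_(?P<run_number>\d+)_(?P<flowcell>[A-Z0-9]+)'
)


class RunIdParts(NamedTuple):
    date: str
    instrument: str
    run_number: str
    flowcell: str


def parse_run_id(run_id: str) -> Optional[RunIdParts]:
    """Split a run directory name into its parts, or return None if it does not match."""
    match = RUN_ID_PATTERN.search(run_id)
    if match is None:
        return None
    return RunIdParts(**match.groupdict())


parts = parse_run_id('231201_A01535_0431_BHVKWCDSX7')
```

Here `parts.instrument` is `'A01535'`, which could then be fed to whatever instrument lookup is used. Returning `None` on a non-match leaves it to the caller to decide whether an unparseable name is an error.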
Superseded by https://github.com/biocore/mg-scripts/pull/123
Addresses https://github.com/biocore/mg-scripts/issues/106