Open dpark01 opened 1 month ago
It seems that all NextSeq 2000 run directories have a file called RunParameters.xml with various helpful values, including InstrumentType, so we may not need to resort to regex matching to sleuth out the model of newer sequencers. Ex.:
<InstrumentType>NextSeq 2000</InstrumentType>
We can obtain that value directly in Python like this:
python3 -c "import xml.etree.ElementTree as ET; tree = ET.parse('RunParameters.xml'); root = tree.getroot(); print(root.find('.//InstrumentType').text)"
(perhaps falling back to the old regex approach if the RunParameters.xml file does not exist)
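A slightly more defensive version of the one-liner above might look like the sketch below. The function name read_instrument_type and the fallback parameter are hypothetical (they are not part of illumina_demux); the fallback is just a stand-in for wherever the existing regex heuristic lives. It also assumes the InstrumentType element has no XML namespace, as in the example above.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable, Optional


def read_instrument_type(
    run_dir,
    fallback: Optional[Callable[[Path], Optional[str]]] = None,
) -> Optional[str]:
    """Return InstrumentType from RunParameters.xml, else defer to fallback.

    `fallback` is a hypothetical hook for the existing regex-based
    heuristic; it receives the run directory and returns a model name
    or None.
    """
    run_dir = Path(run_dir)
    params_path = run_dir / "RunParameters.xml"

    if params_path.is_file():
        try:
            root = ET.parse(str(params_path)).getroot()
        except ET.ParseError:
            root = None  # malformed XML: treat like a missing file
        if root is not None:
            # findtext returns None when the element is absent
            value = root.findtext(".//InstrumentType")
            if value and value.strip():
                return value.strip()

    # File missing, unparseable, or lacking the element
    return fallback(run_dir) if fallback is not None else None
```

Passing the regex approach in as a callable keeps the XML lookup independent of however the current heuristic is implemented.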
Examples of other values that may be interesting to parse out and/or use:
<FlowCellLotNumber>20688106</FlowCellLotNumber>
<FlowCellExpirationDate>2023-09-03</FlowCellExpirationDate>
<FlowCellVersion>2</FlowCellVersion>
<FlowCellMode>NextSeq 1000/2000 P2 Flow Cell Cartridge</FlowCellMode>
<CartridgeSerialNumber>EC1194950-EC11</CartridgeSerialNumber>
<CartridgePartNumber>20044466</CartridgePartNumber>
<CartridgeLotNumber>20668878</CartridgeLotNumber>
<CartridgeExpirationDate>2023-08-28</CartridgeExpirationDate>
<CartridgeVersion>3</CartridgeVersion>
<CartridgeMode>NextSeq 1000/2000 P2 Reagent Cartridge (338 Cycles)</CartridgeMode>
I'm curious if we can use CartridgeLotNumber to find any lot-related effects in the data, or if we can relate any data quality metrics to CartridgeExpirationDate.
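As a rough sketch of how those values could be collected for that kind of analysis: the helper names below (read_consumable_info, days_until_expiration) are hypothetical, the tag list is taken from the example values above, and the code assumes expiration dates are always written as YYYY-MM-DD, as they are in this one file. The run date is supplied by the caller, since where best to get it is a separate question.

```python
import xml.etree.ElementTree as ET
from datetime import date
from typing import Dict, Optional

# Tags taken from the example RunParameters.xml values above
CONSUMABLE_TAGS = (
    "FlowCellLotNumber",
    "FlowCellExpirationDate",
    "CartridgeSerialNumber",
    "CartridgeLotNumber",
    "CartridgeExpirationDate",
)


def read_consumable_info(xml_path) -> Dict[str, Optional[str]]:
    """Collect consumable fields; absent elements map to None."""
    root = ET.parse(str(xml_path)).getroot()
    info = {}
    for tag in CONSUMABLE_TAGS:
        text = root.findtext(f".//{tag}")
        info[tag] = text.strip() if text else None
    return info


def days_until_expiration(expiration: str, run_date: date) -> int:
    """Days between the run date and expiry (negative if already expired).

    Assumes `expiration` is an ISO date string (YYYY-MM-DD).
    """
    return (date.fromisoformat(expiration) - run_date).days
```

For example, days_until_expiration("2023-08-28", date(2023, 8, 1)) gives 27. Emitting the lot number and a days-to-expiry value alongside per-run quality metrics would make it possible to group or regress on them later.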
As of 2024, the sequencer model that illumina_demux emits in its runinfo.json output is not being inferred for recent NextSeq 2000 runs (not sure if they're XLEAP kits or just normal ones); it just emits UNKNOWN instead. We probably just need to update the heuristics and tables here. This behavior has been observed at both Broad and ACEGID.