fls-bioinformatics-core / auto_process_ngs

Scripts and utilities for automatic processing & management of Illumina NGS sequencing data.

Implement run reference ID distinct from run ID #852

Closed pjbriggs closed 1 year ago

pjbriggs commented 1 year ago

Implements run reference IDs which are similar to but distinct from the existing run IDs, and makes the distinction more explicit in the code and documentation.

Run IDs (previously also called "run references" or "run reference IDs" interchangeably) are IDs of the following form:

NOVASEQ6000_230419#74

These are now always referred to as run IDs. They are generated by the run_id function in the analysis module and returned by the run_id method of the AutoProcess class in the auto_processor module. Run IDs are used within the library to locate other runs, for example when looking for paired 10x single cell multiome datasets.

Run reference IDs are based on the run ID with arbitrary additional data items appended (currently only the flow cell mode, if set), e.g.

NOVASEQ6000_230419#74_SP

These are returned by the run_reference_id method of the AutoProcess class. Run reference IDs are externally facing and are reported, for example, in the "summary" output from the report command.
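
The relationship between the two kinds of ID can be illustrated with a minimal sketch. The helpers make_run_id and make_run_reference_id below are hypothetical names invented for this example; they are not the run_id function or the AutoProcess methods, whose signatures are not shown here. The sketch only assumes the formats in the examples above: a run ID of the form PLATFORM_DATESTAMP#NUMBER, and a run reference ID formed by appending any set items with underscores.

```python
from typing import Optional, Sequence


def make_run_id(platform: str, datestamp: str, run_number: int) -> str:
    """Hypothetical helper: build an ID of the form PLATFORM_DATESTAMP#NUMBER."""
    return f"{platform.upper()}_{datestamp}#{run_number}"


def make_run_reference_id(run_id: str,
                          extra_items: Sequence[Optional[str]] = ()) -> str:
    """Hypothetical helper: append each set (non-empty) item to the run ID."""
    # Items that are None or empty (e.g. an unset flow cell mode) are skipped
    items = [item for item in extra_items if item]
    return "_".join([run_id, *items])


run_id = make_run_id("novaseq6000", "230419", 74)
print(run_id)                                  # NOVASEQ6000_230419#74
print(make_run_reference_id(run_id, ["SP"]))   # NOVASEQ6000_230419#74_SP
print(make_run_reference_id(run_id, [None]))   # NOVASEQ6000_230419#74
```

When no additional items are set, the reference ID in this sketch is identical to the run ID, so code that matches runs against each other can keep relying on the run ID alone.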