nf-core / seqinspector

QC pipeline to inspect your sequences
https://nf-co.re/seqinspector
MIT License

Display sequencing metadata from the run_dir #4

Open matrulda opened 6 months ago

matrulda commented 6 months ago

Description of feature

It would be nice if metadata regarding the sequencing run was displayed in the reports. In the case of an Illumina run this could include:

This should be data that can be scraped from sequencing output files, like RunInfo.xml and RunParameters.xml.

This information could be shown in its own section in the reports or be part of the Software versions table.

This is how this feature has been implemented in seqreports: https://github.com/Molmed/seqreports/blob/main/bin/get_metadata.py
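As a rough illustration of the kind of scraping meant here (this is not the seqreports script, just a minimal sketch), the run ID, flowcell, instrument and read layout could be pulled out of RunInfo.xml with Python's standard library. The embedded sample XML is a hypothetical, trimmed-down file with made-up values, and the function name `parse_run_info` and the output keys are assumptions chosen for the example; real files carry more fields and can differ between instrument models.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down RunInfo.xml with made-up values.
# Real files contain more elements and vary between instruments.
SAMPLE_RUN_INFO = """<RunInfo Version="5">
  <Run Id="220101_A00001_0001_AHXXXXXXXX" Number="1">
    <Flowcell>HXXXXXXXX</Flowcell>
    <Instrument>A00001</Instrument>
    <Reads>
      <Read Number="1" NumCycles="151" IsIndexedRead="N"/>
      <Read Number="2" NumCycles="8" IsIndexedRead="Y"/>
      <Read Number="3" NumCycles="151" IsIndexedRead="N"/>
    </Reads>
  </Run>
</RunInfo>
"""


def parse_run_info(xml_text):
    """Return a small dict of run metadata parsed from RunInfo.xml text."""
    root = ET.fromstring(xml_text)
    run = root.find("Run")
    if run is None:
        raise ValueError("no <Run> element found in RunInfo.xml")

    # One entry per <Read>, keeping cycle count and whether it is an index read
    reads = [
        {
            "number": int(read.get("Number")),
            "cycles": int(read.get("NumCycles")),
            "is_index": read.get("IsIndexedRead") == "Y",
        }
        for read in run.iterfind("Reads/Read")
    ]

    return {
        "run_id": run.get("Id"),
        "flowcell": run.findtext("Flowcell"),
        "instrument": run.findtext("Instrument"),
        "reads": reads,
    }


metadata = parse_run_info(SAMPLE_RUN_INFO)
for key, value in metadata.items():
    print(f"{key}: {value}")
```

A flat dict like this could be rendered as its own report section or merged into an existing table. RunParameters.xml could be handled the same way, though its layout is not shown here.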

mahesh-panchal commented 6 months ago

One approach that might help with speed is to implement this natively in Groovy. Processes that use exec run on the head node, so they don't need to be dispatched to a compute node the way a Python script does.

Groovy's XmlSlurper should make it easy-ish to parse out the relevant metadata and add it to the meta map. https://groovy-lang.org/processing-xml.html#_simply_traversing_the_tree

matrulda commented 6 months ago

Good point, something to keep in mind.