rki-mf1 / clean

A nextflow pipeline for decontamination of short reads, long reads and contigs
BSD 3-Clause "New" or "Revised" License

Summary file for number and proportion of mapped and unmapped reads #105

Open ayoraind opened 1 week ago

ayoraind commented 1 week ago

Hi again,

I would like to ask whether a summary file could be provided that contains all (or most) of the relevant information from the 'sorted.flagstats.txt' files. That is, something like this:

Filename  Total_no_reads  mapped_reads  unmapped_reads  proportion_mapped (%)  proportion_unmapped (%)  etc.
X         100             90            10              90                     10
Y         200             100           100             50                     50

I am aware that this information is partly visible in the multiQC.html file, and that one can manually check the files in the intermediate/map-to-remove directory to retrieve it. However, I was hoping for a single summary file that can be plotted in R, Python, or any other tool.
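For illustration, a minimal Python sketch of how such a summary could be assembled outside the pipeline. It assumes that the 'sorted.flagstats.txt' files follow the usual samtools flagstat text layout (lines such as "100 + 0 in total (...)" and "90 + 0 mapped (...)") and that they match the glob `*.sorted.flagstats.txt`; both are assumptions, not confirmed details of the pipeline. The names `summary_row` and `write_summary` are hypothetical, and this is not the change proposed for the pipeline itself. Note that flagstat-style totals count alignment records, so they can exceed the number of input reads when secondary or supplementary alignments are present.

```python
import csv
import re
from pathlib import Path

# Matches the two flagstat-style lines used here, for example
#   "100 + 0 in total (QC-passed reads + QC-failed reads)"
#   "90 + 0 mapped (90.00% : N/A)"
# Only the first number (the QC-passed count) is captured.
_COUNT = re.compile(r"^(\d+) \+ \d+ (in total|mapped)\b")

# Column names taken from the table requested in the issue.
COLUMNS = [
    "Filename",
    "Total_no_reads",
    "mapped_reads",
    "unmapped_reads",
    "proportion_mapped (%)",
    "proportion_unmapped (%)",
]


def summary_row(name, text):
    """Build one summary row from the text of a flagstat-style file."""
    counts = {}
    for line in text.splitlines():
        match = _COUNT.match(line.strip())
        if match:
            counts[match.group(2)] = int(match.group(1))
    total = counts["in total"]
    mapped = counts["mapped"]
    # Guard against empty inputs to avoid a division by zero.
    pct_mapped = round(100 * mapped / total, 2) if total else 0.0
    pct_unmapped = round(100 - pct_mapped, 2) if total else 0.0
    return dict(zip(COLUMNS, [name, total, mapped, total - mapped,
                              pct_mapped, pct_unmapped]))


def write_summary(flagstat_dir, out_path, pattern="*.sorted.flagstats.txt"):
    """Write a tab-separated summary of all matching files in flagstat_dir.

    The default pattern is an assumption about the file naming.
    """
    rows = [summary_row(path.name, path.read_text())
            for path in sorted(Path(flagstat_dir).glob(pattern))]
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

A call such as `write_summary("intermediate/map-to-remove", "mapping_summary.tsv")` would then produce a table that can be read directly into R or Python; the output file name is only an example.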

ayoraind commented 6 days ago

Hi @MarieLataretu and @hoelzer,

I found a way to do this. Further information can be found here (https://github.com/rki-mf1/clean/compare/main...ayoraind:clean:main). If this is okay with you, I could create a pull request. If not, that is also fine. Many thanks for this incredibly useful pipeline.

hoelzer commented 6 days ago

Hey @ayoraind, that's awesome, thanks! Yes, please just go ahead and make a PR. Then we can comment on it and check it before integrating.