nf-core / eager

A fully reproducible and state-of-the-art ancient DNA analysis pipeline
https://nf-co.re/eager
MIT License

Flexible trimming by library strandedness. #805

Closed TCLamnidis closed 2 years ago

TCLamnidis commented 2 years ago

Is your feature request related to a problem? Please describe

Trimming should be configurable by library strandedness. pileupCaller allows us to genotype ssDNA libraries from untrimmed data without damage affecting the genotypes. That means that ssDNA non-UDG libraries do not need trimming and can be run with --bamutils_clip_none_udg_{left,right}=0. However, this becomes a problem when ssDNA and dsDNA non-UDG libraries are combined in a single run: if the dsDNA libraries are also genotyped from untrimmed data, damage will heavily affect their genotypes.

Describe the solution you'd like

Add library-construction-specific trimming options, e.g. bamutils_clip_{single,double}_{none,half}_udg_{left,right}. This would make trimming flexible and let users apply a single set of parameters across different types of data.
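To make the request concrete, here is a minimal Python sketch of how a per-library lookup over such parameters could behave. It is not eager's implementation: the function name `clip_lengths` and the numeric values are hypothetical placeholders, and only the parameter naming pattern comes from the proposal above.

```python
# Illustrative sketch only: parameter names follow the pattern proposed in this
# issue; the clip values are placeholders, not recommendations.
CLIP_PARAMS = {
    "bamutils_clip_single_none_udg_left": 0,
    "bamutils_clip_single_none_udg_right": 0,
    "bamutils_clip_single_half_udg_left": 0,
    "bamutils_clip_single_half_udg_right": 0,
    "bamutils_clip_double_none_udg_left": 7,
    "bamutils_clip_double_none_udg_right": 7,
    "bamutils_clip_double_half_udg_left": 2,
    "bamutils_clip_double_half_udg_right": 2,
}


def clip_lengths(strandedness, udg_treatment, params=CLIP_PARAMS):
    """Return (left, right) clip lengths for one library (hypothetical helper)."""
    if strandedness not in ("single", "double"):
        raise ValueError(f"unknown strandedness: {strandedness!r}")
    if udg_treatment not in ("none", "half"):
        raise ValueError(f"no clipping parameters for UDG treatment: {udg_treatment!r}")
    prefix = f"bamutils_clip_{strandedness}_{udg_treatment}_udg"
    return params[f"{prefix}_left"], params[f"{prefix}_right"]


# Mixed run: each library gets its own trimming from one parameter set.
for lib, strand, udg in [("LIB1", "single", "none"), ("LIB2", "double", "none")]:
    left, right = clip_lengths(strand, udg)
    print(f"{lib}: clip left={left}, right={right}")
```

With a lookup like this, ssDNA non-UDG libraries stay untrimmed while dsDNA non-UDG libraries in the same run are still clipped.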

Describe alternatives you've considered

It is possible to create separate runs for dsDNA and ssDNA libraries, but that is a less elegant solution, especially for applications that are potentially (semi-)automated.
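For comparison, the separate-runs workaround amounts to partitioning the input sheet by strandedness before launching the pipeline once per group. The Python sketch below assumes a tab-separated sheet with a `Strandedness` column; the column names and example rows are assumptions made for illustration, and `split_by_strandedness` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict


def split_by_strandedness(tsv_text, column="Strandedness"):
    """Group the rows of a tab-separated sheet by their strandedness value."""
    groups = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        groups[row[column]].append(row)
    return dict(groups)


# Made-up example sheet; the column names are assumed for this sketch.
EXAMPLE_TSV = (
    "Library_ID\tStrandedness\tUDG_Treatment\n"
    "LIB1\tsingle\tnone\n"
    "LIB2\tdouble\tnone\n"
    "LIB3\tdouble\thalf\n"
)

groups = split_by_strandedness(EXAMPLE_TSV)
for strandedness, rows in sorted(groups.items()):
    # Each group would need its own pipeline run with its own clipping values.
    print(strandedness, [row["Library_ID"] for row in rows])
```

Every extra group means another run to launch, track, and merge afterwards, which is the overhead that strandedness-specific parameters would avoid.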