mskcc / tempo

CCS research pipeline to process WES and WGS TN pairs

add cdna contamination calculation #927

Closed anoronh4 closed 2 years ago

anoronh4 commented 3 years ago

fixes #926 (also related to closed issue #393)

The vcf2maf container was edited: the pandas package had to be installed with pip, and the custom scripts were removed from the container and placed in workflow.projectDir + "/lib/scripts". The custom script for calculating cDNA contamination was also added to this folder.

As of the initial commit, the custom script was directly copied from the docker image used in the argos-cwl repo. The script has to be edited to credit the original authors, and we may want to edit the script to better suit our purposes.

At some point we will want to add this information to the somatic-level MultiQC report. We need to discuss whether to do that in this PR, in #923, or in a later PR.

gongyixiao commented 2 years ago

I think putting custom scripts in lib/scripts will make them hard to organize. Imagine applying this change to all the scripts the current Docker containers include: everything will end up in the same directory, and it will be hard for users to grab the right scripts when they want to adopt one Docker container and run that process themselves. So I would instead put each script in the same directory as the Dockerfile for its process, not copy it into the container, and reference it via workflow.projectDir/containers/${tools}/.
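To make the two layouts concrete, here is a rough Python sketch of how a script path would resolve under each one. This is only an illustration, not pipeline code: the helper names, the project root, and the script file name `cdna_contamination.py` are hypothetical, and `vcf2maf` stands in for `${tools}`.

```python
from pathlib import Path


def shared_script_path(project_dir, script):
    """Layout in this PR: every custom script sits in one shared folder."""
    return Path(project_dir) / "lib" / "scripts" / script


def per_tool_script_path(project_dir, tool, script):
    """Proposed layout: each script sits next to its tool's Dockerfile."""
    return Path(project_dir) / "containers" / tool / script


# Hypothetical project root and script name, for illustration only.
project_dir = Path("/path/to/tempo")
script = "cdna_contamination.py"

print(shared_script_path(project_dir, script).as_posix())
print(per_tool_script_path(project_dir, "vcf2maf", script).as_posix())
```

With the per-tool layout, someone who only wants one container can take that one directory and get the Dockerfile and its scripts together.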

Happy to discuss more.