Closed fbdtemme closed 1 year ago
Small update on progress here: I was working on adding a mitochondria mode for mutect2 and, on a hunch, ran the TN test using the --mitochondria-mode argument (which increases sensitivity). When I did, we got some variants in the VCF. This doesn't fix the issue, but I think it at least confirms that the problem is the test data lacking any significant TN variants, as was discussed yesterday. So the good news is that the mutect2 module is working; the bad news is that we will need some different test data after all.
😠 Thanks for the detective work. This is going to be a major pain. We essentially need test data that is UMI-tagged, preferably on chr22 (or ideally on chr6 for hlatyping tools, but that is even more work), and has enough variants. The UMI tagging might make simulating more difficult; I don't think the one I shared with you yesterday does that. Another option would be to keep the UMI-tagged reads as a separate entity and find/simulate a completely new set of reads covering the above constraints.
@FriederikeHanssen are the recal BAMs in the Sarek test data directory the same as in the modules? I was thinking I could try those and hope something shows up if they are different.
No, these are completely separate things. I can't remember now why we didn't choose the sarek test data in the end and port it to modules/test-data 🤔. But yes, to track down the issue, you could definitely try that.
Possible modules with broken test data:
As discussed in gather.town, we will proceed by first adding all sarek/raredisease modules (as these two pipelines are probably the most affected) and then updating the test data. Otherwise we might keep running into the same problem over and over again, having to update all upstream modules as well. Please add any affected modules to this issue; we started a collection above.
This is fixed, right?
I haven't had anything to do with the ASCAT or ControlFREEC modules, but all the others should be using the new datasets now, so they should be generating files that aren't empty.
Hi there!
Looks like the issue is solved. Are you still planning to check if all module tests are working fine? If not, you can ignore this message and we’ll close your issue in about 2 weeks. If you think this is still relevant, you can also add it to the hackathon2023 project board.
Cheers, the nf-core maintainers
When running the mutect2 test for tumor-normal analysis, the test runs fine, but an empty VCF file is generated. This causes issues downstream when trying to use the output of GATK4_MUTECT2 in, for example, GATK4_MERGEVCFS.
It would be great to have a test dataset that produces actual results so downstream tools can rely on that data as well.
Steps to reproduce:
Inspect VCF file generated by GATK4_MUTECT2...
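For the inspection step, something like the following could be used to tell a header-only VCF from one that has actual calls. It is only a minimal sketch: the helper name `count_vcf_records` and the file name in the usage note are made up for illustration and are not part of the module or its tests. It treats every non-blank line that does not start with `#` as a variant record and assumes the file is either plain text or gzip/bgzip-compressed with a `.gz` suffix.

```python
import gzip
from pathlib import Path


def count_vcf_records(vcf_path):
    """Count variant records (non-header, non-blank lines) in a VCF file.

    Hypothetical helper for illustration only. Handles plain-text VCFs and
    files ending in .gz (bgzip output is gzip-compatible, so gzip.open works).
    """
    path = Path(vcf_path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        # Header and meta lines start with '#'; everything else is a record.
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))
```

Calling `count_vcf_records("test.vcf.gz")` (file name assumed) and getting `0` back would correspond to the "empty VCF" described here: the header is present, but there are no variant lines for downstream modules to work with.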