A mutation should never appear in both data_mutations.txt and data_mutations_uncalled.txt. The uncalled profile contains all mutations that weren't called, so it is mutually exclusive with the file that contains called mutations. It would be nice to have a check in the validator that confirms this. The logic should be something like:
Check whether a mutation file data_mutations_uncalled.txt exists. If so, combining it with data_mutations.txt should never produce duplicates.
Indicate which variants are duplicates in the validation report.
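
A rough sketch of that logic in Python, to make the request concrete. It is not taken from the existing validator code: the function names are hypothetical, the files are looked up by the fixed names used in this issue rather than through the study's meta files, and the set of columns that defines a "duplicate" (KEY_COLUMNS) is an assumption that would need to be agreed on.

```python
import os

# Assumption: two rows describe the same variant when these standard MAF
# columns all match. The real check may need a different key.
KEY_COLUMNS = (
    "Tumor_Sample_Barcode",
    "Chromosome",
    "Start_Position",
    "End_Position",
    "Reference_Allele",
    "Tumor_Seq_Allele2",
)


def read_variant_keys(path):
    """Map each variant key to the first line number where it appears."""
    keys = {}
    header = None
    with open(path, newline="") as handle:
        for line_number, line in enumerate(handle, start=1):
            # Skip comment lines (e.g. "#version 2.4") and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\r\n").split("\t")
            if header is None:
                header = fields
                continue
            row = dict(zip(header, fields))
            key = tuple(row.get(column, "") for column in KEY_COLUMNS)
            keys.setdefault(key, line_number)
    return keys


def find_called_and_uncalled_duplicates(study_dir):
    """Return (key, called_line, uncalled_line) for variants in both files."""
    called_path = os.path.join(study_dir, "data_mutations.txt")
    uncalled_path = os.path.join(study_dir, "data_mutations_uncalled.txt")
    # The check only applies when an uncalled profile is present.
    if not (os.path.exists(called_path) and os.path.exists(uncalled_path)):
        return []
    called = read_variant_keys(called_path)
    uncalled = read_variant_keys(uncalled_path)
    shared = sorted(set(called) & set(uncalled))
    return [(key, called[key], uncalled[key]) for key in shared]
```

Each returned tuple carries the variant key plus the line number in each file, which should be enough for the validation report to point at the offending rows.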
For test files you can use:
Follow up ticket to: https://github.com/cBioPortal/cbioportal/issues/10126