Closed sagarc88 closed 5 years ago
Sorry for the inconvenience. This bug has been fixed. Please grab the latest code. Let me know if you run into any other issues!
Thank you for looking into this so quickly! I am running into another error and hope you can help..
03/26/2019 07:48:20 PM WARNING Warning from the following filter command python mergeRnaAndDnaFiles.py PT1 1 test/filtered/PT1_mpileup_dna_origin_chr1.vcf test/filtered/PT1_mpileup_rna_origin_chr1.vcf test/filtered/PT1_overlap_chr1.vcf test/filtered/PT1_blatFiltered_chr1.vcf test/filtered/PT1_merged_chr1.vcf --dnaHeaderOnly
and it prints out a lot of VCF entries along with the following error:
Traceback (most recent call last):
File "mergeRnaAndDnaFiles.py", line 521, in <module>
main()
File "mergeRnaAndDnaFiles.py", line 494, in main
(headerList, coordinateDict) = merge_vcf_data(i_dnaFilename, i_rnaFilename, i_overlapsFilename, i_nonOverlapsFilename, i_dnaHeaderOnly, i_debug)
File "mergeRnaAndDnaFiles.py", line 386, in merge_vcf_data
dnaLine = coordinateDict[rnaStopCoordinate]
KeyError: '70705199'
Thank you for looking into this so quickly! I am running into another error and hope you can help..
03/26/2019 07:48:20 PM WARNING Warning from the following filter command python mergeRnaAndDnaFiles.py PT1 1 test/filtered/PT1_mpileup_dna_origin_chr1.vcf test/filtered/PT1_mpileup_rna_origin_chr1.vcf test/filtered/PT1_overlap_chr1.vcf test/filtered/PT1_blatFiltered_chr1.vcf test/filtered/PT1_merged_chr1.vcf --dnaHeaderOnly
and it prints out a lot of VCF entries along with the following error:
Traceback (most recent call last): File "mergeRnaAndDnaFiles.py", line 521, in <module> main() File "mergeRnaAndDnaFiles.py", line 494, in main (headerList, coordinateDict) = merge_vcf_data(i_dnaFilename, i_rnaFilename, i_overlapsFilename, i_nonOverlapsFilename, i_dnaHeaderOnly, i_debug) File "mergeRnaAndDnaFiles.py", line 386, in merge_vcf_data dnaLine = coordinateDict[rnaStopCoordinate] KeyError: '70705199'
Upon further investigation, it looks like the problem comes from --rnaOnly flag when running filterRadia.py. This leads to a --dnaHeaderOnly flag when running mergeRnaAndDnaFiles.py. With this flag, it never adds anything from the "aDnaFile" to the coordinateDict and the lookup of a key somehow fails. When running the above mergeRnaAndDnaFiles.py command without --dnaHeaderOnly flag, it completes without any problems.
I have fixed the bug with the KeyError that you mention here, but I would actually recommend that you do not use the --rnaOnly
flag. The --dnaOnly
and --rnaOnly
flags basically only control which filters are applied. Here is an explanation of the flags:
--dnaOnly
:
No matter what data is provided, only apply the DNA filters. When RADIA was initially developed back in 2010, it was important to provide a flag to provide DNA based calls that could be compared to other DNA mutation calling algorithms that did not include RNA.
--rnaOnly
:
No matter what data is provided, only apply the RNA filters. This option was provided as the opposite of the --dnaOnly
flag, but it is misleading. For example, the result of supplying the --rnaOnly
flag means that there will be no passing germline calls, because the DNA filters were not applied. Also, all of the non-passing somatic calls will only have the RNA filters instead of the combined DNA + RNA filters.
I believe what you want is to apply all DNA and RNA filters and then extract the RNA specific calls (RNA Confirmation, RNA Rescue, and RNA Editing). You can do this by leaving the --dnaOnly
and
--rnaOnly
flags set to their defaults, and then inspecting the final calls. All of the calls are labeled in the INFO field as follows:
Sorry for the confusion. Let me know if you have any other issues or questions.
As I stated in my last comment, the result of supplying the --rnaOnly
flag was misleading. In the latest commit, I disabled the --rnaOnly
flag. As stated above, by default, when merging the RNA and DNA results, we want to provide as much information as possible.
Note: I also updated the dbSnp files from version 150 to 151, and I have officially tagged this latest version of RADIA to v1.1.4. Let me know if you find any other issues. Thanks!
Ok perfect! THank you very much for all the info!
Getting following error while running filterRadia.py
Upon further investigation, it looks like the written Read ID in createBlatFile.py (line 488) was reformatted during commit ID: #05c069f and the "Readlength" and "position" fields were removed. However, the filterByBlat.py file was never changed to account for these missing values in the read ID.