TrinityCTAT / ctat-mutations

Mutation detection using GATK4 best practices and latest RNA editing filters resources. Works with both Hg38 and Hg19
https://github.com/TrinityCTAT/ctat-mutations
Other
71 stars 19 forks

error on testing set #94

Open ConcettaDe4 opened 3 years ago

ConcettaDe4 commented 3 years ago

Hi! I am testing the pipeline on the testing set using the singularity image ctat_mutations.v3.0.1.simg. When I ran the command, I got the following error at the end of the pipeline:

[2021-05-25 17:25:23,66] [info] Stream materializer shut down
[2021-05-25 17:25:23,66] [info] WDL HTTP import resolver closed
Workflow c3dfc3db-0ca1-4f5f-b729-a7dcfa686826 transitioned to state Failed
INFO:    Cleaning up image.

In particular when I checked the standard error in the folder call-AnnotateVariants/execution I had the following error:


################################
 Annotating VCF: Calculating DJ
################################

17:18:56 : INFO : Sorting VCF
17:18:57 : INFO : Loading input VCF
17:19:27 : INFO : Running closestBed
17:19:27 : INFO : Generating Distances
17:19:27 : INFO : CMD: bcftools sort test.DJ.vcf -o test.DJ.vcf
Writing to /tmp/bcftools-sort.zjLcIB
Merging 1 temporary files
Cleaning
Done
17:19:28 : INFO :
################################
 Annotating VCF: Calculating ED
################################

17:19:28 : INFO : Processing VCF Positions
17:19:28 : INFO : Running samtools faidx
17:19:28 : INFO : Running Blat
17:21:20 : INFO : Processing Output
17:21:20 : INFO : Creating ED features
17:21:20 : INFO : Outputing the annotated VCF.
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::bcf_hdr_register_hrec] An INFO field has no Number defined. Assuming '.'
[W::vcf_parse] INFO 'minute_gastric_sclerosing_stromal_tumour' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'gross_ICC_hyperplasia -- NS' is not defined in the header, assuming Type=String
Traceback (most recent call last):
  File "/opt/conda/bin/oc", line 5, in <module>
    from cravat.oc import main
  File "/opt/conda/lib/python3.7/site-packages/cravat/oc.py", line 2, in <module>
    from cravat import cravat_admin, cravat_util
  File "/opt/conda/lib/python3.7/site-packages/cravat/cravat_admin.py", line 451, in <module>
    au.ready_resolution_console()
  File "/opt/conda/lib/python3.7/site-packages/cravat/admin_util.py", line 1328, in ready_resolution_console
    new_md = input(msg)
EOFError: EOF when reading a line

CMD: annotate_with_cravat: oc run test.cosmic.vcf.gz --module-option vcfreporter.type=separate --system-option modules_dir=/home/stefania/RNAseq_project/trinity_cancer_transcriptome_toolkit/GRCh38_gencode_v33_CTAT_lib_Apr062020.plug-n-play/ctat_genome_lib_build_dir/ctat_mutation_lib/cravat -t vcf -l hg38 -d  -n test.cravat.tmp

Traceback (most recent call last):
  File "/usr/local/src/ctat-mutations/src/annotate_with_cravat.py", line 61, in <module>
    subprocess.check_call(cravat_cmd)
  File "/opt/conda/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['oc', 'run', 'test.cosmic.vcf.gz', '--module-option', 'vcfreporter.type=separate', '--system-option', 'modules_dir=/home/stefania/RNAseq_project/trinity_cancer_transcriptome_toolkit/GRCh38_gencode_v33_CTAT_lib_Apr062020.plug-n-play/ctat_genome_lib_build_dir/ctat_mutation_lib/cravat', '-t', 'vcf', '-l', 'hg38', '-d', '', '-n', 'test.cravat.tmp']' returned non-zero exit status 1.

I noticed that I did not have the final files of variant calling. How can I fix this error?

Thank you for your help!

Concetta
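
The first traceback above ends in input(msg) raising EOFError, which is what Python does when a program prompts for an answer and standard input is already at end-of-file, as is typical inside a workflow job. Why oc decided to prompt is not shown in the log. The sketch below only reproduces the failure mode with a stand-in one-liner (hypothetical, not the OpenCRAVAT code) and assumes python3 is on the PATH:

```shell
#!/usr/bin/env bash
# Stand-in for any tool that asks a question on startup (hypothetical
# one-liner, not the actual OpenCRAVAT code).
err_log="$(mktemp)"
status=0

# stdin is redirected from /dev/null to mimic a job with no terminal attached,
# so input() reaches end-of-file immediately and raises EOFError.
python3 -c 'input("modules directory: ")' < /dev/null > /dev/null 2> "$err_log" || status=$?

# A non-zero status here is what subprocess.check_call turns into
# CalledProcessError in the calling script.
echo "exit status: $status"
tail -n 1 "$err_log"
```

The non-zero exit status is what the second traceback reports: annotate_with_cravat.py runs oc through subprocess.check_call, which raises CalledProcessError whenever the child exits with a failure code.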

brianjohnhaas commented 3 years ago

Hi Concetta,

Can you try including option: --no_cravat

and let's see if that allows it to go through?

Also, if you send me your file: test.cosmic.vcf.gz

I can look into what the issue is exactly.


thx,

~brian

ConcettaDe4 commented 3 years ago

Hi! Thank you for your reply. I re-ran the command with the option --no_cravat. Here you can download the file test.cosmic.vcf.gz https://we.tl/t-6I78SzF23M .

Thanks,

Concetta

brianjohnhaas commented 3 years ago

thanks! I'll look further into the cravat issue here. more later.

ConcettaDe4 commented 3 years ago

Hi! I ran the command with --no_cravat and did not get any errors. At the end, the folder contained the following files:

(singularity) :~/RNAseq_project/trinity_cancer_transcriptome_toolkit/test_1$ ls
cromwell-executions     test.bqsr.bam                                    test.GBoost-classifier.vcf.gz            test.vcf.gz
cromwell-workflow-logs  test.GBoost-classifier.cancer.igvjs_viewer.html  test.star.Aligned.sortedByCoord.out.bam
test.annotated.vcf.gz   test.GBoost-classifier.cancer.tsv                test.star.Log.final.out
test.bqsr.bai           test.GBoost-classifier.cancer.vcf                test.star.SJ.out.tab

I think that now it is OK. Is that right?

Thank you,

Concetta
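
A quick way to confirm that a run produced its main result files is to test for them explicitly. This is a minimal sketch: the three file names are taken from the listing above for sample id "test" and are an assumption about which outputs matter, not an official manifest, and check_outputs is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Report which result files exist (and are non-empty) in an output directory.
# The file list comes from the directory listing in this thread; treat it as
# an assumption rather than a complete manifest.
check_outputs() {
  local dir="$1" sample="$2" missing=0 f
  for f in "$sample.vcf.gz" "$sample.annotated.vcf.gz" "$sample.bqsr.bam"; do
    if [ -s "$dir/$f" ]; then
      echo "ok       $f"
    else
      echo "MISSING  $f"
      missing=$((missing + 1))
    fi
  done
  # The return code is the number of missing files (0 means all present).
  return "$missing"
}

# Usage: pass the output directory and sample id; defaults to "." and "test".
check_outputs "${1:-.}" "${2:-test}" || echo "some outputs are missing"
```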

brianjohnhaas commented 3 years ago

looks right wrt the output files present.

Note, if you're not running a whole transcriptome through, then I wouldn't trust the GBoost* output files as being 'best'. The 'test.annotated.vcf' would be your haplotypecaller results with the annotations (sans cravat).

If you run with "--boosting_method none", then you'll get a filtered vcf based on hard cutoffs and that's generally better to use as your 'final' output here (unless you're running with a full transcriptome data set, in which case the GBoost files are generally best).

The 'cancer*' files really need the cravat annotations to work for best sensitivity. I'll let you know when I figure out what's going on with the error you experienced.

best,

~brian
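
The two options mentioned in this thread can be appended to the command line that was used originally. The sketch below only assembles and prints the command for review. The original_cmd value is a hypothetical placeholder, and combining both options in one run is an assumption about how they would be used together on a small test set.

```shell
#!/usr/bin/env bash
# Placeholder for the command that was run originally; replace it with your
# own command line (the value below is hypothetical).
original_cmd=(ctat_mutations "<your original arguments>")

# Options quoted in this thread:
#   --no_cravat              skip the OpenCRAVAT annotation step
#   --boosting_method none   filter on hard cutoffs instead of boosting
extra_opts=(--no_cravat --boosting_method none)

full_cmd=("${original_cmd[@]}" "${extra_opts[@]}")

# Print the command for review rather than executing it.
echo "${full_cmd[*]}"
```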

brianjohnhaas commented 3 years ago

Hi Concetta,

I wanted to follow up about the opencravat issue you were experiencing earlier. I ran your test vcf through using my fairly recent installation of open cravat and it seemed to go without any errors. One suggestion might be to reinstall opencravat using the version we specify in the wiki:

as here: https://github.com/NCIP/ctat-mutations/wiki/CTAT-mutations-installation

pip install open-cravat==2.0.1

then delete your current cravat directory:

rm -r /home/stefania/RNAseq_project/trinity_cancer_transcriptome_toolkit/GRCh38_gencode_v33_CTAT_lib_Apr062020.plug-n-play/ctat_genome_lib_build_dir/ctat_mutation_lib/cravat

(always be careful with those 'rm -r' commands)

and reinstall the libs like this:

mkdir
oc config md /home/stefania/RNAseq_project/trinity_cancer_transcriptome_toolkit/GRCh38_gencode_v33_CTAT_lib_Apr062020.plug-n-play/ctat_genome_lib_build_dir/ctat_mutation_lib/cravat
oc module install-base
oc module install --yes vest chasmplus vcfreporter mupit clinvar

hopefully it all works fine after that.

best,

~brian
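
The reinstall steps above can be collected into one script with a guard in front of the 'rm -r'. This is a sketch under stated assumptions, not the documented procedure: it runs in dry-run mode unless DRY_RUN=0 is set, it assumes the CTAT_GENOME_LIB environment variable points at ctat_genome_lib_build_dir, it assumes the mkdir step recreates the cravat directory that was just removed, and the helper names (run, is_cravat_dir, reinstall_cravat) are hypothetical.

```shell
#!/usr/bin/env bash
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Assumption: CTAT_GENOME_LIB points at ctat_genome_lib_build_dir.
CRAVAT_DIR="${CTAT_GENOME_LIB:-/path/to/ctat_genome_lib_build_dir}/ctat_mutation_lib/cravat"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Refuse to delete anything that does not look like the cravat modules directory.
is_cravat_dir() {
  case "$1" in
    */ctat_mutation_lib/cravat) return 0 ;;
    *) return 1 ;;
  esac
}

reinstall_cravat() {
  is_cravat_dir "$CRAVAT_DIR" || { echo "refusing to touch: $CRAVAT_DIR" >&2; return 1; }
  run pip install open-cravat==2.0.1
  run rm -r "$CRAVAT_DIR"
  run mkdir -p "$CRAVAT_DIR"   # assumption: recreate the directory before pointing oc at it
  run oc config md "$CRAVAT_DIR"
  run oc module install-base
  run oc module install --yes vest chasmplus vcfreporter mupit clinvar
}

reinstall_cravat
```

Reviewing the printed commands first, and only then re-running with DRY_RUN=0, keeps the destructive step from firing on a mistyped path.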

ConcettaDe4 commented 3 years ago

Hi! I re-installed the cravat software as you suggested and now it works! Thank you for your help!

Concetta

brianjohnhaas commented 3 years ago

great news!
