TrinityCTAT / ctat-mutations

Mutation detection using GATK4 best practices and latest RNA editing filters resources. Works with both Hg38 and Hg19
https://github.com/TrinityCTAT/ctat-mutations

Annotate Cravat step can not read file #121

Closed shivUSF closed 1 year ago

shivUSF commented 1 year ago

Hello,

I am running the pipeline with the Singularity image v3.2.1. Overall the variant run works smoothly; however, at the CRAVAT annotation step it repeatedly fails to read the input file. I have re-created the cravat directory three times now, and every time it runs into this error.

" rn code. See 'continueOnReturnCode' runtime attribute for more details. Check the content of stderr for potential additional information: /mnt/data/ctat_mutation/220300000234-G07/cromwell-executions/ctat_mutations/8ef65cac-6767-44e7-a474-8a012694ef98/call-AnnotateVariants/annotate_variants_wf/aa170140-fec3-4314-9f8d-9e2d803afe16/call-open_cravat/execution/stderr. [First 3000 bytes]:+ echo '########### Annotate CRAVAT #############'

CMD: annotate_with_cravat: oc run /mnt/data/ctat_mutation/220300000234-G07/cromwell-executions/ctat_mutations/8ef65cac-6767-44e7-a474-8a012694ef98/call-AnnotateVariants/annotate_variants_wf/aa170140-fec3-4314-9f8d-9e2d803afe16/call-open_cravat/inputs/638220550/220300000234-G07.annot_cosmic.vcf.gz --module-option vcfreporter.type=separate --system-option modules_dir=/ctat_genome_lib_dir/ctat_mutation_lib/cravat -t vcf -l hg19 -d -n 220300000234-G07.cravat.tmp "

Traceback (most recent call last):
  File "/usr/local/src/ctat-mutations/src/annotate_with_cravat.py", line 61, in <module>
    subprocess.check_call(cravat_cmd)
  File "/opt/conda/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['oc', 'run', '/mnt/data/ctat_mutation/220300000234-G07/cromwell-executions/ctat_mutations/8ef65cac-6767-44e7-a474-8a012694ef98/call-AnnotateVariants/annotate_variants_wf/aa170140-fec3-4314-9f8d-9e2d803afe16/call-open_cravat/inputs/638220550/220300000234-G07.annot_cosmic.vcf.gz', '--module-option', 'vcfreporter.type=separate', '--system-option', 'modules_dir=/ctat_genome_lib_dir/ctat_mutation_lib/cravat', '-t', 'vcf', '-l', 'hg19', '-d', '', '-n', '220300000234-G07.cravat.tmp']' returned non-zero exit status 1.

Is there something specific to this image file, or should I run the pipeline some other way?

brianjohnhaas commented 1 year ago

Hi,

Sorry to hear about the problem.

Could you send me your file '/mnt/data/ctat_mutation/220300000234-G07/cromwell-executions/ctat_mutations/8ef65cac-6767-44e7-a474-8a012694ef98/call-AnnotateVariants/annotate_variants_wf/aa170140-fec3-4314-9f8d-9e2d803afe16/call-open_cravat/inputs/638220550/220300000234-G07.annot_cosmic.vcf.gz' to bhaas at broadinstitute dot org?

I'll see if I can troubleshoot it further.

best,

~brian

On Wed, Dec 7, 2022 at 1:47 PM shivUSF wrote:

+ echo '########### Annotate CRAVAT #############'
cravat_lib_dir=/ctat_genome_lib_dir/ctat_mutation_lib/cravat
'[' /ctat_genome_lib_dir/ctat_mutation_lib/cravat == '' ']'
export TMPDIR=/tmp
TMPDIR=/tmp
/usr/local/src/ctat-mutations/src/annotate_with_cravat.py --input_vcf /mnt/data/ctat_mutation/220300000234-G07/cromwell-executions/ctat_mutations/8ef65cac-6767-44e7-a474-8a012694ef98/call-AnnotateVariants/annotate_variants_wf/aa170140-fec3-4314-9f8d-9e2d803afe16/call-open_cravat/inputs/638220550/220300000234-G07.annot_cosmic.vcf.gz --genome hg19 --cravat_lib_dir /ctat_genome_lib_dir/ctat_mutation_lib/cravat --output_vcf 220300000234-G07.cravat.tmp.vcf
Traceback (most recent call last):
  File "/opt/conda/bin/oc", line 5, in <module>
    from cravat.oc import main
  File "/opt/conda/lib/python3.7/site-packages/cravat/oc.py", line 2, in <module>
    from cravat import cravat_admin, cravat_util
  File "/opt/conda/lib/python3.7/site-packages/cravat/cravat_admin.py", line 451, in <module>
    au.ready_resolution_console()
  File "/opt/conda/lib/python3.7/site-packages/cravat/admin_util.py", line 1328, in ready_resolution_console
    new_md = input(msg)
EOFError: EOF when reading a line

--

Brian J. Haas The Broad Institute http://broadinstitute.org/~bhaas
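A note on the traceback quoted above: it ends in `EOFError: EOF when reading a line`, raised by an `input(msg)` call in `ready_resolution_console`. That is the error Python's built-in `input()` raises when standard input is already at end-of-file, which is typical of non-interactive batch jobs. The thread does not show what the prompt asked or why it was triggered. The following is a minimal Python sketch of the mechanism only; the child one-liner and its prompt text are stand-ins, not OpenCRAVAT code.

```python
import subprocess
import sys

# Stand-in child program: a single interactive prompt (hypothetical text,
# not the actual OpenCRAVAT prompt).
CHILD = "input('modules directory? ')"


def run_without_stdin() -> subprocess.CompletedProcess:
    """Run the child with stdin attached to the null device, so any read hits EOF."""
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
    )


result = run_without_stdin()
print("exit status:", result.returncode)
print(result.stderr.strip().splitlines()[-1])
```

The child exits with a non-zero status and the last line of its stderr is the same `EOFError` message. A wrapper that launches it with `subprocess.check_call` would then report a `CalledProcessError`, as in the original report.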

brianjohnhaas commented 1 year ago

Hi,

I was able to get it to run through on my system. I did create a new singularity image in case you want to try that:

https://data.broadinstitute.org/Trinity/CTAT_SINGULARITY/CTAT_MUTATIONS/

If that still gives you trouble, I can try to make my cravat section of my ctat genome lib available and we can see if that fixes it.

best,

~brian

shivUSF commented 1 year ago

Ohh thank you, will try with this singularity version first.

shivUSF commented 1 year ago

Hey Brian,

I ran it with the new Singularity image; now it's throwing an error at the STAR alignment step, while reading the genome index files. It's reading '[ ]' characters in the genome files.

-

Check the content of stderr for potential additional information: /mnt/data/ctat_mutation/220300000234-H05/cromwell-executions/ctat_mutations/d138c811-350d-4106-9e41-558e51d893b9/call-StarAlign/execution/stderr. [First 3000 bytes]:+ genomeDir=

EXITING because of FATAL ERROR: could not create output file: /220300000234-H05.star.Log.out
SOLUTION: check if the path /220300000234-H05.star. exists and you have permissions to write there

Dec 08 14:56:19 ...... FATAL ERROR, exiting

Should we try just with your cravat section?

brianjohnhaas commented 1 year ago

Interesting. If the issue is happening at STAR, then something else is going on. It might have to do with your initial singularity command. Can you share that?

If you need to share privately, you can email me:

bhaas at broadinstitute dot org

best,

~b

On Thu, Dec 8, 2022 at 4:16 PM shivUSF wrote:

+ genomeDir=
'[' '' == '' ']'
genomeDir=/ctat_genome_lib_dir/ref_genome.fa.star.idx
'[' -f /ctat_genome_lib_dir/ref_genome.fa.star.idx ']'
fastqs='/mnt/data/ctat_mutation/220300000234-H05/cromwell-executions/ctat_mutations/d138c811-350d-4106-9e41-558e51d893b9/call-StarAlign/inputs/-465261608/220300000234-H05_22Mar234-H05_L002_R1_001.fastq.gz /mnt/data/ctat_mutation/220300000234-H05/cromwell-executions/ctat_mutations/d138c811-350d-4106-9e41-558e51d893b9/call-StarAlign/inputs/-465261608/220300000234-H05_22Mar234-H05_L002_R2_001.fastq.gz'
readFilesCommand=
[[ /mnt/data/ctat_mutation/220300000234-H05/cromwell-executions/ctat_mutations/d138c811-350d-4106-9e41-558e51d893b9/call-StarAlign/inputs/-465261608/220300000234-H05_22Mar234-H05_L002_R1_001.fastq.gz == .gz ]]
readFilesCommand='--readFilesCommand "gunzip -c"'
[[ /mnt/data/ctat_mutation/220300000234-H05/cromwell-executions/ctat_mutations/d138c811-350d-4106-9e41-558e51d893b9/call-StarAlign/inputs/-465261608/220300000234-H05_22Mar234-H05_L002_R1_001.fastq.gz == .tar.gz ]]
STAR --genomeDir /ctat_genome_lib_dir/ref_genome.fa.star.idx --runThreadN 8 --readFilesIn /mnt/data/ctat_mutation/220300000234-H05/cromwell-executions/ctat_mutations/d138c811-350d-4106-9e41-558e51d893b9/call-StarAlign/inputs/-465261608/220300000234-H05_22Mar234-H05_L002_R1_001.fastq.gz /mnt/data/ctat_mutation/220300000234-H05/cromwell-executions/ctat_mutations/d138c811-350d-4106-9e41-558e51d893b9/call-StarAlign/inputs/-465261608/220300000234-H05_22Mar234-H05_L002_R2_001.fastq.gz --readFilesCommand '"gunzip' '-c"' --outSAMtype BAM SortedByCoordinate --twopassMode Basic --limitBAMsortRAM 30000000000 --outSAMmapqUnique 60 --outFileNamePrefix /220300000234-H05.star.

shivUSF commented 1 year ago

Hello Brian,

I reinstalled the ctat_genome_lib and ran again with the command below. STAR ran fine this time, but the Annotate CRAVAT step again gave the same error about the "[" "]" characters.

---

singularity exec -e  -B /mnt/data/ -B /mnt/data/GRCh37_gencode_v19_CTAT_lib_Mar012021.plug-n-play/ctat_genome_lib_build_dir:/ctat_genome_lib_dir:ro /home/shivani/ctat_mutations.v3.3.0.simg  /usr/local/src/ctat-mutations/ctat_mutations \
--left /home/shivani/CCB_01/gastric_rna_ikl/rna/220300000234-H05/220300000234-H05_22Mar234-H05_L002_R1_001.fastq.gz --right /home/shivani/CCB_01/gastric_rna_ikl/rna/220300000234-H05/220300000234-H05_22Mar234-H05_L002_R2_001.fastq.gz \
--sample_id 220300000234-H05 --boosting_method none --output /mnt/data/ctat_mutation/220300000234-H05 --genome_lib_dir /ctat_genome_lib_dir --cpu 8

---

brianjohnhaas commented 1 year ago

hmm... sounds like something might be a bit off in the cravat lib.

Here's the one that works well for us: https://data.broadinstitute.org/Trinity/CTAT_RESOURCE_LIB/MUTATION_LIB_SUPPLEMENT/cravat_lib/

Try dropping it in as a replacement on your end, and we'll see if that solves the problem.

best,

~brian

shivUSF commented 1 year ago

Well, it's still throwing the same error:

` Check the content of stderr for potential additional information: /mnt/data/ctat_mutation/220300000234-H05/cromwell-executions/ctat_mutations/195dffb5-8124-4f43-a191-f1970c8afe6b/call-AnnotateVariants/annotate_variants_wf/aac9c6ad-fd77-49b7-b38b-f43aaafff120/call-open_cravat/execution/stderr. [First 3000 bytes]:+ echo '########### Annotate CRAVAT #############'

CMD: annotate_with_cravat: oc run /mnt/data/ctat_mutation/220300000234-H05/cromwell-executions/ctat_mutations/195dffb5-8124-4f43-a191-f1970c8afe6b/call-AnnotateVariants/annotate_variants_wf/aac9c6ad-fd77-49b7-b38b-f43aaafff120/call-open_cravat/inputs/64371628/220300000234-H05.annot_cosmic.vcf.gz --module-option vcfreporter.type=separate --system-option modules_dir=/ctat_genome_lib_dir/ctat_mutation_lib/cravat -t vcf -l hg19 -d -n 220300000234-H05.cravat.tmp

Traceback (most recent call last):
  File "/usr/local/src/ctat-mutations/src/annotate_with_cravat.py", line 61, in <module>
    subprocess.check_call(cravat_cmd)
  File "/opt/conda/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['oc', 'run', '/mnt/data/ctat_mutation/220300000234-H05/cromwell-executions/ctat_mutations/195dffb5-8124-4f43-a191-f1970c8afe6b/call-AnnotateVariants/annotate_variants_wf/aac9c6ad-fd77-49b7-b38b-f43aaafff120/call-open_cravat/inputs/64371628/220300000234-H05.annot_cosmic.vcf.gz', '--module-option', 'vcfreporter.type=separate', '--system-option', 'modules_dir=/ctat_genome_lib_dir/ctat_mutation_lib/cravat', '-t', 'vcf', '-l', 'hg19', '-d', '', '-n', '220300000234-H05.cravat.tmp']' returned non-zero exit status 1.

`

brianjohnhaas commented 1 year ago

Perhaps it has something to do with the /tmp setting.

Can you add -B /tmp:/tmp to your singularity command to make sure that it mounts the /tmp area? I just added it to the instructions, as I've found it helped in some other scenarios; it could be system dependent.

If that doesn't work, we might need to follow up with the cravat team.

best,

~b
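To make the suggestion above concrete, here is a hedged Python sketch that assembles the `singularity exec` command posted earlier in the thread with the extra `/tmp:/tmp` bind added. The paths and pipeline flags are copied from that earlier command; `build_singularity_cmd` is a hypothetical helper written for illustration and is not part of ctat-mutations.

```python
import shlex


def build_singularity_cmd(image, binds, pipeline_args):
    """Assemble a `singularity exec` argv list (hypothetical helper)."""
    cmd = ["singularity", "exec", "-e"]
    for bind in binds:
        cmd += ["-B", bind]  # one -B flag per bind mount
    cmd += [image, "/usr/local/src/ctat-mutations/ctat_mutations"]
    return cmd + pipeline_args


# Paths taken from the command posted earlier in this thread.
GENOME_LIB = "/mnt/data/GRCh37_gencode_v19_CTAT_lib_Mar012021.plug-n-play/ctat_genome_lib_build_dir"
FASTQ_DIR = "/home/shivani/CCB_01/gastric_rna_ikl/rna/220300000234-H05"

cmd = build_singularity_cmd(
    image="/home/shivani/ctat_mutations.v3.3.0.simg",
    binds=[
        "/mnt/data/",
        f"{GENOME_LIB}:/ctat_genome_lib_dir:ro",
        "/tmp:/tmp",  # the additional mount suggested above
    ],
    pipeline_args=[
        "--left", f"{FASTQ_DIR}/220300000234-H05_22Mar234-H05_L002_R1_001.fastq.gz",
        "--right", f"{FASTQ_DIR}/220300000234-H05_22Mar234-H05_L002_R2_001.fastq.gz",
        "--sample_id", "220300000234-H05",
        "--boosting_method", "none",
        "--output", "/mnt/data/ctat_mutation/220300000234-H05",
        "--genome_lib_dir", "/ctat_genome_lib_dir",
        "--cpu", "8",
    ],
)

# Print a shell-quoted command line that can be pasted into a terminal.
print(shlex.join(cmd))
```

Building the command as a list keeps each bind mount a separate argument, so the new mount can be added or removed without re-quoting the rest of the line.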

shivUSF commented 1 year ago

Interesting, this seems to have worked: it went through the CRAVAT run this time. Thanks, Brian, for the help.

brianjohnhaas commented 1 year ago

Great to hear! I wish I'd thought of this sooner. These things are tricky when they're not reproducible across systems; I don't quite understand why it works fine on my system without the additional mount being explicitly set.

Well, documentation is now updated and others will benefit too.

best,

~brian

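For anyone hitting similar behaviour: the stderr earlier in the thread shows the task exporting `TMPDIR=/tmp`, and the run went through once `/tmp` was bind-mounted. The thread never establishes the root cause, so the following is only a hedged diagnostic sketch, in Python, for checking whether the temp directory is usable from inside the container. `tmpdir_is_usable` is a hypothetical helper, not part of ctat-mutations or OpenCRAVAT.

```python
import os
import tempfile


def tmpdir_is_usable(path: str) -> bool:
    """Return True if `path` is a directory where a temp file can be created and written."""
    if not os.path.isdir(path):
        return False
    try:
        # The file is removed automatically when the context manager exits.
        with tempfile.NamedTemporaryFile(dir=path) as handle:
            handle.write(b"probe")
            handle.flush()
        return True
    except OSError:
        return False


# Fall back to /tmp, the value the pipeline task exports in the stderr above.
tmpdir = os.environ.get("TMPDIR", "/tmp")
print(f"{tmpdir}: usable={tmpdir_is_usable(tmpdir)}")
```

Running this through `singularity exec` with and without `-B /tmp:/tmp` would show whether the two setups differ on a given system; that comparison is a suggestion, not something tested in this thread.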

shivUSF commented 1 year ago

That's true, reproducibility across systems makes it a little tricky even with Docker / Singularity, but it's a good lesson. It's funny how sometimes the simplest way is the solution. Thanks, Brian, for all the help.