TrinityCTAT / ctat-mutations

Mutation detection using GATK4 best practices and latest RNA editing filters resources. Works with both Hg38 and Hg19
https://github.com/TrinityCTAT/ctat-mutations

subprocess.CalledProcessError #110

Open min0609 opened 2 years ago

min0609 commented 2 years ago

Hi, first of all, thank you for this wonderful tool. I ran CTAT_Mutations v3.2.0 in a Conda environment and I am wondering why a 'subprocess.CalledProcessError' is being raised.

The command line is:

python ctat-mutations-CTAT-Mutations-v3.2.0/ctat_mutations --genome_lib_dir GRCh38_gencode_v37_CTAT_lib_Mar012021.plug-n-play/ctat_genome_lib_build_dir/ --left ctat-mutations-CTAT-Mutations-v3.2.0/testing/reads_1.fastq.gz --right ctat-mutations-CTAT-Mutations-v3.2.0/testing/reads_2.fastq.gz --sample_id test --outputdir outdir/test

and the error message is:

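[image: error_message] https://user-images.githubusercontent.com/105622235/178235105-b82a8102-e335-492e-9dd9-e1149fe2e661.png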

May I ask you why this is happening? Thank you very much in advance!

brianjohnhaas commented 2 years ago

Hi,

It looks like it's not finding bedtools.
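A quick way to confirm this kind of problem in a Conda environment is to check whether the executable is visible on PATH before launching the pipeline. The sketch below is only an illustration: `find_missing_tools` is a hypothetical helper, and the tool list contains only bedtools because that is the one dependency named in this thread.

```python
import shutil


def find_missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Only bedtools is named in this thread; extend the list as needed.
    missing = find_missing_tools(["bedtools"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

If a tool is reported as missing, installing it into the active environment or switching to the container images avoids the failure.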

The best way to run ctat mutations given all its installation requirements is to just use our singularity or docker images. Singularity is easiest IMO.

https://github.com/NCIP/ctat-mutations/wiki/CTAT-mutations-installation

best,

~b


--

Brian J. Haas The Broad Institute http://broadinstitute.org/~bhaas http://broad.mit.edu/~bhaas

min0609 commented 2 years ago

Thank you for your help, Brian!

I used Singularity to run ctat mutations as you recommended, but it gave me another error.

The command line is:

singularity exec -e -B `pwd`:/data -B /ycga-gpfs/scratch60/lifton/jc2545/escc/CTAT_mutation_apptainer3.2.0/GRCh38_gencode_v37_CTAT_lib_Mar012021.plug-n-play/ctat_genome_lib_build_dir:/ctat_genome_lib_dir:ro /ycga-gpfs/scratch60/lifton/jc2545/.apptainer/ctat_mutations.v3.2.0.simg /usr/local/src/ctat-mutations/ctat_mutations --left /data/reads_1.fastq.gz --right /data/reads_2.fastq.gz --sample_id test --output /data/out --cpu 10 --genome_lib_dir /ctat_genome_lib_dir

The error message is:

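[image: ctat_mutations_singularity_error] https://user-images.githubusercontent.com/105622235/179438070-441aaff7-1352-4b30-be8b-46c7952c81f7.png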

Could you please suggest a way to fix it?

Thanks, again!

brianjohnhaas commented 2 years ago

Hi,

If you're using the small sample data, there aren't enough features to run the machine learning / boosting steps on. Just include

--boosting_method none

and it'll skip that part.

With larger inputs, running the default boosting generally improves the results slightly.

best,

~b


min0609 commented 2 years ago

Thanks for your help again.

I was able to run ctat_mutations successfully with the small test sample. However, I have an additional question. My data are single-end sequencing data (single-cell RNA-seq). Which options should I use on the command line?

Could you please check whether the command line below is right?

singularity exec -e -B `pwd`:/data -B /ycga-gpfs/scratch60/lifton/jc2545/escc/CTAT_mutation_apptainer3.2.0/GRCh38_gencode_v37_CTAT_lib_Mar012021.plug-n-play/ctat_genome_lib_build_dir:/ctat_genome_lib_dir:ro /ycga-gpfs/scratch60/lifton/jc2545/.apptainer/ctat_mutations.v3.2.0.simg /usr/local/src/ctat-mutations/ctat_mutations --left /data/fastq/EGS-20-D3-NUC_S2_L004_R2_001.fastq.gz --sample_id EGS-20-D3-NUC --output /data/out_EGS-20-D3-NUC --cpu 10 --genome_lib_dir /ctat_genome_lib_dir

Thanks, again!

brianjohnhaas commented 2 years ago

Hi,

The command looks right - you just set the --left parameter if you have single-end reads. We haven't looked at the impact of boosting on single cell data yet. If it gives you trouble, you could disable that here too.
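For anyone scripting this over many samples, the single-end versus paired-end difference can be captured in a small argument builder. This is a sketch, not part of ctat-mutations: `build_ctat_args` and `CTAT_EXE` are hypothetical names, the executable path and paths in the demo are the ones used in the commands in this thread, and only flags that appear in this thread are emitted. The `singularity exec` wrapper is left out.

```python
import shlex

# Path of the executable inside the container, as used in this thread.
CTAT_EXE = "/usr/local/src/ctat-mutations/ctat_mutations"


def build_ctat_args(left, sample_id, output, genome_lib_dir,
                    right=None, cpu=None, skip_boosting=False):
    """Assemble a ctat_mutations argument list (hypothetical helper)."""
    args = [CTAT_EXE, "--left", left]
    if right is not None:
        # Paired-end data: the second mate goes to --right.
        args += ["--right", right]
    args += ["--sample_id", sample_id,
             "--output", output,
             "--genome_lib_dir", genome_lib_dir]
    if cpu is not None:
        args += ["--cpu", str(cpu)]
    if skip_boosting:
        # Same effect as adding "--boosting_method none" by hand.
        args += ["--boosting_method", "none"]
    return args


if __name__ == "__main__":
    # Single-end run: only --left is given, boosting disabled.
    single_end = build_ctat_args(
        left="/data/fastq/EGS_28_D1.fastq.gz",
        sample_id="EGS_28_D1",
        output="/data/out_EGS_28_D1",
        genome_lib_dir="/ctat_genome_lib_dir",
        cpu=10,
        skip_boosting=True,
    )
    print(shlex.join(single_end))
```

Returning a list rather than a string keeps the result directly usable with `subprocess.run` and avoids quoting problems with unusual file names.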


min0609 commented 2 years ago

Thank you so much!

min0609 commented 2 years ago

Hi,

I'm running ctat on 33 samples, but I'm getting an error for 5 of them. I'm asking because I could not find this error in any earlier report. I have attached the stderr file for each failing sample.

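EGS_28_D1_stderr.txt https://github.com/NCIP/ctat-mutations/files/9331601/EGS_28_D1_stderr.txt
EGS_28_D2_stderr.txt https://github.com/NCIP/ctat-mutations/files/9331603/EGS_28_D2_stderr.txt
EGS_37_T_stderr.txt https://github.com/NCIP/ctat-mutations/files/9331604/EGS_37_T_stderr.txt
ESCC_21_N4_NUC_stderr.txt https://github.com/NCIP/ctat-mutations/files/9331605/ESCC_21_N4_NUC_stderr.txt
ESCC_24_D1_NUC_sderr.txt https://github.com/NCIP/ctat-mutations/files/9331606/ESCC_24_D1_NUC_sderr.txt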

The command line is:

singularity exec -e -B `pwd`:/data -B /ycga-gpfs/scratch60/lifton/jc2545/escc/CTAT_mutation_apptainer3.2.0/GRCh38_gencode_v37_CTAT_lib_Mar012021.plug-n-play/ctat_genome_lib_build_dir:/ctat_genome_lib_dir:ro /ycga-gpfs/scratch60/lifton/jc2545/.apptainer/ctat_mutations.v3.2.0.simg /usr/local/src/ctat-mutations/ctat_mutations --left /data/fastq/EGS_28_D1.fastq.gz --sample_id EGS_28_D1 --output /data/out_100pct/out_EGS_28_D1 --cpu 10 --genome_lib_dir /ctat_genome_lib_dir --boosting_method none

Could you please give me some advice on how to solve it? Thank you very much in advance!

brianjohnhaas commented 2 years ago

Hi,

I'm not seeing any fatal error messages here, just logging info. Is the process not completing? If it's timing out, it can be rerun and should pick up where it left off. Some of the annotation steps can take a while to complete.
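Since a rerun should pick up where the previous run left off, reruns can be automated for failures that surface as a non-zero exit status. The following is a minimal sketch under that assumption: `run_with_retries` is a hypothetical wrapper, not something shipped with ctat-mutations, and it will not help if the scheduler kills the whole job on timeout.

```python
import subprocess
import sys


def run_with_retries(cmd, max_attempts=3, runner=subprocess.run):
    """Run `cmd` until it exits with status 0.

    Returns the number of attempts used; raises RuntimeError if every
    attempt exits with a non-zero status.
    """
    for attempt in range(1, max_attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        print(f"attempt {attempt} exited with status {result.returncode}")
    raise RuntimeError(f"command failed after {max_attempts} attempts")


if __name__ == "__main__":
    # Stand-in command; replace it with the full ctat_mutations
    # argument list for one sample.
    attempts = run_with_retries([sys.executable, "-c", "pass"])
    print("succeeded after", attempts, "attempt(s)")
```

The `runner` parameter exists only so the retry logic can be exercised without launching a real process.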

~b


min0609 commented 2 years ago

When I ran it again, I was able to process all of my samples.

Thank you again!

brianjohnhaas commented 2 years ago

Great, thanks for letting me know!


min0609 commented 2 years ago

Hi,

I need a file named 'HaplotypeCaller.raw_variants.vcf_snpeff.vcf_adj.vcf_dbsnp.vcf_RNAedit.vcf_PASSreads.vcf.gz' to run a tool that identifies mutational signatures. However, I could not find a file with that name. Should I use a different version from the ctat_mutations.v3.2.0.simg (Singularity) image I used? By the way, I am using single-end scRNA-seq data.

Thanks again for offering help.

brianjohnhaas commented 2 years ago

Hi,

The filenames for intermediate outputs may have changed across different software releases.

There should be a vcf output file that contains the annotated VCF. Are you finding that output file in the main output directory?
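Because the exact file names can differ between releases, it may be easier to list every VCF under the output directory than to look for one specific name. The sketch below assumes nothing about the naming scheme: `list_vcf_files` is a hypothetical helper that matches on the `.vcf` and `.vcf.gz` suffixes only, and the demo runs on a throwaway directory with made-up file names.

```python
import tempfile
from pathlib import Path


def list_vcf_files(output_dir):
    """Return sorted paths under `output_dir` ending in .vcf or .vcf.gz."""
    root = Path(output_dir)
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.name.endswith((".vcf", ".vcf.gz"))
    )


if __name__ == "__main__":
    # Demo on a temporary directory; point list_vcf_files at the real
    # --output directory instead.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "example.vcf.gz").touch()
        (Path(tmp) / "notes.txt").touch()
        for path in list_vcf_files(tmp):
            print(path.name)
```

The search is recursive, so intermediate VCFs in subdirectories are listed alongside the final annotated one.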

best,

~brian
