ConesaLab / SQANTI3

Tool for the Quality Control of Long-Read Defined Transcriptomes
GNU General Public License v3.0

GeneMark issue #294

Closed by ghost 6 months ago

ghost commented 6 months ago

Is there an existing issue for this?

Have you loaded the SQANTI3.env conda environment?

Problem description

I get the following error when running sqanti3_qc.py.

Code sample

python3 sqanti3_qc.py example/UHR_chr22.gtf example/gencode.v38.basic_chr22.gtf example/GRCh38.p13_chr22.fasta \
    --CAGE_peak data/ref_TSS_annotation/human.refTSS_v3.1.hg38.bed \
    --polyA_motif_list data/polyA_motifs/mouse_and_human.polyA_motif.txt \
    -o UHR_chr22 -d example/SQANTI3_output1 \
    -fl example/UHR_abundance.tsv \
    --short_reads example/UHR_chr22_short_reads.fofn \
    --cpus 4 --report both

Error

Rscript (R) version 4.3.2 (2023-10-31)
Write arguments to /Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/example/SQANTI3_output1/UHR_chr22.params.txt...
Running SQANTI3...
Parsing provided files....
Reading genome fasta /Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/example/GRCh38.p13_chr22.fasta....
Skipping aligning of sequences because GTF file was provided.

Indels will be not calculated since you ran SQANTI3 without alignment step (SQANTI3 with gtf format as transcriptome input).
**** Predicting ORF sequences...
/Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/utilities/gmst/probuild: /Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/utilities/gmst/probuild: cannot execute binary file
GeneMarkS: error on last system call, error code 32256
Abort program!!!
Traceback (most recent call last):
  File "/Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/sqanti3_qc.py", line 2525, in <module>
    main()
  File "/Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/sqanti3_qc.py", line 2508, in main
    run(args)
  File "/Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/sqanti3_qc.py", line 1853, in run
    orfDict = correctionPlusORFpred(args, genome_dict)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/sqanti3_qc.py", line 587, in correctionPlusORFpred
    if subprocess.check_call(cmd, shell=True, cwd=gmst_dir)!=0:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/LGJ-15/miniconda3/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'perl /Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/utilities/gmst/gmst.pl -faa --strand direct --fnn --output /Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/example/SQANTI3_output1/GMST/GMST_tmp /Users/LGJ-15/Tools/biotools/SQANTI3-5.2.1/example/SQANTI3_output1/UHR_chr22_corrected.fasta' returned non-zero exit status 1.
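Editorial note: the "error code 32256" in the log above looks like a raw POSIX wait status, not a plain exit code. This is an assumption, based on gmst.pl being a Perl script and Perl storing the wait status of a system call in `$?`. Under that assumption, the sketch below splits the value into its parts; `decode_wait_status` is a hypothetical helper written for illustration and is not part of SQANTI3 or GeneMark.

```python
def decode_wait_status(status: int) -> dict:
    """Split a 16-bit POSIX wait status into exit code and signal fields.

    Assumption: the number printed by GeneMarkS is the raw wait status
    (as in Perl's $?), not an already-decoded exit code.
    """
    return {
        "exit_code": status >> 8,            # high byte: the child's exit code
        "signal": status & 0x7F,             # low 7 bits: terminating signal, 0 if none
        "core_dumped": bool(status & 0x80),  # bit 7: core dump flag
    }


info = decode_wait_status(32256)
print(info["exit_code"])  # 126
```

By shell convention, exit code 126 means the command was found but could not be executed, which matches the "cannot execute binary file" message. Since the /Users/ paths suggest macOS, one plausible reading is that the bundled probuild binary was built for a different platform. That is an inference, not something the thread confirms.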

Anything else?

No response

carolinamonzo commented 6 months ago

Hi @guojunliu7,

This issue has been previously reported in #104, #117, #185 and #199.

I'm copying below the recommendations from @aarzalluz, which we agree on:

SQANTI3 currently runs GeneMark for ORF prediction, a tool that is unfortunately no longer maintained or updated. Errors related to GMST will therefore not be worked on. However, given the large number of issues it is generating, we intend to find an alternative way to predict ORFs in future SQANTI3 releases as soon as we can.

In the meantime, the --skipORF argument can be used to skip ORF prediction entirely and prevent SQANTI3 from crashing. We apologize for any inconvenience.
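As a minimal sketch of the workaround above, the snippet below appends `--skipORF` to an existing sqanti3_qc.py command line. The `--skipORF` flag and the file paths come from this thread; `with_skip_orf` is a hypothetical helper introduced for illustration, and the command is shortened to the positional inputs plus `-o` and `-d`.

```python
import shlex


def with_skip_orf(command: str) -> list:
    """Return the command as an argument list with --skipORF added once."""
    args = shlex.split(command)
    if "--skipORF" not in args:
        args.append("--skipORF")
    return args


# Shortened form of the command from the issue (optional inputs left out).
original = (
    "python3 sqanti3_qc.py example/UHR_chr22.gtf "
    "example/gencode.v38.basic_chr22.gtf example/GRCh38.p13_chr22.fasta "
    "-o UHR_chr22 -d example/SQANTI3_output1"
)

# Print a shell-ready command line that can be pasted into a terminal.
print(shlex.join(with_skip_orf(original)))
```

The resulting argument list can also be passed directly to `subprocess.run`. Note that with ORF prediction skipped, the output will not contain ORF-derived information.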