ComparativeGenomicsToolkit / Comparative-Annotation-Toolkit


Unable to run annotation pipeline. Target h2tg000040l block 10438982-10444550 exceeds sequence length 10441754 #318

Open smkumaill opened 7 months ago

smkumaill commented 7 months ago

Hello,

I built a pangenome using the Minigraph-Cactus pipeline. When I try to annotate the underlying assemblies using the resulting HAL file, I get the following error:

=========>
[2024-04-13T10:18:41+0000] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2024-04-13T10:18:41+0000] [MainThread] [I] [toil] Running Toil version 5.0.0-f182c6420554b258632a40bfa47a8f69e56675e4 on host A1002.
[2024-04-13T10:18:41+0000] [MainThread] [I] [toil.worker] Working on job 'JobFunctionWrappingJob' kind-JobFunctionWrappingJob/9/instance-rmbeyxru
[2024-04-13T10:18:42+0000] [MainThread] [I] [luigi-interface] Loaded []
[2024-04-13T10:18:44+0000] [MainThread] [I] [toil.worker] Loaded body Job('JobFunctionWrappingJob' kind-JobFunctionWrappingJob/9/instance-rmbeyxru) from description 'JobFunctionWrappingJob' kind-JobFunctionWrappingJob/9/instance-rmbeyxru
[2024-04-13T10:18:44+0000] [MainThread] [I] [toil.fileStores.abstractFileStore] LOG-TO-MASTER: Beginning to chain chromosome jor1.2-chrY
target h2tg000040l block 10438982-10444550 exceeds sequence length 10441754
[2024-04-13T10:18:57+0000] [MainThread] [W] [toil.fileStores.abstractFileStore] Failed job accessed files:
[2024-04-13T10:18:57+0000] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/no-job/file-0fcb6465dc6b4ebb809de5c0a3dd8573/apr_review_v1_2902_chm13.full.hal' to path '/data/tmp/node-a4459971-75ec-41bd-8132-acd5bd3f0c80-930d802f477a4282ac95aac0ecb410eb/tmpxedmze7p/2d548763-c823-4776-a1b9-94d8c39935d8/tmpbe05uppr.tmp'
[2024-04-13T10:18:57+0000] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/no-job/file-5453c33a040140c983d36e906a973ef3/jor1.2.2bit' to path '/data/tmp/node-a4459971-75ec-41bd-8132-acd5bd3f0c80-930d802f477a4282ac95aac0ecb410eb/tmpxedmze7p/2d548763-c823-4776-a1b9-94d8c39935d8/tmpo44sgatm.tmp'
[2024-04-13T10:18:57+0000] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/no-job/file-b4a8bdaf70e54b39af1922f9d460a775/CHM13.2bit' to path '/data/tmp/node-a4459971-75ec-41bd-8132-acd5bd3f0c80-930d802f477a4282ac95aac0ecb410eb/tmpxedmze7p/2d548763-c823-4776-a1b9-94d8c39935d8/tmpizwtegxf.tmp'
Traceback (most recent call last):
  File "/data/software/miniconda3/envs/cat/lib/python3.7/site-packages/toil/worker.py", line 393, in workerScript
    job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
  File "/data/software/miniconda3/envs/cat/lib/python3.7/site-packages/toil/job.py", line 2358, in _runner
    returnValues = self._run(jobGraph=None, fileStore=fileStore)
  File "/data/software/miniconda3/envs/cat/lib/python3.7/site-packages/toil/job.py", line 2279, in _run
    return self.run(fileStore)
  File "/data/software/miniconda3/envs/cat/lib/python3.7/site-packages/toil/job.py", line 2502, in run
    rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
  File "/data/software/cat/Comparative-Annotation-Toolkit/cat/chaining.py", line 124, in chain_by_chromosome
    tools.procOps.run_proc(cmd)
  File "/data/software/cat/Comparative-Annotation-Toolkit/tools/procOps.py", line 73, in run_proc
    pl.wait()
  File "/data/software/cat/Comparative-Annotation-Toolkit/tools/pipeline.py", line 1127, in wait
    self.raiseIfExcept()
  File "/data/software/cat/Comparative-Annotation-Toolkit/tools/pipeline.py", line 1085, in raiseIfExcept
    p.raiseIfExcept()
  File "/data/software/cat/Comparative-Annotation-Toolkit/tools/pipeline.py", line 749, in raiseIfExcept
    raise self.exceptInfo[0].with_traceback(self.exceptInfo[2])
tools.pipeline.ProcException: process exited 255: docker run -i --rm -u 0:0 --env TMPDIR=/data/tmp -v /data/tmp:/data/tmp -v /data/tmp/node-a4459971-75ec-41bd-8132-acd5bd3f0c80-930d802f477a4282ac95aac0ecb410eb/tmpxedmze7p/2d548763-c823-4776-a1b9-94d8c39935d8:/data/tmp/node-a4459971-75ec-41bd-8132-acd5bd3f0c80-930d802f477a4282ac95aac0ecb410eb/tmpxedmze7p/2d548763-c823-4776-a1b9-94d8c39935d8 quay.io/ucsc_cgl/cat axtChain -psl -verbose=0 -linearGap=medium /dev/stdin /data/tmp/node-a4459971-75ec-41bd-8132-acd5bd3f0c80-930d802f477a4282ac95aac0ecb410eb/tmpxedmze7p/2d548763-c823-4776-a1b9-94d8c39935d8/tmpo44sgatm.tmp /data/tmp/node-a4459971-75ec-41bd-8132-acd5bd3f0c80-930d802f477a4282ac95aac0ecb410eb/tmpxedmze7p/2d548763-c823-4776-a1b9-94d8c39935d8/tmpizwtegxf.tmp /data/tmp/node-a4459971-75ec-41bd-8132-acd5bd3f0c80-930d802f477a4282ac95aac0ecb410eb/tmpxedmze7p/2d548763-c823-4776-a1b9-94d8c39935d8/A1002.547740.2247346878.tmp
[2024-04-13T10:18:57+0000] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host A1002

<=========
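
To make the numbers in the failing line easier to read, here is a minimal Python sketch that pulls the coordinates out of the "exceeds sequence length" message and reports how far the block runs past the end of the sequence. The helper name `block_overshoot` and the regular expression are only illustrative (they are not part of CAT, Toil, or axtChain), and the pattern assumes the message has exactly the wording shown in the log above.

```python
import re

# Illustrative pattern only: it mirrors the wording of the error line in the log above.
ERROR_RE = re.compile(
    r"target (?P<name>\S+) block (?P<start>\d+)-(?P<end>\d+) "
    r"exceeds sequence length (?P<length>\d+)"
)


def block_overshoot(message):
    """Return (sequence name, bases past the reported length), or None if no match."""
    match = ERROR_RE.search(message)
    if match is None:
        return None
    end = int(match.group("end"))
    length = int(match.group("length"))
    return match.group("name"), end - length


msg = "target h2tg000040l block 10438982-10444550 exceeds sequence length 10441754"
print(block_overshoot(msg))  # ('h2tg000040l', 2796)
```

So the block starts inside h2tg000040l (10438982 is below 10441754) but its end coordinate lies 2796 bases beyond the length reported in the error.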

I am using CHM13v2, downloaded from https://github.com/marbl/CHM13, as the reference, together with the UCSC GENCODEv35 CAT/Liftoff v2 annotation file (https://s3-us-west-2.amazonaws.com/human-pangenomics/T2T/CHM13/assemblies/annotation/chm13.draft_v2.0.gene_annotation.gff3) from that repository.

I am using this annotation because I have been unable to reformat the JHU RefSeqv110 + Liftoff v5.1 GFF3 file (https://s3-us-west-2.amazonaws.com/human-pangenomics/T2T/CHM13/assemblies/annotation/chm13v2.0_RefSeq_Liftoff_v5.1.gff3.gz) to CAT specifications.