ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

cactus-align error #948

Open dudududu12138 opened 1 year ago

dudududu12138 commented 1 year ago

Hello, I constructed a pangenome graph following the quick start section of your documentation (https://github.com/ComparativeGenomicsToolkit/cactus/blob/master/doc/pangenome.md). The first two steps (1. cactus-minigraph, 2. cactus-graphmap) ran fine, but the third step (cactus-align) reports an error. My Cactus version is v2.4.2. My commands are listed below:

singularity exec $sif cactus-minigraph \
 ./jobstore1 seqfile1.txt hg38.nrs.sv.gfa.gz --reference Ref \
 --defaultMemory 80G \
 --defaultCores 20 \
 --defaultDisk 50G  

singularity exec $sif cactus-graphmap \
 ./jobstore1 seqfile1.txt hg38.nrs.sv.gfa.gz hg38.nrs.paf \
 --reference Ref --outputFasta hg38.nrs.sv.gfa.fa.gz

singularity exec $sif cactus-align \
 ./jobstore1 seqfile1.txt hg38.nrs.paf \
 hg38.nrs.hal \
 --pangenome --outVG --reference Ref \
 --defaultCores 20 --defaultMemory 80G

The error messages are listed below:

[2023-03-07T15:25:57+0800] [MainThread] [I] [toil.statsAndLogging] Enabling realtime logging in Toil
[2023-03-07T15:25:57+0800] [MainThread] [I] [toil.statsAndLogging] Cactus Command: /home/cactus/cactus_env/bin/cactus-align ./jobstore2 seqfile1.txt hg38.nrs.paf hg38.nrs1.hal --pangenome --outGFA --reference Ref --defaultCores 20 --defaultMemory 80G
[2023-03-07T15:25:57+0800] [MainThread] [I] [toil.statsAndLogging] Cactus Commit: 2d6c076af80c66f4948dade61043e68c6bec5a47
[2023-03-07T15:26:01+0800] [MainThread] [I] [toil.statsAndLogging] Importing file:///lustre/home/acct-clswcc/clswcc-zqd/data/hg38/hg38.fa
[2023-03-07T15:26:01+0800] [MainThread] [I] [toil.statsAndLogging] Importing file:///lustre/home/acct-clswcc/clswcc-jd/gastricCancer/data/NRS/New.nonreference.fa
[2023-03-07T15:26:01+0800] [MainThread] [I] [toil.statsAndLogging] Importing file:///lustre/home/acct-clswcc/clswcc-jd/gastricCancer/cactus/hg38.nrs.sv.gfa.fa.gz
[2023-03-07T15:26:01+0800] [MainThread] [I] [toil.job] Saving graph of 1 jobs, 1 new
[2023-03-07T15:26:01+0800] [MainThread] [I] [toil.job] Processing job 'batch_align_jobs' kind-batch_align_jobs/instance-9ga7zuee v0
[2023-03-07T15:26:01+0800] [MainThread] [I] [toil] Running Toil version 5.9.2-54bfe0b146b76ecc6221de384c255e1be89547c6 on host cas595.pi.sjtu.edu.cn.
[2023-03-07T15:26:01+0800] [MainThread] [I] [toil.realtimeLogger] Starting real-time logging.
[2023-03-07T15:26:01+0800] [MainThread] [I] [toil.leader] Issued job 'batch_align_jobs' kind-batch_align_jobs/instance-9ga7zuee v1 with job batch system ID: 0 and disk: 2.0 Gi, memory: 74.5 Gi, cores: 20.0, accelerators: [], preemptible: False
[2023-03-07T15:26:02+0800] [MainThread] [I] [toil.worker] Redirecting logging to /tmp/df23b7ffcc27594cad42a85c2779d82a/7072/worker_log.txt
[2023-03-07T15:26:02+0800] [MainThread] [I] [toil.leader] 0 jobs are running, 0 jobs are issued and waiting to run
[2023-03-07T15:26:02+0800] [MainThread] [I] [toil.leader] Issued job 'filter_paf' kind-filter_paf/instance-vb_5py6q v1 with job batch system ID: 1 and disk: 2.0 Gi, memory: 74.5 Gi, cores: 20.0, accelerators: [], preemptible: False
[2023-03-07T15:26:02+0800] [MainThread] [I] [toil.leader] Issued job 'sanitize_fasta_headers' kind-sanitize_fasta_headers/instance-ui3sycnh v1 with job batch system ID: 2 and disk: 2.0 Gi, memory: 74.5 Gi, cores: 20.0, accelerators: [], preemptible: False
[2023-03-07T15:26:03+0800] [MainThread] [I] [toil.worker] Redirecting logging to /tmp/df23b7ffcc27594cad42a85c2779d82a/5869/worker_log.txt
[2023-03-07T15:26:03+0800] [MainThread] [I] [toil-rt] Running PAF filter with minBlock=250000 minMAPQ=5 minIdentity=0.5
[2023-03-07T15:26:03+0800] [MainThread] [I] [toil-rt] 2023-03-07 15:26:03.342496: Running the command: "gaffilter /tmp/df23b7ffcc27594cad42a85c2779d82a/5869/01e9/tmpshutr74p/mg.paf.filter -p -r 5.0 -m 0.0 -b 250000 -q 5 -i 0.5"
[2023-03-07T15:26:03+0800] [Thread-1 (daddy)] [E] [toil.batchSystems.singleMachine] Got exit code 1 (indicating failure) from job _toil_worker filter_paf file:/lustre/home/acct-clswcc/clswcc-jd/gastricCancer/cactus/jobstore2 kind-filter_paf/instance-vb_5py6q.
[2023-03-07T15:26:03+0800] [MainThread] [W] [toil.leader] Job failed with exit value 1: 'filter_paf' kind-filter_paf/instance-vb_5py6q v1
Exit reason: None
[2023-03-07T15:26:03+0800] [MainThread] [W] [toil.leader] The job seems to have left a log file, indicating failure: 'filter_paf' kind-filter_paf/instance-vb_5py6q v2
[2023-03-07T15:26:03+0800] [MainThread] [W] [toil.leader] Log from job "kind-filter_paf/instance-vb_5py6q" follows:
=========>
        [2023-03-07T15:26:03+0800] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
        [2023-03-07T15:26:03+0800] [MainThread] [I] [toil] Running Toil version 5.9.2-54bfe0b146b76ecc6221de384c255e1be89547c6 on host cas595.pi.sjtu.edu.cn.
        [2023-03-07T15:26:03+0800] [MainThread] [I] [toil.worker] Working on job 'filter_paf' kind-filter_paf/instance-vb_5py6q v1
        [2023-03-07T15:26:03+0800] [MainThread] [I] [toil.worker] Loaded body Job('filter_paf' kind-filter_paf/instance-vb_5py6q v1) from description 'filter_paf' kind-filter_paf/instance-vb_5py6q v1
        [2023-03-07T15:26:03+0800] [MainThread] [I] [toil-rt] Running PAF filter with minBlock=250000 minMAPQ=5 minIdentity=0.5
        [2023-03-07T15:26:03+0800] [MainThread] [I] [toil-rt] 2023-03-07 15:26:03.342496: Running the command: "gaffilter /tmp/df23b7ffcc27594cad42a85c2779d82a/5869/01e9/tmpshutr74p/mg.paf.filter -p -r 5.0 -m 0.0 -b 250000 -q 5 -i 0.5"
        [2023-03-07T15:26:03+0800] [MainThread] [W] [toil.fileStores.abstractFileStore] Failed job accessed files:
        [2023-03-07T15:26:03+0800] [MainThread] [W] [toil.fileStores.abstractFileStore] Downloaded file 'files/no-job/file-fa11f10eb7c84f6f96d532c267c72d5f/hg38.nrs.paf' to path '/tmp/df23b7ffcc27594cad42a85c2779d82a/5869/01e9/tmpshutr74p/mg.paf'
        Traceback (most recent call last):
          File "/home/cactus/cactus_env/lib/python3.10/site-packages/toil/worker.py", line 403, in workerScript
            job._runner(jobGraph=None, jobStore=jobStore, fileStore=fileStore, defer=defer)
          File "/home/cactus/cactus_env/lib/python3.10/site-packages/toil/job.py", line 2743, in _runner
            returnValues = self._run(jobGraph=None, fileStore=fileStore)
          File "/home/cactus/cactus_env/lib/python3.10/site-packages/toil/job.py", line 2660, in _run
            return self.run(fileStore)
          File "/home/cactus/cactus_env/lib/python3.10/site-packages/toil/job.py", line 2888, in run
            rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
          File "/home/cactus/cactus_env/lib/python3.10/site-packages/cactus/refmap/cactus_graphmap.py", line 453, in filter_paf
            cactus_call(parameters=['gaffilter', filter_paf_path, '-p', '-r', str(overlap_ratio), '-m', str(length_ratio),
          File "/home/cactus/cactus_env/lib/python3.10/site-packages/cactus/shared/common.py", line 839, in cactus_call
            raise RuntimeError("{}Command {} exited {}: {}".format(sigill_msg, call, process.returncode, out))
        RuntimeError: Command /usr/bin/time -v gaffilter /tmp/df23b7ffcc27594cad42a85c2779d82a/5869/01e9/tmpshutr74p/mg.paf.filter -p -r 5.0 -m 0.0 -b 250000 -q 5 -i 0.5 exited 134: stdout=None, stderr=gaffilter: paf.hpp:75: PafLine parse_paf_line(const string&): Assertion `tag_toks.size() == 3' failed.
        Command terminated by signal 6
                Command being timed: "gaffilter /tmp/df23b7ffcc27594cad42a85c2779d82a/5869/01e9/tmpshutr74p/mg.paf.filter -p -r 5.0 -m 0.0 -b 250000 -q 5 -i 0.5"
                User time (seconds): 0.02
                System time (seconds): 0.03
                Percent of CPU this job got: 66%
                Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.09
                Average shared text size (kbytes): 0
                Average unshared data size (kbytes): 0
                Average stack size (kbytes): 0
                Average total size (kbytes): 0
                Maximum resident set size (kbytes): 6380
                Average resident set size (kbytes): 0
                Major (requiring I/O) page faults: 10
                Minor (reclaiming a frame) page faults: 1923
                Voluntary context switches: 123
                Involuntary context switches: 1
                Swaps: 0
                File system inputs: 1600
                File system outputs: 0
                Socket messages sent: 0
                Socket messages received: 0
                Signals delivered: 0
                Page size (bytes): 4096
                Exit status: 0

        [2023-03-07T15:26:03+0800] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host cas595.pi.sjtu.edu.cn
<=========

glennhickey commented 1 year ago

That is very strange. My only guess is that hg38.nrs.paf is somehow corrupt. Can you share the output of tail hg38.nrs.paf ?

dudududu12138 commented 1 year ago

Thank you for your reply. This is my output of tail hg38.nrs.paf

id=Ref|HLA-B*67:01:02   2675    0       2675    +       id=_MINIGRAPH_|s3148    2675    0       2675    2675    2675    60      cg:Z:2675M
id=Ref|HLA-B*82:02:01   3050    0       3050    +       id=_MINIGRAPH_|s3153    3050    0       3050    3050    3050    60      cg:Z:3050M
id=Ref|HLA-C*01:21      2895    0       2895    +       id=_MINIGRAPH_|s3162    2895    0       2895    2895    2895    60      cg:Z:2895M
id=Ref|HLA-C*02:10      2893    0       2893    +       id=_MINIGRAPH_|s3167    2893    0       2893    2893    2893    60      cg:Z:2893M
id=Ref|HLA-C*08:04:01   2895    0       2895    +       id=_MINIGRAPH_|s3256    2895    0       2895    2895    2895    60      cg:Z:2895M
id=Ref|HLA-C*12:13      3058    0       3058    +       id=_MINIGRAPH_|s3271    3058    0       3058    3058    3058    60      cg:Z:3058M
id=Ref|HLA-C*12:22      2895    0       2895    +       id=_MINIGRAPH_|s3273    2895    0       2895    2895    2895    60      cg:Z:2895M
id=Ref|HLA-C*15:13      2895    0       2895    +       id=_MINIGRAPH_|s3282    2895    0       2895    2895    2895    60      cg:Z:2895M
id=Ref|HLA-C*15:16      3066    0       3066    +       id=_MINIGRAPH_|s3283    3066    0       3066    3066    3066    60      cg:Z:3066M
id=Ref|HLA-C*16:02:01   2895    0       2895    +       id=_MINIGRAPH_|s3287    2895    0       2895    2895    2895    60      cg:Z:2895M

These are the first five headers in my output file hg38.nrs.sv.gfa.fa.gz:

>id=_MINIGRAPH_|s1
>id=_MINIGRAPH_|s2
>id=_MINIGRAPH_|s3
>id=_MINIGRAPH_|s4
>id=_MINIGRAPH_|s5

This is my seqfile.

(Ref:1.0,seq1:1.0,_MINIGRAPH_:1.0);
Ref     MyPath/hg38.fa
seq1    MyPath/seq1.fa
_MINIGRAPH_     file:///MyPath/hg38.nrs.sv.gfa.fa.gz

glennhickey commented 1 year ago

hmm, I wonder if the *'s in the names are causing problems... Are you able to share the input to cactus-align so I can try to reproduce?
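One way to test this guess locally, before sharing any data, is to scan the PAF for records that would plausibly trip the `tag_toks.size() == 3` assertion. The sketch below is a hypothetical stand-alone checker, not part of Cactus. The function name find_suspect_paf_lines is made up for illustration, and the rule it applies (every optional field after the 12 mandatory PAF columns must split into exactly three tokens on ':') is an assumption inferred from the assertion text, not taken from the gaffilter source.

```python
PAF_MANDATORY_COLUMNS = 12  # PAF has 12 fixed columns, followed by optional tags


def find_suspect_paf_lines(lines):
    """Yield (line_number, reason) for PAF records that look malformed.

    Assumption: an optional field must look like TAG:TYPE:VALUE and split
    into exactly three tokens on ':'. This mirrors the wording of the
    gaffilter assertion but was not checked against its source code.
    """
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) < PAF_MANDATORY_COLUMNS:
            yield number, f"only {len(fields)} tab-separated columns"
            continue
        for tag in fields[PAF_MANDATORY_COLUMNS:]:
            if len(tag.split(":")) != 3:
                yield number, f"optional field {tag!r} is not TAG:TYPE:VALUE"


def report(paf_path):
    """Print every suspect record found in the PAF file at paf_path."""
    with open(paf_path) as handle:
        for number, reason in find_suspect_paf_lines(handle):
            print(f"line {number}: {reason}")
```

Calling report("hg38.nrs.paf") prints nothing if every record passes this check. In that case the assertion is probably triggered by something else, and the sequence names remain the main suspect.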

dudududu12138 commented 1 year ago

Oh, thanks! The hg38.nrs.sv.gfa.fa.gz file is very large, so I can't upload it to a public site. Instead, I am sharing the reference sequence and my seq1 so that you can reproduce from the first step. Reference fasta file: Reference. seq1 fasta file: seq1.fa. Seqfile: seqfile.txt. PAF file produced in step 2: paf.

glennhickey commented 1 year ago

I will take a look. But please be aware that Cactus is not designed to work on this type of data. See here for an explanation: https://github.com/ComparativeGenomicsToolkit/cactus/blob/master/doc/pangenome.md#hprc-graph-setup-and-name-munging

As mentioned in that link, if you want to have multiple copies of the same allele, they need to be associated with different samples. This should probably be more prominently warned in the documentation. I would like to make an example of how to do this with GRCh38 but have not yet had enough time / volunteers. But for now, if you want to make an HLA pangenome, you need a separate fasta (and seqfile line) for each allele.
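The "separate fasta (and seqfile line) for each allele" step can be sketched as follows. This is only an illustration under stated assumptions, not a Cactus utility: the function names, the rule for deriving sample names (every character outside A-Z, a-z, 0-9 and _ becomes _), and the .fa output naming are all hypothetical. The FASTA headers themselves are copied unchanged, and the sketch emits only the name/path lines of a seqfile.

```python
import re
from pathlib import Path


def safe_name(header):
    """Derive a conservative sample name from a FASTA header line.

    Assumption: replacing every character outside [A-Za-z0-9_] with '_'
    is acceptable for seqfile sample names (illustrative rule only).
    """
    words = header.split()
    first_word = words[0] if words else "unnamed"
    return re.sub(r"[^A-Za-z0-9_]", "_", first_word)


def split_fasta_per_record(fasta_path, out_dir):
    """Write each record of fasta_path to its own file inside out_dir.

    Returns (sample_name, file_path) pairs in input order and raises
    ValueError if two headers map to the same sample name.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pairs, seen, current = [], set(), None
    try:
        with open(fasta_path) as source:
            for line in source:
                if line.startswith(">"):
                    name = safe_name(line[1:])
                    if name in seen:
                        raise ValueError(f"duplicate sample name: {name}")
                    seen.add(name)
                    if current is not None:
                        current.close()
                    path = out_dir / f"{name}.fa"
                    current = open(path, "w")
                    pairs.append((name, path))
                if current is not None:
                    current.write(line)  # header and sequence lines, unchanged
    finally:
        if current is not None:
            current.close()
    return pairs


def seqfile_lines(pairs):
    """Format (name, path) pairs as tab-separated seqfile lines."""
    return [f"{name}\t{path}" for name, path in pairs]
```

For a header such as >HLA-B*67:01:02 this produces the sample name HLA_B_67_01_02 and one seqfile line pointing at its own fasta file.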

dudududu12138 commented 1 year ago

Hello, sorry for the late reply. I dropped the HLA sequences and constructed the pangenome. The previously reported error is resolved, but a new error has appeared, also at the cactus-align step. My error log is here: logError.txt. To try to fix this new error, I renamed my fasta files, but that didn't work. The first five headers of my reference fasta file and my NRS fasta file are listed below:

>chr1
>chr2
>chr3
>chr4
>chr5
>NRS#1#WGC012904D_622982
>NRS#1#WGC012904D_5534358
>NRS#1#WGC012904D_9614265
>NRS#1#WGC012904D_1067906
>NRS#1#WGC012904D_5090934

Thank you for your help!

glennhickey commented 1 year ago

This error, segfault in hal2vg, seems like a bug. I'd imagine that splitting by chromosome (see the documentation) would resolve this. Still, I'd be curious to know why it's crashing. Are you able to share "out.hal" with me somehow so I can reproduce?

(you would need to rerun your failing command with --disableCaching --cleanWorkDir never --restart to save out.hal)

dudududu12138 commented 1 year ago

Hello, I followed your suggestion to get the .hal file, but it reports an error. This is my command:

singularity exec $sif cactus-align \
 --disableCaching --cleanWorkDir never --restart \
 ./jobstore seqfile.txt hg38.nrs.paf hg38.nrs.hal \
 --workDir $TEMP --logFile tmp.log \
 --pangenome --outVG --reference Ref \
 --defaultCores 40 --defaultMemory 160G --defaultDisk 200G

And the error log file is here: newlogError.txt.

By the way, what do you mean by "splitting by chromosome"? I only have two fasta files, one for GRCh38 and one for the NRS (non-reference sequence) data, and my goal is to insert the NRS sequences into GRCh38 to construct the pangenome graph. If I split GRCh38 by chromosome, should I align my NRS sequences against all the chromosomes?

glennhickey commented 1 year ago

The only way that command will work is if you mount the jobstore and workdir with singularity using -b.

For chromosome splitting, see the example https://github.com/ComparativeGenomicsToolkit/cactus/blob/master/doc/pangenome.md#yeast-splitting-by-chromosome and https://github.com/ComparativeGenomicsToolkit/cactus/blob/master/doc/pangenome.md#hprc-graph-splitting-by-chromosome

dudududu12138 commented 1 year ago

Sorry, I am confused by your advice. What does "using -b" mean? Do you mean the --binariesMode parameter of cactus-align?

glennhickey commented 1 year ago

Sorry, I meant -B. Something like singularity exec -B /home/dudu/cactus:/data cactus-align /data/jobstore --workDir /data --disableCaching --cleanWorkDir never ....
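To make the path mapping explicit: with a bind of the form host_dir:container_dir, a file under host_dir is seen inside the container under container_dir. The following is a minimal sketch of that translation. The helper name to_container_path is hypothetical, and it deliberately ignores any directories Singularity may bind by default, so treat it as an illustration of the mapping rather than a full model of the container's filesystem.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, bind_spec):
    """Map a host path to its in-container path for a single -B bind.

    bind_spec has the form host_dir[:container_dir[:options]]; when the
    container_dir part is missing, the host directory is reused.
    Raises ValueError if host_path does not live under host_dir, meaning
    this bind alone would not expose it.
    """
    parts = bind_spec.split(":")
    host_dir = parts[0]
    container_dir = parts[1] if len(parts) > 1 and parts[1] else host_dir
    relative = PurePosixPath(host_path).relative_to(PurePosixPath(host_dir))
    return str(PurePosixPath(container_dir) / relative)


# Example with the bind shown above: the jobstore on the host maps to
# /data/jobstore, which is the path to pass to cactus-align.
jobstore_in_container = to_container_path(
    "/home/dudu/cactus/jobstore", "/home/dudu/cactus:/data"
)
```

A ValueError from this helper is a hint that the jobstore or working directory sits outside the bound directory and needs its own bind.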

dudududu12138 commented 1 year ago

Hello, I followed your advice, but I still got an error message. This is my command:

singularity exec -B MyPath:/data $sif cactus-align \
 --disableCaching --cleanWorkDir never --restart \
 /data/jobstore seqfile.txt hg38.nrs.paf hg38.nrs.hal \
 --workDir /data --logFile tmp3.log \
 --pangenome --outVG --reference Ref \
 --defaultCores 40 --defaultMemory 160G --defaultDisk 200G

And this is the error message: logFile.txt.

jiadong324 commented 9 months ago

I ran into a similar error with the gaffilter command.

Here is a part of my seqfile.txt.

CHM13_h1    chr22_42590000-42655000.fa
HG00171_h1  HG00171_h1-haplotype1-0000002.fa
HG00171_h2  HG00171_h2-haplotype2-0000152.fa
HG00268_h1  HG00268_h1-haplotype1-0000005.fa
HG00268_h2  HG00268_h2-haplotype2-0000050.fa

This is the error message.

RuntimeError: Command ['singularity', '--silent', 'exec', '-u', '-B', '/tmp/d6343a4bdf455ba6bda4cfd0acfa3261/d377/8eb2/tmpdcj3uq7z:/mnt', '--pwd', '/mnt', '/hgsvc_chr22_assemblies/mc/jobStore/cactus.img', 'gaffilter', 'mg.paf.filter', '-p', '-r', '5.0', '-m', '0.0', '-b', '250000', '-q', '5', '-i', '0.5'] signaled SIGABRT: stderr=gaffilter: paf.hpp:75: PafLine parse_paf_line(const string&): Assertion `tag_toks.size() == 3' failed.

Here is the PAF file, and it looks good. mc_hgsvc_cyp2d6.paf.zip