Closed chunlinxiao closed 1 year ago
Can you post the error log from 5-untip/untip.err?
Here it is:
cat 5-untip/untip.err
UntipRelative
Unitigify 3
Combine mappings
Combine edges
Find lengths
Fix coverage
Pop bubbles based on coverage
Unroll simple loops
Unitigify 3b
Traceback (most recent call last):
  File ".../miniconda3/lib/verkko/scripts/unitigify.py", line 101, in <module>
    if node not in belongs_to_unitig: start_unitig(">" + node, unitigs, belongs_to_unitig, edges)
  File ".../miniconda3/lib/verkko/scripts/unitigify.py", line 33, in start_unitig
    while len(edges[new_unitig[-1]]) == 1 and getone(edges[new_unitig[-1]])[1:] != new_unitig[-1][1:]:
KeyError: '>utig3a-98885'
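For context, the crash is an ordinary Python KeyError: the unitig walk indexes the edge dict with a node that has no entry. A toy sketch of the failure mode with hypothetical names (not verkko's actual code or data):

```python
# Toy illustration of the failure mode: the unitig walk follows edges
# until it reaches a node with no entry in `edges`. An unguarded
# edges[node] lookup at that point raises KeyError, as in the
# traceback; a membership check lets the walk stop cleanly instead.
edges = {">a": {">b"}}          # ">b" has no outgoing entry at all
node = ">a"
unitig = [node]
while node in edges and len(edges[node]) == 1:   # guarded lookup
    node = next(iter(edges[node]))
    unitig.append(node)

assert unitig == [">a", ">b"]   # walk stops at the graph boundary
```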
Haven't seen that before. Are you able to share the full 5-untip folder? You can see the FAQ here on how to send us data: https://canu.readthedocs.io/en/latest/faq.html#how-can-i-send-data-to-you. I may need some other folders but let's start with that one first.
The file "5-untip.tar.gz" was uploaded as suggested - thanks!
I also encountered the same error. I just installed rukki, looking forward to a solution!
This should be fixed by the commit (which auto-closed the issue). You should be able to just replace the script in your conda install and re-run. However, your assembly graph looked quite fragmented. I doubt the Hi-C phasing, at least with default parameters, will work well here. What type of input data do you have for your genome?
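For anyone else hitting this before a release includes the fix, the script swap is just a file copy; a hedged sketch ("demo_prefix" is a stand-in for the real conda prefix, e.g. ~/miniconda3, and the scaffolding lines only exist so the sketch runs on its own):

```python
# Back up the installed unitigify.py, then drop in the patched copy.
import shutil
from pathlib import Path

prefix = Path("demo_prefix")                       # e.g. Path.home() / "miniconda3"
script = prefix / "lib" / "verkko" / "scripts" / "unitigify.py"
script.parent.mkdir(parents=True, exist_ok=True)   # demo scaffolding only
script.write_text("# stand-in for the broken script\n")  # demo scaffolding only

shutil.copy(script, script.with_name("unitigify.py.bak"))  # keep a backup
# shutil.copy("patched/unitigify.py", script)              # then install the fix
```

After the swap, re-running verkko from the same output directory resumes at the failed 5-untip step.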
Thanks @skoren - the data used for verkko testing (hifi + ONT + HiC) were all from GIAB HG002.
After I reinstalled rukki, the error looks exactly the same as before. This is my 5-untip/untip.err message. Is there any better solution?
UntipRelative
Unitigify 3
Combine mappings
Combine edges
Find lengths
Fix coverage
Pop bubbles based on coverage
Unroll simple loops
Unitigify 3b
Combine mappings
Combine edges
Find lengths
Fix coverage
Unroll simple loops round 2
Unitigify 4
The error doesn't look the same; there's no error message in the untip.err file. How did you re-install? The fix isn't part of a release; you just need to patch the one python script. What was the full log of the run that didn't finish?
For my latest testing, the previous 5-untip error was gone, but now it stopped at 8-hicPipeline.
Looks like this is related to run_mashmap.err:
log: 8-hicPipeline/transform_bwa.err
jobid: 150
reason: Missing output files: 8-hicPipeline/hic_mapping.byread.output; Input files updated by another job: 8-hicPipeline/hic_to_assembly.sorted_by_read.bam
threads: 8
resources: tmpdir=/tmp, job_id=1, n_cpus=8, mem_gb=16, time_h=24
[Tue Jul 4 21:20:36 2023]
Finished job 150.
61 of 64 steps (95%) done
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-07-03T095104.968992.snakemake.log
ERROR!
Not running final consensus since no rukki paths provided!
8-hicPipeline/run_mashmap.err
mashmap: error while loading shared libraries: libmkl_rt.so.2: cannot open shared object file: No such file or directory
Yes, that's a mashmap issue with a missing math library. Is mashmap also installed from conda? Could you check mashmap --version?
I believe it was installed through the conda verkko installation (I did not install mashmap specifically).
>which mashmap
~/miniconda3/bin/mashmap
>mashmap --version
mashmap: error while loading shared libraries: libmkl_rt.so.2: cannot open shared object file: No such file or directory
Conda has been a pain recently; in this case it seems it installed an invalid version of mashmap. I'm not even sure why it's got the dependency, that's not a mashmap-included library. Not something we can fix within verkko since we're just relying on conda to solve the environment correctly and install working tools. Try uninstalling or updating to the latest mashmap version, 3.0.5, and see if it runs then.
I confirmed (see linked issue) that this shouldn't be a dependency with mashmap but may have been an issue with mashmap v3.0.4 installations in conda. So updating as I suggested above should fix your issue.
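If you want to confirm the missing library before reinstalling, the dynamic loader's view can be checked directly; a generic probe, nothing mashmap-specific (the library name comes from the error message above):

```python
# Ask the dynamic loader for the library named in mashmap's error.
# An OSError here reproduces the "cannot open shared object file"
# condition without running mashmap itself.
import ctypes

try:
    ctypes.CDLL("libmkl_rt.so.2")
    print("libmkl_rt.so.2: found by the loader")
except OSError as err:
    print("libmkl_rt.so.2:", err)
```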
Just installed mashmap (3.0.6) and re-ran, but got the following errors:
[Mon Jul 10 14:42:15 2023]
Error in rule hicPhasing:
jobid: 177
input: 8-hicPipeline/unitigs.matches, 8-hicPipeline/hic_mapping.byread.output, 8-hicPipeline/unitigs.hpc.noseq.gfa
output: 8-hicPipeline/hic.byread.compressed, 8-hicPipeline/hicverkko.colors.tsv
log: 8-hicPipeline/hic_phasing.err (check log file(s) for error details)
shell:
cd 8-hicPipeline
cat > ./hic_phasing.sh <<EOF
#!/bin/sh
set -e
~/miniconda3/lib/verkko/scripts/hicverkko.py False False .
EOF
chmod +x ./hic_phasing.sh
./hic_phasing.sh > ../8-hicPipeline/hic_phasing.err 2>&1
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-07-10T143726.119298.snakemake.log
ERROR!
tail 8-hicPipeline/hic_phasing.err
Traceback (most recent call last):
File "~/miniconda3/lib/verkko/scripts/hicverkko.py", line 8, in <module>
import cluster
File "~/miniconda3/lib/verkko/scripts/cluster.py", line 3, in <module>
import networkx as nx
ModuleNotFoundError: No module named 'networkx'
Ah that is a missing dependency in the conda package, if you install networkx with conda it should get past this error. I'll add that to the next verkko build.
"conda install networkx" should install it, but after I did, for some reason, verkko still complained that networkx could not be found - kind of weird!
Finally I used "pip install networkx --user" to install it - now it is running again (at the step with output: 6-layoutContigs/unitig-popped.layout, 6-layoutContigs/unitig-popped.layout.scfmap, 6-layoutContigs/gaps.txt).
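The mismatch above is usually an interpreter-path issue: "conda install networkx" puts the module into one environment while verkko's scripts may run under another, and "pip install --user" lands under ~/.local, which most interpreters also search. A quick probe, assuming only that you run it with the same python verkko's scripts use:

```python
# Print which interpreter is running and where networkx resolves
# from, if anywhere; find_spec returns None instead of raising when
# the module is absent.
import importlib.util
import sys

print("interpreter:", sys.executable)
spec = importlib.util.find_spec("networkx")
print("networkx   :", spec.origin if spec else "not importable here")
```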
Hope this process will go all the way this time!
Again, verkko stopped with an error (below):
The log mentioned by the message was actually NOT there (.snakemake/log/2023-07-10T163056.780339.snakemake.log).
[Wed Jul 12 11:57:00 2023]
Error in rule generateConsensus:
jobid: 36
input: 7-consensus/packages/part016.cnspack, 7-consensus/packages.tigName_to_ID.map, 7-consensus/packages.report
output: 7-consensus/packages/part016.fasta
log: 7-consensus/packages/part016.err (check log file(s) for error details)
shell:
cd 7-consensus
mkdir -p packages
cat > ./packages/part016.sh <<EOF
#!/bin/sh
set -e
~/miniconda3/lib/verkko/bin/utgcns \\
-V -V -V \\
-threads 8 \\
-import ../7-consensus/packages/part016.cnspack \\
-A ../7-consensus/packages/part016.fasta.WORKING \\
-C 2 -norealign \\
-maxcoverage 50 \\
-e 0.05 \\
-em 0.20 \\
-EM 0 \\
-l 3000 \\
-edlib \\
&& \\
mv ../7-consensus/packages/part016.fasta.WORKING ../7-consensus/packages/part016.fasta \\
&& \\
exit 0
echo ""
echo "Consensus did not finish successfully, exit code \$?."
echo ""
echo "Files in current directory:"
ls -ltr
echo ""
echo "Files in packages/:"
ls -ltr packages
exit 1
EOF
chmod +x ./packages/part016.sh
./packages/part016.sh > ../7-consensus/packages/part016.err 2>&1
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
[Wed Jul 12 13:31:08 2023]
Finished job 31.
29 of 34 steps (85%) done
[Wed Jul 12 18:23:17 2023]
Finished job 30.
30 of 34 steps (88%) done
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-07-10T163056.780339.snakemake.log
cp: cannot stat '*.fasta': No such file or directory
cp: cannot stat '*.layout': No such file or directory
Are you running this on a cluster? I'd guess this job ran out of time on the cluster. Can you post the end of part016.err?
Just running locally (not on a cluster).
>tail 7-consensus/packages/part016.err
generatePBDAG()-- read alignment: 0 failed, 97 passed.
Constructing graph
Merging graph
Calling consensus
Bye.
18996 46 45 4.59x 0 0.00x 1 1.00x
83090 124298 77 15 1.01x 0 0.00x 62 4.63x
83120 167647 61 58 4.88x 0 0.00x 3 1.55x
83140 97573 97 25 2.10x 0 0.00x 72 6.74x
That log file has no error so no idea why snakemake thinks it failed. It might have been an intermittent I/O issue on the system. What's in the 7-consensus/packages folder? You can try running verkko with --snakeopts --dry-run to see where it will resume.
The following is from the dry run ("--snakeopts --dry-run"): does this mean both part012 and part016 failed?
Launching bioconda verkko bioconda 1.4
Using snakemake 7.30.1.
Building DAG of jobs...
Nothing to be done (all requested files are present and up to date).
Launching bioconda verkko bioconda 1.4
Using snakemake 7.30.1.
Building DAG of jobs...
Job stats:
job count min threads max threads
----------------- ------- ------------- -------------
cnspath 1 1 1
combineConsensus 1 1 1
generateConsensus 2 8 8
total 4 1 8
[Thu Jul 13 11:41:20 2023]
rule generateConsensus:
input: 7-consensus/packages/part016.cnspack, 7-consensus/packages.tigName_to_ID.map, 7-consensus/packages.report
output: 7-consensus/packages/part016.fasta
log: 7-consensus/packages/part016.err
jobid: 29
reason: Missing output files: 7-consensus/packages/part016.fasta
wildcards: nnnn=016
threads: 8
resources: tmpdir=/tmp, job_id=16, n_cpus=8, mem_gb=6, time_h=24
[Thu Jul 13 11:41:20 2023]
rule generateConsensus:
input: 7-consensus/packages/part012.cnspack, 7-consensus/packages.tigName_to_ID.map, 7-consensus/packages.report
output: 7-consensus/packages/part012.fasta
log: 7-consensus/packages/part012.err
jobid: 25
reason: Missing output files: 7-consensus/packages/part012.fasta
wildcards: nnnn=012
threads: 8
resources: tmpdir=/tmp, job_id=12, n_cpus=8, mem_gb=7, time_h=24
[Thu Jul 13 11:41:20 2023]
rule combineConsensus:
input: 7-consensus/packages/part001.fasta, 7-consensus/packages/part002.fasta, 7-consensus/packages/part003.fasta, 7-consensus/packages/part004.fasta, 7-consensus/packages/part005.fasta, 7-consensus/packages/part006.fasta, 7-consensus/packages/part007.fasta, 7-consensus/packages/part008.fasta, 7-consensus/packages/part009.fasta, 7-consensus/packages/part010.fasta, 7-consensus/packages/part011.fasta, 7-consensus/packages/part012.fasta, 7-consensus/packages/part013.fasta, 7-consensus/packages/part014.fasta, 7-consensus/packages/part015.fasta, 7-consensus/packages/part016.fasta, 7-consensus/packages/part017.fasta, 7-consensus/packages/part018.fasta, 7-consensus/packages/part019.fasta, 7-consensus/packages/part020.fasta, 7-consensus/packages/part021.fasta, 7-consensus/packages/part022.fasta, 7-consensus/packages/part023.fasta, 7-consensus/packages/part024.fasta, 7-consensus/packages/part025.fasta, 7-consensus/packages/part026.fasta, 7-consensus/packages/part027.fasta, 7-consensus/packages/part028.fasta, 7-consensus/packages/part029.fasta, 7-consensus/packages.tigName_to_ID.map, 6-layoutContigs/unitig-popped.layout.scfmap, 5-untip/unitig-unrolled-unitig-unrolled-popped-unitig-normal-connected-tip.hifi-coverage.csv, 7-consensus/packages.finished, emptyfile, 5-untip/unitig-unrolled-unitig-unrolled-popped-unitig-normal-connected-tip.gfa
output: 7-consensus/unitig-popped.fasta, 7-consensus/unitig-popped.haplotype1.fasta, 7-consensus/unitig-popped.haplotype2.fasta, 7-consensus/unitig-popped.unassigned.fasta
log: 7-consensus/combineConsensus.out, 7-consensus/combineConsensus.err
jobid: 11
reason: Missing output files: 7-consensus/unitig-popped.unassigned.fasta, 7-consensus/unitig-popped.haplotype1.fasta, 7-consensus/unitig-popped.haplotype2.fasta, 7-consensus/unitig-popped.fasta; Input files updated by another job: 7-consensus/packages/part016.fasta, 7-consensus/packages/part012.fasta
resources: tmpdir=/tmp, job_id=1, n_cpus=1, mem_gb=7, time_h=4
[Thu Jul 13 11:41:20 2023]
localrule cnspath:
input: 6-layoutContigs/unitig-popped.layout, 6-layoutContigs/unitig-popped.layout.scfmap, 7-consensus/unitig-popped.fasta, 7-consensus/unitig-popped.haplotype1.fasta, 7-consensus/unitig-popped.haplotype2.fasta, 7-consensus/unitig-popped.unassigned.fasta
output: assembly.homopolymer-compressed.layout, assembly.fasta
jobid: 0
reason: Missing output files: assembly.homopolymer-compressed.layout, assembly.fasta; Input files updated by another job: 7-consensus/unitig-popped.fasta, 7-consensus/unitig-popped.unassigned.fasta, 7-consensus/unitig-popped.haplotype1.fasta, 7-consensus/unitig-popped.haplotype2.fasta
resources: tmpdir=/tmp
Job stats:
job count min threads max threads
----------------- ------- ------------- -------------
cnspath 1 1 1
combineConsensus 1 1 1
generateConsensus 2 8 8
total 4 1 8
Reasons:
(check individual jobs above for details)
input files updated by another job:
cnspath, combineConsensus
missing output files:
cnspath, combineConsensus, generateConsensus
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.
cp: cannot stat '*.fasta': No such file or directory
cp: cannot stat '*.layout': No such file or directory
Either part012 failed or didn't run. Does it have an error log in packages? If so, post the end of that log as well as the contents of the packages folder.
part012.err seems not very helpful.
>tail 7-consensus/packages/part012.err
81398 251335 83 74 5.53x 0 0.00x 9 2.43x
81454 129664 156 60 3.93x 0 0.00x 96 6.86x
81517 276679 72 70 5.17x 0 0.00x 2 1.76x
81682 290560 74 45 2.48x 0 0.00x 29 2.20x
81929 663293 33 31 1.98x 0 0.00x 2 1.32x
81942 168156 124 32 1.60x 0 0.00x 92 5.06x
82534 182164 109 29 1.34x 0 0.00x 80 4.09x
82535 182024 109 28 1.33x 0 0.00x 81 4.17x
82600 290022 71 68 2.95x 0 0.00x 3 1.61x
82789 591765 35 32 0.49x 0 0.00x 3 1.65x
Now I'm re-running the job (without dry-run) to see if it can run through this time ....
Neither one of those reported an error. I would have suggested just renaming the .fasta.WORKING files to .fasta yourself and resuming from there. I don't think anything failed during those jobs.
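That rename can be scripted; a sketch using a scratch directory (the "demo_consensus/packages" path and scaffolding lines are stand-ins — per the thread, the real files live under 7-consensus/packages/):

```python
# Strip the .WORKING suffix that utgcns leaves on outputs snakemake
# marked as failed, so a resumed run accepts them as finished.
from pathlib import Path

pkg = Path("demo_consensus/packages")        # really 7-consensus/packages
pkg.mkdir(parents=True, exist_ok=True)       # demo scaffolding only
(pkg / "part012.fasta.WORKING").touch()      # demo scaffolding only

for f in sorted(pkg.glob("part*.fasta.WORKING")):
    f.rename(f.with_suffix(""))  # part012.fasta.WORKING -> part012.fasta
```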
No luck running to the end of the pipeline, with NO obvious error messages in the following 3 error files (part012.err, part016.err, part017.err):
./packages/part017.sh > ../7-consensus/packages/part017.err 2>&1
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
.....
./packages/part016.sh > ../7-consensus/packages/part016.err 2>&1
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
....
./packages/part012.sh > ../7-consensus/packages/part012.err 2>&1
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-07-22T122822.430471.snakemake.log
The 3 fasta files were apparently generated, but the pipeline still complained and terminated.
>ls -l 7-consensus/packages/part012.fasta
-rw-rw-r-- 1 xiao2 varpipe 361420690 Jul 18 18:30 7-consensus/packages/part012.fasta
>ls -l 7-consensus/packages/part016.fasta
-rw-rw-r-- 1 xiao2 varpipe 378075971 Jul 18 17:06 7-consensus/packages/part016.fasta
>ls -l 7-consensus/packages/part017.fasta
-rw-rw-r-- 1 xiao2 varpipe 384098989 Jul 18 16:48 7-consensus/packages/part017.fasta
It's strange that no error is being reported and the job is being marked as failed by snakemake when the fasta file is generated. The fasta is only renamed from WORKING to the final name if the consensus command completed with no error, and the next thing right after that is exit 0:
&& mv ../{output.consensus}.WORKING ../{output.consensus} && exit 0
Also, I'm not clear why this is running three consensus jobs; your previous failure and dry-run reported that only two jobs had failed. This seems like some kind of snakemake weirdness/bug (wouldn't be the first time). You should be able to run verkko with --snakeopts --touch and then --snakeopts --dry-run, which should force it to use the outputs it generated and then resume the run.
I would think it's related to #166 but it's strange there is no final error or failure reported. In addition to trying the above, can you also share one of the package files (e.g. packages/part012.*)?
The latest run with your suggestion failed again (with part012/016/017) - the verkko HiC integration seems to have some unknowns that caused this run to terminate prematurely.
Just sent package-part12.tar.gz to you as before.
Thanks for looking into this.
I was able to run the partition without error and it returned 0. The number of sequences output seems to match what you got so I think your file is complete as well. I have no idea why snakemake is detecting these jobs as failed but would guess this is a snakemake bug. Have you tried the suggestion to run touch and dry-run to see if it will continue on and use your existing outputs?
Thanks @skoren.
I actually tried that before, but not for the latest run - it seemed that verkko could not get past those jobs, as it considered them "failed" - basically it just re-ran and then failed again! I'm testing touch and dry-run now and seeing that it is still generating "part012/16/17.fasta" (even though they actually finished).
Also, I just noticed that 3 core dumps are sitting under the "7-consensus" folder (below) - not sure if this is of any help?
7-consensus>ls
assembly.disconnected.fasta combined.fasta.lengths packages.finished
assembly.disconnected.ids core.142176 packages.readName_to_ID.map
assembly.fasta core.183134 packages.report
assembly.ids core.47416 packages.tigName_to_ID.map
buildPackages.err extractONT.err screen-assembly.err
buildPackages.sh extractONT.sh screen-assembly.out
combineConsensus.err ont_subset.extract unitig-popped.fasta
combineConsensus.out ont_subset.fasta.gz unitig-popped.haplotype1.fasta
combineConsensus.sh ont_subset.id unitig-popped.haplotype2.fasta
combined.fasta packages unitig-popped.unassigned.fasta
>ls -l 7-consensus/core*
Jul 17 11:41 7-consensus/core.142176
Jul 17 03:22 7-consensus/core.183134
Jul 17 03:13 7-consensus/core.47416
>gdb --core=7-consensus/core.142176 |more
...
Core was generated by `~/miniconda3/lib/verkko/bin/utgcns -V -V -V -threads 8 -import ../7-c'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000000000 in ?? ()
[Current thread is 1 (LWP 142182)]
>gdb --core=7-consensus/core.47416 |more
...
Core was generated by `~/miniconda3/lib/verkko/bin/utgcns -V -V -V -threads 8 -import ../7-c'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000000000 in ?? ()
[Current thread is 1 (LWP 50067)]
>gdb --core=7-consensus/core.183134 |more
....
Core was generated by `~/miniconda3/lib/verkko/bin/utgcns -V -V -V -threads 8 -import ../7-c'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000000000 in ?? ()
[Current thread is 1 (LWP 183499)]
Core would imply the jobs failed, at least at some point, but I would expect that to show up in the logs. In your original consensus error the -EM parameter was set to 0 but the ones you shared were correct so that could have been the source of the original core dump. Your run looks very strange as the folder also has the files combined.fasta, assembly.fasta, and even haplotype-split results which shouldn't be created unless the assembly finished consensus and moved onto the next step.
Is this a run that finished that you re-started with changed parameters? Is this the 7-consensus folder under the top-level folder or under 8-hicPipeline/final_contigs/7-consensus (assuming you're running w/hic)? Something is very off with this folder if that is the case because, like I said, many of these files shouldn't exist if the consensus partitions didn't finish. I think the multiple restarts have caused some kind of snakemake tracking issue, so I would suggest making a new folder, copying over all the [1-8]- folders to it (but not the 8-hicPipeline/final_contigs/7-consensus folder or any files named `assembly.*`), and seeing if that will run properly.
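A hedged sketch of that reset (the "verkko_out"/"fresh_run" names and scaffolding lines are stand-ins; verify paths against your own run before deleting anything):

```python
# Copy the numbered stage folders to a fresh directory, skipping any
# assembly.* files, then drop the confused HiC consensus folder.
import shutil
from pathlib import Path

src, dst = Path("verkko_out"), Path("fresh_run")
(src / "5-untip").mkdir(parents=True, exist_ok=True)  # demo scaffolding only
(src / "8-hicPipeline" / "final_contigs" / "7-consensus").mkdir(
    parents=True, exist_ok=True)                      # demo scaffolding only

def skip_assembly(_dir, names):
    # copytree ignore-callable: names returned here are NOT copied
    return [n for n in names if n.startswith("assembly.")]

for stage in src.glob("[1-8]-*"):
    shutil.copytree(stage, dst / stage.name, ignore=skip_assembly,
                    dirs_exist_ok=True)
shutil.rmtree(dst / "8-hicPipeline" / "final_contigs" / "7-consensus",
              ignore_errors=True)
```

Pointing verkko's -d at the fresh directory should then let snakemake rebuild its job tracking from scratch.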
Some updates:
I did not make any changes to the parameters myself (except those touch/dry-run).
The previous sharing was from the top level (verkko_out/7-consensus/), not verkko_out/8-hicPipeline/final_contigs/7-consensus/, which I just noticed from your message.
Now I do see some differences in the scripts of those 3 failed jobs between the top level (verkko_out/7-consensus/) and verkko_out/8-hicPipeline/final_contigs/7-consensus:
eg.
In verkko_out/7-consensus/packages/part012.sh (also in part016.sh, part017.sh)
-EM 241302 \
but in verkko_out/8-hicPipeline/final_contigs/7-consensus/packages/part012.sh (in fact, the same for all 29 jobs here)
-EM 0 \
But after I moved .fasta.WORKING to .fasta for those 3 jobs (under verkko_out/8-hicPipeline/final_contigs/7-consensus), I re-ran the pipeline and it finally ran to the end.
Ah OK, that explains why the output showed no error. The HiC folder layout is a bit confusing and we plan to standardize it to match the trio/other runs in the near future. I am pretty sure this is the same as #166 since the -EM was not 0 in the initial consensus and became 0 later. I would suggest updating the -EM to match what it was in the top-level 7-consensus folder and re-running the partitions in 8-hicPipeline/final_contigs/7-consensus/packages. After that, snakemake should redo the last steps; you may have to remove the `assembly.*` files. This change will improve the consensus quality of your final assembly and should avoid the segfault.
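That -EM edit can be applied across all the package scripts at once; a sketch (241302 is the value from this particular run's top-level part*.sh files — take yours from there; "demo_hic/packages" and the scaffolding lines are stand-ins for 8-hicPipeline/final_contigs/7-consensus/packages):

```python
# Rewrite "-EM 0" in each HiC-consensus package script to the value
# the top-level 7-consensus run used, then re-run those partitions.
from pathlib import Path

pkg = Path("demo_hic/packages")
pkg.mkdir(parents=True, exist_ok=True)           # demo scaffolding only
(pkg / "part012.sh").write_text("  -EM 0 \\\n")  # demo scaffolding only

for sh in pkg.glob("part*.sh"):
    sh.write_text(sh.read_text().replace("-EM 0 ", "-EM 241302 "))
```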
These issues should all be addressed by the v1.4.1 release. I've also made a pull request to update conda: https://github.com/bioconda/bioconda-recipes/pull/42411
Just installed the latest v1.4 using conda (with no installation issue) and ran verkko with the --hic1/--hic2 options, but encountered an error like the one below:
I added rukki (~/miniconda3/lib/verkko/bin/rukki) to PATH and re-ran, but it still failed with the same error.
Any suggestion?
Thanks