Closed: yuxinnnnnn closed this issue 2 years ago
Greetings! It looks like I had let these simulation scripts fall behind: the command lines and output file formats have changed since we first published this work. I just pushed commit 352b3cc89b8de63a31c19e898b09d551d50ffbe1, which updates these scripts, and they should now be able to run to completion.
Hello, Thanks for the reply!
I tried the new version of the code, but a slightly different error message appeared:
(base) yuxinzhou@Lluvia-MacBook-Pro:~/Downloads/telogator/simulations$ python3 simulate_reads.py -o sim_dir/ -c default.cfg
found index /Users/yuxinzhou/Downloads/telogator/resources/t2t-telogator-ref.fa.fai
reading default.cfg...
pbsim --model_qc /Users/yuxinzhou/Downloads/telogator/resources/model_qc_clr --data-type CLR --length-mean 20000 --length-sd 200 --length-min 19000 --length-max 21000 --accuracy-mean 1.0 --accuracy-sd 0.005 --depth 40 --prefix sim_dir/sim_c-40_r-20000_a-0_m-0/sim sim_dir/sim_c-40_r-20000_a-0_m-0/ref.fa > sim_dir/sim_c-40_r-20000_a-0_m-0/log.txt 2>&1
simulating reads...
consolidating reads into single fasta...
pbmm2 align /Users/yuxinzhou/Downloads/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-0_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
dyld[7226]: Library not loaded: '@rpath/libdeflate.so'
Referenced from: '/Users/yuxinzhou/opt/anaconda3/lib/libhts.1.9.dylib'
Reason: tried: '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/bin/../lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/bin/../lib/libdeflate.so' (no such file), '/usr/local/lib/libdeflate.so' (no such file), '/usr/lib/libdeflate.so' (no such file)
[E::hts_open_format] Failed to open file "sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam" : No such file or directory
samtools view: failed to open "sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam" for reading: No such file or directory
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
using default telomere kmers.
Error: no pickles found in input dir matching prefix: tel-data
The error message is similar (the Library not loaded: '@rpath/libdeflate.so' issue), but this time I got
Error: no pickles found in input dir matching prefix: tel-data
at the very end...
Thank you for your time Yuxin
It looks like the pbmm2 command that should be generating the aln.bam files is not running. Are you able to run pbmm2 from the command line normally? There might be an issue with the conda environment it's installed in.
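A quick way to answer that question is to try launching each external tool on its own before running the full simulation. The following is a minimal sketch, not part of telogator: the helper name `tool_runs` is hypothetical, and it assumes that the tools accept a `--version` argument (which pbmm2 and samtools are expected to do).

```python
import shutil
import subprocess


def tool_runs(name, args=("--version",), timeout=30):
    """Return (ok, detail) describing whether `name` can actually be launched."""
    path = shutil.which(name)
    if path is None:
        return False, f"{name} not found on PATH"
    try:
        result = subprocess.run(
            [path, *args], capture_output=True, text=True, timeout=timeout
        )
    except OSError as exc:
        # Raised, for example, when the binary was built for another platform.
        return False, f"{path} could not be executed: {exc}"
    except subprocess.TimeoutExpired:
        return False, f"{path} did not finish within {timeout} seconds"
    if result.returncode != 0:
        # A failed dynamic library load also ends up here (non-zero exit status).
        return False, f"{path} exited with {result.returncode}: {result.stderr.strip()}"
    return True, (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    for tool in ("pbmm2", "samtools"):
        ok, detail = tool_runs(tool)
        print(f"{tool}: {'OK' if ok else 'FAILED'} ({detail})")
```

If pbmm2 is reported as failed here, the aligner installation needs fixing before the simulation script can produce any aln.bam files.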
Thank you for providing the insight.
I removed pbmm2 from conda and reinstalled it, but a different error occurred (the Library not loaded issue is gone).
pbmm2 align /Users/yuxinzhou/Documents/Telomere/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-15_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-15_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
sh: /Users/yuxinzhou/miniconda3/bin/pbmm2: cannot execute binary file
[E::hts_open_format] Failed to open file "sim_dir/sim_c-40_r-20000_a-15_m-0/aln.bam" : No such file or directory
samtools view: failed to open "sim_dir/sim_c-40_r-20000_a-15_m-0/aln.bam" for reading: No such file or directory
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
usage: merge_jobs.py [-h] -i * in_dir/ [-r [hifi]] [-t [p90]] [-cd [2000]] [-cp [15]] [-rc [2]] [-ra [3]] [-ta [max]]
[-tc treecuts.tsv] [-th [0]] [-gt tlens.tsv] [-rl [50000]] [-k plot_kmers.tsv] [-m muscle] [--pbsim]
[--tel-color-plots] [--plot-denoised-tel] [--more-tlen-plots] [--more-readlen-plots] [--nucl-consensus]
[--telogator-pickle]
merge_jobs.py: error: unrecognized arguments: -cr 1
Also, when I ran pbmm2 align sim.fa ref.fa test.bam, I got
-bash: /Users/yuxinzhou/miniconda3/bin/pbmm2: cannot execute binary file
You are right. I believe the problem is due to pbmm2.
Currently, I am using the version:
$ conda list | grep pbmm2
pbmm2 1.8.0 hdfd78af_0 bioconda/label/broken
And it seemed that only conda install -c "bioconda/label/broken" pbmm2 worked for me...
Which version of pbmm2 are you using? Would you have any suggestion for the issue?
Thanks, Yuxin
Hmmm. On Red Hat Linux I'm using the latest version of pbmm2 provided by PacBio alongside SMRTLink11 (pbmm2 version 1.8.0). On my personal computer, running OS X 10.14, I use a rather old version of pbmm2 (1.3.0). It wouldn't surprise me if they haven't caught up to the latest M1 Macs yet, such that there aren't any working pbmm2 binaries available via Anaconda. Two possible alternatives I could imagine:
(1) create a Docker container running some flavor of Linux, and install pbmm2 within the container. E.g. using https://hub.docker.com/r/continuumio/miniconda/#! as a base
(2) swap out pbmm2 for another aligner such as minimap2 or winnowmap2. I'd need to make some adjustments to the simulation script to facilitate this, but it's doable.
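To illustrate option (2), here is a rough sketch of how an equivalent minimap2 command line could be assembled. It is not what simulate_reads.py currently does: the function name `minimap2_command` and its defaults are hypothetical, the `map-pb` preset is chosen on the assumption that the simulated reads are CLR-like, and downstream telogator steps may still need the adjustments mentioned above.

```python
import shlex


def minimap2_command(ref_fa, reads_fa, out_bam, threads=6, sort_threads=3,
                     sample="sample", rg_id="movie1"):
    """Build a shell pipeline: minimap2 alignment piped into samtools sort."""
    # minimap2 expects the read group line with literal backslash-t separators.
    read_group = f"@RG\\tID:{rg_id}\\tSM:{sample}"
    align = ["minimap2", "-a", "-x", "map-pb", "-t", str(threads),
             "-R", read_group, ref_fa, reads_fa]
    # "-" tells samtools sort to read the SAM stream from standard input.
    sort = ["samtools", "sort", "-@", str(sort_threads), "-o", out_bam, "-"]
    return (" ".join(shlex.quote(a) for a in align)
            + " | "
            + " ".join(shlex.quote(a) for a in sort))


if __name__ == "__main__":
    print(minimap2_command("ref.fa", "sim.fa", "aln.bam"))
```

The thread counts mirror the `-j 6 -J 3` values from the pbmm2 command in the logs; they are illustrative rather than tuned.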
I think I do have minimap2 on my laptop.
I also tried creating a new conda env again (and did the same on another laptop), and this time I got this...
(test) yuxinzhou@Lluvia-MacBook-Pro:~/Documents/Telomere/telogator/simulations$ python simulate_reads.py -o sim_dir/ -c default.cfg
found index /Users/yuxinzhou/Documents/Telomere/telogator/resources/t2t-telogator-ref.fa.fai
reading default.cfg...
pbsim --model_qc /Users/yuxinzhou/Documents/Telomere/telogator/resources/model_qc_clr --data-type CLR --length-mean 20000 --length-sd 200 --length-min 19000 --length-max 21000 --accuracy-mean 1.0 --accuracy-sd 0.005 --depth 40 --prefix sim_dir/sim_c-40_r-20000_a-0_m-0/sim sim_dir/sim_c-40_r-20000_a-0_m-0/ref.fa > sim_dir/sim_c-40_r-20000_a-0_m-0/log.txt 2>&1
simulating reads...
consolidating reads into single fasta...
pbmm2 align /Users/yuxinzhou/Documents/Telomere/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-0_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
>|> 20221006 15:00:18.700 -|- WARN -|- CheckPositionalArgs -|- 0x2049db600|| -|- Input is FASTA. Output BAM file cannot be used for polishing with GenomicConsensus!
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
usage: merge_jobs.py [-h] -i * in_dir/ [-r [hifi]] [-t [p90]] [-cd [2000]] [-cp [15]] [-rc [2]] [-ra [3]] [-ta [max]]
[-tc treecuts.tsv] [-th [0]] [-gt tlens.tsv] [-rl [50000]] [-k plot_kmers.tsv] [-m muscle] [--pbsim]
[--tel-color-plots] [--plot-denoised-tel] [--more-tlen-plots] [--more-readlen-plots] [--nucl-consensus]
[--telogator-pickle]
merge_jobs.py: error: unrecognized arguments: -cr 1
pbsim --model_qc /Users/yuxinzhou/Documents/Telomere/telogator/resources/model_qc_clr --data-type CLR --length-mean 20000 --length-sd 200 --length-min 19000 --length-max 21000 --accuracy-mean 0.95 --accuracy-sd 0.005 --depth 40 --prefix sim_dir/sim_c-40_r-20000_a-5_m-0/sim sim_dir/sim_c-40_r-20000_a-5_m-0/ref.fa > sim_dir/sim_c-40_r-20000_a-5_m-0/log.txt 2>&1
simulating reads...
consolidating reads into single fasta...
pbmm2 align /Users/yuxinzhou/Documents/Telomere/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-5_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-5_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
>|> 20221006 15:00:22.131 -|- WARN -|- CheckPositionalArgs -|- 0x204f80600|| -|- Input is FASTA. Output BAM file cannot be used for polishing with GenomicConsensus!
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
usage: merge_jobs.py [-h] -i * in_dir/ [-r [hifi]] [-t [p90]] [-cd [2000]] [-cp [15]] [-rc [2]] [-ra [3]] [-ta [max]]
[-tc treecuts.tsv] [-th [0]] [-gt tlens.tsv] [-rl [50000]] [-k plot_kmers.tsv] [-m muscle] [--pbsim]
[--tel-color-plots] [--plot-denoised-tel] [--more-tlen-plots] [--more-readlen-plots] [--nucl-consensus]
[--telogator-pickle]
merge_jobs.py: error: unrecognized arguments: -cr 1
So maybe the problem is still due to pbmm2?
Thanks again
Did you pull the latest commit of the telogator repository? That looks like the syntax issue I corrected in commit 352b3cc89b8de63a31c19e898b09d551d50ffbe1
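One way to confirm that a local checkout includes that commit is to ask git whether it is an ancestor of the current HEAD. This is a small sketch rather than anything shipped with telogator; the helper name `checkout_contains` is hypothetical, and it assumes git is installed and that `repo_dir` points at the cloned repository.

```python
import subprocess

FIX_COMMIT = "352b3cc89b8de63a31c19e898b09d551d50ffbe1"


def checkout_contains(commit, repo_dir="."):
    """Return True if `commit` is an ancestor of (or equal to) HEAD in repo_dir."""
    try:
        result = subprocess.run(
            ["git", "-C", repo_dir, "merge-base", "--is-ancestor", commit, "HEAD"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # git itself is not installed or not on PATH.
        return False
    # Exit status 0 means "is an ancestor"; anything else (unknown commit,
    # not a repository, not an ancestor) is treated as "not contained".
    return result.returncode == 0


if __name__ == "__main__":
    if checkout_contains(FIX_COMMIT):
        print("checkout includes the fix")
    else:
        print("fix not found; try running git pull in the telogator directory")
```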
Sorry, I forgot to use the latest version.
With that, I got:
(test) yuxinzhou@Lluvia-MacBook-Pro:~/Downloads/telogator/simulations$ python simulate_reads.py -o sim_dir/ -c default.cfg
found index /Users/yuxinzhou/Downloads/telogator/resources/t2t-telogator-ref.fa.fai
reading default.cfg...
pbsim --model_qc /Users/yuxinzhou/Downloads/telogator/resources/model_qc_clr --data-type CLR --length-mean 20000 --length-sd 200 --length-min 19000 --length-max 21000 --accuracy-mean 1.0 --accuracy-sd 0.005 --depth 40 --prefix sim_dir/sim_c-40_r-20000_a-0_m-0/sim sim_dir/sim_c-40_r-20000_a-0_m-0/ref.fa > sim_dir/sim_c-40_r-20000_a-0_m-0/log.txt 2>&1
simulating reads...
consolidating reads into single fasta...
pbmm2 align /Users/yuxinzhou/Downloads/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-0_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
>|> 20221006 16:50:01.907 -|- WARN -|- CheckPositionalArgs -|- 0x205195600|| -|- Input is FASTA. Output BAM file cannot be used for polishing with GenomicConsensus!
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
using default telomere kmers.
Error: no pickles found in input dir matching prefix: tel-data
So it changed from merge_jobs.py: error: unrecognized arguments: -cr 1 to Error: no pickles found in input dir matching prefix: tel-data.
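To narrow down where the pipeline stops, it can help to check which simulation subdirectories actually contain an alignment file. The sketch below assumes the layout seen in the logs (one `sim_*` subdirectory per configuration, each expected to hold `aln.bam`); the helper name `missing_alignments` is hypothetical. Whether a missing aln.bam is the cause of the final "no pickles found" message is only a plausible reading of the logs, not something confirmed here.

```python
from pathlib import Path


def missing_alignments(sim_dir, bam_name="aln.bam"):
    """List sim_* subdirectories of sim_dir that lack a non-empty alignment file."""
    missing = []
    for sub in sorted(Path(sim_dir).glob("sim_*")):
        if not sub.is_dir():
            continue
        bam = sub / bam_name
        if not bam.is_file() or bam.stat().st_size == 0:
            missing.append(sub.name)
    return missing


if __name__ == "__main__":
    out_dir = Path("sim_dir")
    if out_dir.is_dir():
        for name in missing_alignments(out_dir):
            print(f"no usable aln.bam in {name}")
```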
Hello, I would like to let you know that the issue is solved. Thanks!
Hi, I am a macOS user on Monterey (version 12.6) with an Apple M1 Pro chip. I tried to generate the simulated reads with:
but it gives me the following error messages:
I do have libdeflate on my computer:
Or is the error due to merge_jobs.py: error: unrecognized arguments: -cr 1?
Could you give me any insight into solving this issue?
Thank you,
Yuxin