zstephens / telogator

A method for measuring chromosome-specific telomere length from long reads
GNU General Public License v3.0

Error Messages When Running "python3 simulate_reads.py -o sim_dir/ -c default.cfg" #6

Closed: yuxinnnnnn closed this issue 2 years ago

yuxinnnnnn commented 2 years ago

Hi, I am a macOS user running Monterey (version 12.6) on an Apple M1 Pro chip. I tried to generate the simulated reads with:

python3 simulate_reads.py \
-o sim_dir/ \
-c default.cfg

but it gives me the following error messages:

found index /Users/yuxinzhou/Documents/Telomere/telogator/resources/t2t-telogator-ref.fa.fai
reading default.cfg...
pbsim --model_qc /Users/yuxinzhou/Documents/Telomere/telogator/resources/model_qc_clr --data-type CLR --length-mean 20000 --length-sd 200 --length-min 19000 --length-max 21000 --accuracy-mean 1.0 --accuracy-sd 0.005 --depth 40 --prefix sim_dir/sim_c-40_r-20000_a-0_m-0/sim sim_dir/sim_c-40_r-20000_a-0_m-0/ref.fa > sim_dir/sim_c-40_r-20000_a-0_m-0/log.txt 2>&1
simulating reads...
consolidating reads into single fasta...
pbmm2 align /Users/yuxinzhou/Documents/Telomere/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-0_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
dyld[29151]: Library not loaded: '@rpath/libdeflate.so'
  Referenced from: '/Users/yuxinzhou/opt/anaconda3/lib/libhts.1.9.dylib'
  Reason: tried: '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/bin/../lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/bin/../lib/libdeflate.so' (no such file), '/usr/local/lib/libdeflate.so' (no such file), '/usr/lib/libdeflate.so' (no such file)
[E::hts_open_format] Failed to open file "sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam" : No such file or directory
samtools view: failed to open "sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam" for reading: No such file or directory
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
usage: merge_jobs.py [-h] -i * in_dir/ [-r [hifi]] [-t [p90]] [-cd [2000]] [-cp [15]] [-rc [2]] [-ra [3]] [-ta [max]] [-tc treecuts.tsv] [-th [0]] [-gt tlens.tsv] [-rl [50000]]
                     [-k plot_kmers.tsv] [-m muscle] [--pbsim] [--tel-color-plots] [--plot-denoised-tel] [--more-tlen-plots] [--more-readlen-plots] [--nucl-consensus]
                     [--telogator-pickle]
merge_jobs.py: error: unrecognized arguments: -cr 1

I do have libdeflate on my computer:

$ conda list | grep libdeflate
libdeflate                1.8                  h9ed2024_5  
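The dyld message above shows the loader looking for a file named exactly libdeflate.so, whereas shared libraries on macOS usually carry a .dylib suffix, so a package can be installed while that exact filename is still absent. A minimal sketch for listing what is actually present, assuming the library directory is the one named in the error; the helper name find_lib_files is hypothetical and not part of Telogator or conda:

```python
from pathlib import Path


def find_lib_files(lib_dir, stem="libdeflate"):
    """Return the sorted names of files in lib_dir whose name starts with stem."""
    directory = Path(lib_dir).expanduser()
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob(stem + "*"))


if __name__ == "__main__":
    # Directory taken from the dyld error in this thread; adjust for your install.
    names = find_lib_files("~/opt/anaconda3/lib")
    print("found:", names or "nothing")
    # dyld asked for this exact filename, so check for it specifically.
    print("exact match for libdeflate.so:", "libdeflate.so" in names)
```

If only .dylib variants show up, the installed libhts build is asking for a filename that this libdeflate package does not provide, which points at a mismatched pair of conda packages rather than a missing install.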

Or is the error caused by "merge_jobs.py: error: unrecognized arguments: -cr 1"?

Could you give me any insight into solving this issue?

Thank you,

Yuxin

zstephens commented 2 years ago

Greetings! It looks like I had let these simulation scripts fall behind, as the command lines and output file formats have changed somewhat since we first published this work. I just pushed commit 352b3cc89b8de63a31c19e898b09d551d50ffbe1, which updates these scripts, and they should now be able to run to completion.

yuxinnnnnn commented 2 years ago

Hello, thanks for the reply!

I tried the new version of the code, but a slightly different error message appeared:

(base) yuxinzhou@Lluvia-MacBook-Pro:~/Downloads/telogator/simulations$ python3 simulate_reads.py -o sim_dir/ -c default.cfg 
found index /Users/yuxinzhou/Downloads/telogator/resources/t2t-telogator-ref.fa.fai
reading default.cfg...
pbsim --model_qc /Users/yuxinzhou/Downloads/telogator/resources/model_qc_clr --data-type CLR --length-mean 20000 --length-sd 200 --length-min 19000 --length-max 21000 --accuracy-mean 1.0 --accuracy-sd 0.005 --depth 40 --prefix sim_dir/sim_c-40_r-20000_a-0_m-0/sim sim_dir/sim_c-40_r-20000_a-0_m-0/ref.fa > sim_dir/sim_c-40_r-20000_a-0_m-0/log.txt 2>&1
simulating reads...
consolidating reads into single fasta...
pbmm2 align /Users/yuxinzhou/Downloads/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-0_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
dyld[7226]: Library not loaded: '@rpath/libdeflate.so'
  Referenced from: '/Users/yuxinzhou/opt/anaconda3/lib/libhts.1.9.dylib'
  Reason: tried: '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/bin/../lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/lib/libdeflate.so' (no such file), '/Users/yuxinzhou/opt/anaconda3/bin/../lib/libdeflate.so' (no such file), '/usr/local/lib/libdeflate.so' (no such file), '/usr/lib/libdeflate.so' (no such file)
[E::hts_open_format] Failed to open file "sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam" : No such file or directory
samtools view: failed to open "sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam" for reading: No such file or directory
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
using default telomere kmers.
Error: no pickles found in input dir matching prefix: tel-data

The error message for the Library not loaded: '@rpath/libdeflate.so' issue is similar, but this time I got

Error: no pickles found in input dir matching prefix: tel-data

at the very end...

Thank you for your time, Yuxin

zstephens commented 2 years ago

It looks like the pbmm2 command that should be generating the aln.bam files is not running. Are you able to run pbmm2 from the command line normally? There might be an issue with the conda environment it's installed in.
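One way to answer that question before re-running the whole simulation is to check that the tool resolves on PATH and can actually be executed. The following is only a sketch: the helper name check_tool is hypothetical, and it assumes the tool accepts a --version flag, which may not hold for every program.

```python
import shutil
import subprocess


def check_tool(name, version_flag="--version"):
    """Return (path, ok, message) describing whether a command-line tool runs."""
    path = shutil.which(name)
    if path is None:
        return None, False, "not found on PATH"
    try:
        proc = subprocess.run(
            [path, version_flag], capture_output=True, text=True, timeout=30
        )
    except OSError as exc:
        # Raised, for example, when the file is a binary for another platform.
        return path, False, str(exc)
    message = (proc.stdout or proc.stderr).strip()
    return path, proc.returncode == 0, message


if __name__ == "__main__":
    path, ok, message = check_tool("pbmm2")
    print("pbmm2:", path, "OK" if ok else "FAILED", message)
```

A failure here means the aligner itself is broken, so the later "Error reading -i input file" lines follow from the missing aln.bam rather than being separate problems.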

yuxinnnnnn commented 2 years ago

Thank you for the insight. I removed pbmm2 from conda and reinstalled it, but a different error occurred (the Library not loaded issue is gone).

pbmm2 align /Users/yuxinzhou/Documents/Telomere/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-15_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-15_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
sh: /Users/yuxinzhou/miniconda3/bin/pbmm2: cannot execute binary file
[E::hts_open_format] Failed to open file "sim_dir/sim_c-40_r-20000_a-15_m-0/aln.bam" : No such file or directory
samtools view: failed to open "sim_dir/sim_c-40_r-20000_a-15_m-0/aln.bam" for reading: No such file or directory
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
usage: merge_jobs.py [-h] -i * in_dir/ [-r [hifi]] [-t [p90]] [-cd [2000]] [-cp [15]] [-rc [2]] [-ra [3]] [-ta [max]]
                     [-tc treecuts.tsv] [-th [0]] [-gt tlens.tsv] [-rl [50000]] [-k plot_kmers.tsv] [-m muscle] [--pbsim]
                     [--tel-color-plots] [--plot-denoised-tel] [--more-tlen-plots] [--more-readlen-plots] [--nucl-consensus]
                     [--telogator-pickle]
merge_jobs.py: error: unrecognized arguments: -cr 1

Also, when I ran pbmm2 align sim.fa ref.fa test.bam, I got

-bash: /Users/yuxinzhou/miniconda3/bin/pbmm2: cannot execute binary file

You are right. I believe the problem is due to pbmm2.
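A "cannot execute binary file" message often means the file was built for a different platform than the one trying to run it, although the thread does not confirm that this is the cause here. A rough way to check is to read the first bytes of the binary and compare them with well-known magic numbers. This is a sketch only; the function name binary_format is hypothetical and the table covers just a few common formats.

```python
import shutil

# Well-known leading bytes of executable formats, as they appear on disk.
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF (Linux and most other Unix systems)",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit (macOS)",
    b"\xce\xfa\xed\xfe": "Mach-O 32-bit (macOS)",
    # Note: Java class files start with the same four bytes.
    b"\xca\xfe\xba\xbe": "Mach-O universal binary (macOS)",
}


def binary_format(path):
    """Guess the executable format of a file from its first four bytes."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    if head[:2] == b"#!":
        return "script with interpreter line"
    return MAGIC_NUMBERS.get(head, "unknown")


if __name__ == "__main__":
    pbmm2_path = shutil.which("pbmm2")
    if pbmm2_path is None:
        print("pbmm2 not found on PATH")
    else:
        print(pbmm2_path, "->", binary_format(pbmm2_path))
```

An ELF result on a Mac would indicate that the installed package contains a Linux build, which macOS cannot run natively.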

Currently, I am using the version:

$ conda list | grep pbmm2
pbmm2                     1.8.0                hdfd78af_0    bioconda/label/broken

And it seemed that only conda install -c "bioconda/label/broken" pbmm2 worked for me...

Which version of pbmm2 are you using? Do you have any suggestions for this issue?

Thanks, Yuxin

zstephens commented 2 years ago

Hmmm. On Red Hat Linux I'm using the latest version of pbmm2 provided by PacBio alongside SMRTLink11 (pbmm2 version 1.8.0). On my personal computer, running OS X 10.14, I use a rather old version of pbmm2 (1.3.0). It wouldn't surprise me if they haven't caught up to the latest M1 Macs yet, such that there aren't any working pbmm2 binaries available via Anaconda. Two possible alternatives I can imagine:

(1) Create a Docker container running some flavor of Linux, and install pbmm2 within the container, e.g. using https://hub.docker.com/r/continuumio/miniconda/#! as a base.

(2) Swap out pbmm2 for another aligner such as minimap2 or winnowmap2. I'd need to make some adjustments to the simulation script to facilitate this, but it's doable.
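To give a rough idea of what option (2) might involve, here is a sketch of building a minimap2 plus samtools sort pipeline that writes a sorted BAM in place of the pbmm2 align step. It is not the change the maintainer describes: the function names are hypothetical, the map-pb preset and read-group string are assumptions chosen to mirror the pbmm2 command in the logs, and whether the rest of the simulation scripts accept the resulting BAM is not confirmed in this thread.

```python
import subprocess


def build_minimap2_pipeline(ref_fa, reads_fa, out_bam, threads=6, sort_threads=3):
    """Return (align_cmd, sort_cmd) argument lists for minimap2 | samtools sort."""
    align_cmd = [
        "minimap2", "-a",            # -a: emit SAM instead of PAF
        "-x", "map-pb",              # preset for PacBio CLR reads (assumption)
        "-t", str(threads),
        "-R", r"@RG\tID:movie1\tSM:sample",
        ref_fa, reads_fa,
    ]
    # The trailing "-" tells samtools sort to read SAM from standard input.
    sort_cmd = ["samtools", "sort", "-@", str(sort_threads), "-o", out_bam, "-"]
    return align_cmd, sort_cmd


def run_pipeline(align_cmd, sort_cmd):
    """Pipe the aligner into the sorter and return both exit codes."""
    aligner = subprocess.Popen(align_cmd, stdout=subprocess.PIPE)
    sorter = subprocess.Popen(sort_cmd, stdin=aligner.stdout)
    aligner.stdout.close()  # let the aligner see a closed pipe if the sorter exits
    sort_rc = sorter.wait()
    align_rc = aligner.wait()
    return align_rc, sort_rc


if __name__ == "__main__":
    # Only prints the commands; call run_pipeline() once both tools are installed.
    align_cmd, sort_cmd = build_minimap2_pipeline("ref.fa", "sim.fa", "aln.bam")
    print(" ".join(align_cmd), "|", " ".join(sort_cmd))
```

Keeping command construction separate from execution makes it easy to print and inspect the exact command line, much as simulate_reads.py does in the logs above.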

yuxinnnnnn commented 2 years ago

I think I do have minimap2 on my laptop.

I also tried again in a new conda env (as well as on another laptop), and this time I got this...

(test) yuxinzhou@Lluvia-MacBook-Pro:~/Documents/Telomere/telogator/simulations$ python simulate_reads.py -o sim_dir/ -c default.cfg
found index /Users/yuxinzhou/Documents/Telomere/telogator/resources/t2t-telogator-ref.fa.fai
reading default.cfg...
pbsim --model_qc /Users/yuxinzhou/Documents/Telomere/telogator/resources/model_qc_clr --data-type CLR --length-mean 20000 --length-sd 200 --length-min 19000 --length-max 21000 --accuracy-mean 1.0 --accuracy-sd 0.005 --depth 40 --prefix sim_dir/sim_c-40_r-20000_a-0_m-0/sim sim_dir/sim_c-40_r-20000_a-0_m-0/ref.fa > sim_dir/sim_c-40_r-20000_a-0_m-0/log.txt 2>&1
simulating reads...
consolidating reads into single fasta...
pbmm2 align /Users/yuxinzhou/Documents/Telomere/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-0_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
>|> 20221006 15:00:18.700 -|- WARN -|- CheckPositionalArgs -|- 0x2049db600|| -|- Input is FASTA. Output BAM file cannot be used for polishing with GenomicConsensus!
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
usage: merge_jobs.py [-h] -i * in_dir/ [-r [hifi]] [-t [p90]] [-cd [2000]] [-cp [15]] [-rc [2]] [-ra [3]] [-ta [max]]
                     [-tc treecuts.tsv] [-th [0]] [-gt tlens.tsv] [-rl [50000]] [-k plot_kmers.tsv] [-m muscle] [--pbsim]
                     [--tel-color-plots] [--plot-denoised-tel] [--more-tlen-plots] [--more-readlen-plots] [--nucl-consensus]
                     [--telogator-pickle]
merge_jobs.py: error: unrecognized arguments: -cr 1
pbsim --model_qc /Users/yuxinzhou/Documents/Telomere/telogator/resources/model_qc_clr --data-type CLR --length-mean 20000 --length-sd 200 --length-min 19000 --length-max 21000 --accuracy-mean 0.95 --accuracy-sd 0.005 --depth 40 --prefix sim_dir/sim_c-40_r-20000_a-5_m-0/sim sim_dir/sim_c-40_r-20000_a-5_m-0/ref.fa > sim_dir/sim_c-40_r-20000_a-5_m-0/log.txt 2>&1
simulating reads...
consolidating reads into single fasta...
pbmm2 align /Users/yuxinzhou/Documents/Telomere/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-5_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-5_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
>|> 20221006 15:00:22.131 -|- WARN -|- CheckPositionalArgs -|- 0x204f80600|| -|- Input is FASTA. Output BAM file cannot be used for polishing with GenomicConsensus!
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
usage: merge_jobs.py [-h] -i * in_dir/ [-r [hifi]] [-t [p90]] [-cd [2000]] [-cp [15]] [-rc [2]] [-ra [3]] [-ta [max]]
                     [-tc treecuts.tsv] [-th [0]] [-gt tlens.tsv] [-rl [50000]] [-k plot_kmers.tsv] [-m muscle] [--pbsim]
                     [--tel-color-plots] [--plot-denoised-tel] [--more-tlen-plots] [--more-readlen-plots] [--nucl-consensus]
                     [--telogator-pickle]
merge_jobs.py: error: unrecognized arguments: -cr 1

So maybe the problem is still due to pbmm2?

Thanks again

zstephens commented 2 years ago

Did you pull the latest commit of the telogator repository? That looks like the syntax issue I corrected in commit 352b3cc89b8de63a31c19e898b09d551d50ffbe1

yuxinnnnnn commented 2 years ago

Sorry, I forgot to use the latest version.

With that, I got:

(test) yuxinzhou@Lluvia-MacBook-Pro:~/Downloads/telogator/simulations$ python simulate_reads.py -o sim_dir/ -c default.cfg
found index /Users/yuxinzhou/Downloads/telogator/resources/t2t-telogator-ref.fa.fai
reading default.cfg...
pbsim --model_qc /Users/yuxinzhou/Downloads/telogator/resources/model_qc_clr --data-type CLR --length-mean 20000 --length-sd 200 --length-min 19000 --length-max 21000 --accuracy-mean 1.0 --accuracy-sd 0.005 --depth 40 --prefix sim_dir/sim_c-40_r-20000_a-0_m-0/sim sim_dir/sim_c-40_r-20000_a-0_m-0/ref.fa > sim_dir/sim_c-40_r-20000_a-0_m-0/log.txt 2>&1
simulating reads...
consolidating reads into single fasta...
pbmm2 align /Users/yuxinzhou/Downloads/telogator/resources/t2t-telogator-ref.fa sim_dir/sim_c-40_r-20000_a-0_m-0/sim.fa sim_dir/sim_c-40_r-20000_a-0_m-0/aln.bam --preset SUBREAD --sample sample --rg '@RG\tID:movie1' -j 6 -J 3 --sort
aligning reads with pbmm2...
>|> 20221006 16:50:01.907 -|- WARN -|- CheckPositionalArgs -|- 0x205195600|| -|- Input is FASTA. Output BAM file cannot be used for polishing with GenomicConsensus!
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
Error reading -i input file
exit codes: [1, 1, 1, 1, 1, 1]
using default telomere kmers.
Error: no pickles found in input dir matching prefix: tel-data

So it changed from merge_jobs.py: error: unrecognized arguments: -cr 1 to Error: no pickles found in input dir matching prefix: tel-data.

yuxinnnnnn commented 2 years ago

Hello, I would like to let you know that the issue is solved. Thanks!