ganlab / GALA

Long-reads Gap-free Chromosome-scale Assembler
MIT License

Step-by-Step Mode: newgenome: cut_file error #3

Closed: Ural-Yunusbaev closed this issue 3 years ago

Ural-Yunusbaev commented 4 years ago
$ cat draft_names_paths.txt
draft_1=/home/crciv/AcerChrAssemb/Pilon/Ra_assembly_Pilon_polished/Ra_assembly_Pilon_polished.fasta
draft_2=/home/crciv/AcerChrAssemb/Pilon/Flye_assembly_Pilon_polished/Flye_assembly_Pilon_polished.fasta
draft_3=/home/crciv/AcerChrAssemb/NextPolish/Acer_data/01_rundir/02.kmer_count/05.polish.ref.sh.work/genome.nextpolish.part000_part001.fasta
$ comp draft_names_paths.txt
comp 1.0.0
$ sh draft_comp.sh
[M::mm_idx_gen::4.388*1.18] collected minimizers
[M::mm_idx_gen::4.849*1.35] sorted minimizers
[M::main::4.849*1.35] loaded/built the index for 119 target sequence(s)
[M::mm_mapopt_update::5.107*1.33] mid_occ = 100
[M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 119
[M::mm_idx_stat::5.299*1.32] distinct minimizers: 19721415 (95.65% are singletons); average occurrences: 1.106; average spacing: 9.935
[M::worker_pipeline::21.596*2.28] mapped 226 sequences
[M::main] Version: 2.14-r892-dirty
[M::main] CMD: minimap2 -x asm5 /home/crciv/AcerChrAssemb/Pilon/Ra_assembly_Pilon_polished/Ra_assembly_Pilon_polished.fasta /home/crciv/AcerChrAssemb/Pilon/Flye_assembly_Pilon_polished/Flye_assembly_Pilon_polished.fasta
[M::main] Real time: 21.671 sec; CPU: 49.280 sec; Peak RSS: 1.685 GB
[M::mm_idx_gen::4.040*1.31] collected minimizers
[M::mm_idx_gen::4.543*1.49] sorted minimizers
[M::main::4.543*1.49] loaded/built the index for 119 target sequence(s)
[M::mm_mapopt_update::4.803*1.47] mid_occ = 100
[M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 119
[M::mm_idx_stat::4.994*1.45] distinct minimizers: 19721415 (95.65% are singletons); average occurrences: 1.106; average spacing: 9.935
[M::worker_pipeline::23.551*2.32] mapped 64 sequences
[M::main] Version: 2.14-r892-dirty
[M::main] CMD: minimap2 -x asm5 /home/crciv/AcerChrAssemb/Pilon/Ra_assembly_Pilon_polished/Ra_assembly_Pilon_polished.fasta /home/crciv/AcerChrAssemb/NextPolish/Acer_data/01_rundir/02.kmer_count/05.polish.ref.sh.work/genome.nextpolish.part000_part001.fasta
[M::main] Real time: 23.626 sec; CPU: 54.694 sec; Peak RSS: 1.651 GB
[M::mm_idx_gen::4.043*1.26] collected minimizers
[M::mm_idx_gen::4.497*1.43] sorted minimizers
[M::main::4.497*1.43] loaded/built the index for 226 target sequence(s)
[M::mm_mapopt_update::4.747*1.41] mid_occ = 100
[M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 226
[M::mm_idx_stat::4.936*1.39] distinct minimizers: 19752353 (96.20% are singletons); average occurrences: 1.101; average spacing: 9.938
[M::worker_pipeline::16.736*2.42] mapped 119 sequences
[M::main] Version: 2.14-r892-dirty
[M::main] CMD: minimap2 -x asm5 /home/crciv/AcerChrAssemb/Pilon/Flye_assembly_Pilon_polished/Flye_assembly_Pilon_polished.fasta /home/crciv/AcerChrAssemb/Pilon/Ra_assembly_Pilon_polished/Ra_assembly_Pilon_polished.fasta
[M::main] Real time: 16.764 sec; CPU: 40.572 sec; Peak RSS: 1.458 GB
[M::mm_idx_gen::4.037*1.26] collected minimizers
[M::mm_idx_gen::4.507*1.43] sorted minimizers
[M::main::4.507*1.43] loaded/built the index for 226 target sequence(s)
[M::mm_mapopt_update::4.781*1.41] mid_occ = 100
[M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 226
[M::mm_idx_stat::4.981*1.39] distinct minimizers: 19752353 (96.20% are singletons); average occurrences: 1.101; average spacing: 9.938
[M::worker_pipeline::22.634*2.40] mapped 64 sequences
[M::main] Version: 2.14-r892-dirty
[M::main] CMD: minimap2 -x asm5 /home/crciv/AcerChrAssemb/Pilon/Flye_assembly_Pilon_polished/Flye_assembly_Pilon_polished.fasta /home/crciv/AcerChrAssemb/NextPolish/Acer_data/01_rundir/02.kmer_count/05.polish.ref.sh.work/genome.nextpolish.part000_part001.fasta
[M::main] Real time: 22.706 sec; CPU: 54.389 sec; Peak RSS: 1.668 GB
[M::mm_idx_gen::4.049*1.32] collected minimizers
[M::mm_idx_gen::4.547*1.49] sorted minimizers
[M::main::4.547*1.49] loaded/built the index for 64 target sequence(s)
[M::mm_mapopt_update::4.818*1.47] mid_occ = 100
[M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 64
[M::mm_idx_stat::5.015*1.45] distinct minimizers: 19800975 (95.79% are singletons); average occurrences: 1.138; average spacing: 9.943
[M::worker_pipeline::15.880*2.38] mapped 119 sequences
[M::main] Version: 2.14-r892-dirty
[M::main] CMD: minimap2 -x asm5 /home/crciv/AcerChrAssemb/NextPolish/Acer_data/01_rundir/02.kmer_count/05.polish.ref.sh.work/genome.nextpolish.part000_part001.fasta /home/crciv/AcerChrAssemb/Pilon/Ra_assembly_Pilon_polished/Ra_assembly_Pilon_polished.fasta
[M::main] Real time: 15.901 sec; CPU: 37.778 sec; Peak RSS: 1.474 GB
[M::mm_idx_gen::4.095*1.32] collected minimizers
[M::mm_idx_gen::4.590*1.49] sorted minimizers
[M::main::4.590*1.49] loaded/built the index for 64 target sequence(s)
[M::mm_mapopt_update::4.891*1.46] mid_occ = 100
[M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 64
[M::mm_idx_stat::5.118*1.44] distinct minimizers: 19800975 (95.79% are singletons); average occurrences: 1.138; average spacing: 9.943
[M::worker_pipeline::19.194*2.24] mapped 226 sequences
[M::main] Version: 2.14-r892-dirty
[M::main] CMD: minimap2 -x asm5 /home/crciv/AcerChrAssemb/NextPolish/Acer_data/01_rundir/02.kmer_count/05.polish.ref.sh.work/genome.nextpolish.part000_part001.fasta /home/crciv/AcerChrAssemb/Pilon/Flye_assembly_Pilon_polished/Flye_assembly_Pilon_polished.fasta
[M::main] Real time: 19.270 sec; CPU: 43.076 sec; Peak RSS: 1.610 GB
$ ls comparison
draft_1vsdraft_2.paf  draft_1vsdraft_3.paf  draft_2vsdraft_1.paf  draft_2vsdraft_3.paf  draft_3vsdraft_1.paf  draft_3vsdraft_2.paf
$ mdm comparison 3
mdm 1.0.0
$ newgenome draft_names_paths.txt cut_folder
Traceback (most recent call last):
  File "/home/crciv/soft/gala/newgenome", line 28, in <module>
    genomes(genomes=draft,gathering=gathering,gathering_name=name,outpath=output)
  File "/home/crciv/soft/gala/src/new_genome.py", line 82, in genomes
    b=new_genome(cut_file=gathering+gathering_name+'_'+a+'_cuts.txt',old_genome=aa,out_path=outpath,name='new_'+a)
  File "/home/crciv/soft/gala/src/new_genome.py", line 11, in new_genome
    a=list(open(cut_file))
IOError: [Errno 2] No such file or directory: 'gathering/new_genome_draft_1_cuts.txt'
mawad89 commented 4 years ago

Run: newgenome draft_names_paths.txt cut_folder -f gathering

Ural-Yunusbaev commented 4 years ago
$ newgenome draft_names_paths.txt cut_folder -f gathering
Traceback (most recent call last):
  File "/home/crciv/soft/gala/newgenome", line 28, in <module>
    genomes(genomes=draft,gathering=gathering,gathering_name=name,outpath=output)
  File "/home/crciv/soft/gala/src/new_genome.py", line 82, in genomes
    b=new_genome(cut_file=gathering+gathering_name+'_'+a+'_cuts.txt',old_genome=aa,out_path=outpath,name='new_'+a)
  File "/home/crciv/soft/gala/src/new_genome.py", line 11, in new_genome
    a=list(open(cut_file))
IOError: [Errno 2] No such file or directory: 'cut_folder/gathering_draft_1_cuts.txt'
Ural-Yunusbaev commented 4 years ago

There is no cut_folder directory, so there is no gathering_draft_1_cuts.txt file in it. Also, I do not see a gathering_draft_1_cuts.txt file in the gathering folder. I think something went wrong in the previous step.

$ ls -alh
total 32K
drwxrwxr-x  4 crciv crciv  267 May 20 11:37 .
drwxrwxr-x 33 crciv crciv 4.0K May 18 18:39 ..
drwxrwxr-x  2 crciv crciv  202 May 19 16:35 comparison
-rw-rw-r--  1 crciv crciv  727 May 19 16:32 draft_comp.sh
-rw-rw-r--  1 crciv crciv  345 May 18 19:16 draft_names_paths.txt
drwxrwxr-x  2 crciv crciv  223 May 19 16:38 gathering
-rw-r--r--  1 crciv crciv    0 May 20 11:30 new_draft_names_paths.txt
$ ls -l comparison/
total 1824
-rw-rw-r-- 1 crciv crciv 216444 May 19 16:34 draft_1vsdraft_2.paf
-rw-rw-r-- 1 crciv crciv 532982 May 19 16:34 draft_1vsdraft_3.paf
-rw-rw-r-- 1 crciv crciv 113515 May 19 16:35 draft_2vsdraft_1.paf
-rw-rw-r-- 1 crciv crciv 526792 May 19 16:35 draft_2vsdraft_3.paf
-rw-rw-r-- 1 crciv crciv 166288 May 19 16:35 draft_3vsdraft_1.paf
-rw-rw-r-- 1 crciv crciv 301889 May 19 16:36 draft_3vsdraft_2.paf
$ ls -l gathering/
total 36
-rw-rw-r-- 1 crciv crciv  2426 May 19 16:38 gathering_draft_1.txt
-rw-rw-r-- 1 crciv crciv   755 May 19 16:38 gathering_draft_1_cuts.txt
-rw-rw-r-- 1 crciv crciv  2273 May 19 16:38 gathering_draft_2.txt
-rw-rw-r-- 1 crciv crciv   480 May 19 16:38 gathering_draft_2_cuts.txt
-rw-rw-r-- 1 crciv crciv 10161 May 19 16:38 gathering_draft_3.txt
-rw-rw-r-- 1 crciv crciv  4686 May 19 16:38 gathering_draft_3_cuts.txt
mawad89 commented 4 years ago

I can see gathering_draft_1_cuts.txt in the gathering directory in your listing, so your cut folder is the one named gathering. You just need to re-run the newgenome command using gathering as the cut_folder.
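To see which cut files newgenome will look for before re-running it, a small check like the one below may help. It is only a sketch, not part of GALA: the function names are hypothetical, and the path rule (cut folder, then a prefix, then `_<draft name>_cuts.txt`) is inferred from the concatenation shown in the traceback and from the file names in the gathering listing. The `name=path` format of draft_names_paths.txt is taken from the file shown at the top of this issue.

```python
import os


def read_draft_names(names_file):
    """Parse lines like 'draft_1=/path/to/assembly.fasta' into a dict."""
    drafts = {}
    with open(names_file) as handle:
        for line in handle:
            line = line.strip()
            if not line or "=" not in line:
                continue  # skip blank or malformed lines
            name, path = line.split("=", 1)
            drafts[name.strip()] = path.strip()
    return drafts


def expected_cut_files(draft_names, cut_folder, prefix="gathering"):
    """Build the paths the traceback suggests newgenome opens.

    Assumption: <cut_folder>/<prefix>_<draft name>_cuts.txt
    """
    return [
        os.path.join(cut_folder, "{}_{}_cuts.txt".format(prefix, name))
        for name in draft_names
    ]


def missing_cut_files(names_file, cut_folder, prefix="gathering"):
    """Return the expected cut files that do not exist on disk."""
    drafts = read_draft_names(names_file)
    return [
        path
        for path in expected_cut_files(drafts, cut_folder, prefix)
        if not os.path.isfile(path)
    ]
```

Calling `missing_cut_files("draft_names_paths.txt", "gathering")` from the working directory lists any cut file that is absent. An empty list means every draft has its cut file under the names assumed here, while passing a folder that does not exist (such as the literal `cut_folder`) reports all of them as missing.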

mawad89 commented 4 years ago

@Ural-Yunusbaev I haven't heard from you in a long time. Did it run correctly for you, or are you still having problems?

Ural-Yunusbaev commented 4 years ago

I failed several times and gave up. Did you update the script? If you did, I would like to try it again, and I would also like to try your toy files.

mawad89 commented 4 years ago

Hi, you can try the new commit and this toy example: just add the GALA path in run.sh and run it.

Tell me if you still have problems.

Best wishes,
Mohamed

mawad89 commented 3 years ago

Hi, any updates?