c-zhou / yahs

Yet another Hi-C scaffolding tool
MIT License

Empty alignments_sorted.txt #62

Open spaddys opened 1 year ago

spaddys commented 1 year ago

Hello, and thanks for creating yahs!

I'm running the following command to generate the alignments file with juicer pre. It seems to run fine, but the resulting .txt file is empty.

(yahs/juicer pre yahs.out.bin yahs.out_scaffolds_final.agp hifiasm.asm.hic.hap1.p_ctg.fasta.fai | sort -k2,2d -k6,6d -T ./ --parallel=8 -S32G | awk 'NF' >alignments_sorted.txt.part) && (mv alignments_sorted.txt.part alignments_sorted.txt)

I've looked in the log file and there doesn't seem to be an error when I run it (the last few lines are included below), but it's not generating the correct file. Do you have a solution?

PRE_C_SIZE: scaffold_934 1000
PRE_C_SIZE: scaffold_935 1000
PRE_C_SIZE: scaffold_936 1000
PRE_C_SIZE: scaffold_937 1000
PRE_C_SIZE: scaffold_938 1000
[I::main_pre] Version: 1.1
[I::main_pre] CMD: juicer pre yahs.out.bin yahs.out_scaffolds_final.agp hifiasm.asm.hic.hap1.p_ctg.fasta.fai
[I::main_pre] Real time: 0.283 sec; CPU: 0.065 sec; Peak RSS: 0.001 GB

Any help is much appreciated!

c-zhou commented 1 year ago

Hello @spaddys,

It is hard to tell what the problem is. The log file shows the program finished successfully. You could try to separate the juicer pre step from the sorting step, i.e., to run yahs/juicer pre -o alignments.txt yahs.out.bin yahs.out_scaffolds_final.agp hifiasm.asm.hic.hap1.p_ctg.fasta.fai first, and see what will happen. The command should generate an alignments.txt file.

Best, Chenxi
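A minimal sketch of the split suggested above, not taken from the yahs documentation. It writes the `juicer pre` output to a file by redirecting standard output (an assumption made here instead of `-o`, since the thread does not confirm how `-o` names its output), then refuses to sort if that file has no records. `has_records` and `run_pre_then_sort` are hypothetical helper names; the input file names are the ones used in this thread.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed only if the file exists, is non-empty,
# and contains at least one line that is not purely whitespace.
has_records() {
    [ -s "$1" ] && grep -q '[^[:space:]]' "$1"
}

# Sketch of the two-step run: first juicer pre on its own, then the sort.
run_pre_then_sort() {
    # Step 1: capture juicer pre output in a file (stdout redirect, see lead-in).
    yahs/juicer pre yahs.out.bin yahs.out_scaffolds_final.agp \
        hifiasm.asm.hic.hap1.p_ctg.fasta.fai > alignments.txt || return 1

    # Stop early if nothing was written, so an empty file is not silently sorted.
    if ! has_records alignments.txt; then
        echo "juicer pre wrote no alignments; check the inputs before sorting" >&2
        return 1
    fi

    # Step 2: the same sort and filter as in the original one-liner.
    sort -k2,2d -k6,6d -T ./ --parallel=8 -S32G alignments.txt |
        awk 'NF' > alignments_sorted.txt.part &&
        mv alignments_sorted.txt.part alignments_sorted.txt
}
```

Calling `run_pre_then_sort` from the directory that holds the inputs would then show whether the empty output comes from `juicer pre` itself or from the sorting step.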

spaddys commented 1 year ago

Hello again!

I just tried running that command and the alignments.txt file was not generated and still no error message.

spaddys commented 1 year ago

Would it be helpful if I emailed you the files that I’m using to see if you can see a problem I’m not seeing?

surabhiranavat commented 1 year ago

I'm facing the same issue, and even tried what @c-zhou suggested, but my alignments.txt file is empty. Did anyone figure out a solution? I'm using YaHS v1.2.

Thanks, Surabhi

spaddys commented 1 year ago

Yes! The problem was the headers in my FASTA files. yahs requires them to have a certain format to work correctly.
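The thread does not say which header format was needed, so the following is only a generic consistency check, not the fix described above. It assumes that a mismatch in sequence names between the inputs is one way headers can go wrong, and lists contig names that appear as components in the AGP file (column 6 of lines whose component type in column 5 is `W`) but are absent from column 1 of the `.fai` index. `agp_names_missing_from_fai` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print each AGP component name (column 6, type W)
# that does not occur as a sequence name (column 1) in the .fai index.
# Usage: agp_names_missing_from_fai scaffolds.agp contigs.fa.fai
agp_names_missing_from_fai() {
    local agp="$1" fai="$2"
    awk -F'\t' '
        FILENAME == ARGV[1] { known[$1] = 1; next }   # .fai: remember names
        /^#/ { next }                                 # skip AGP comment lines
        $5 == "W" && !($6 in known) && !seen[$6]++ { print $6 }
    ' "$fai" "$agp"
}
```

With the files from this thread, a call might look like `agp_names_missing_from_fai yahs.out_scaffolds_final.agp hifiasm.asm.hic.hap1.p_ctg.fasta.fai`. Empty output only means the names agree; it does not prove the headers are in the format yahs expects.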

surabhiranavat commented 1 year ago

I just used the BAM file instead of the BIN file for juicer pre, and that works as well, so I guess my problem lies in the BIN file.

ColinR01 commented 1 year ago

Hello, I also encountered the same problem: out_JBAT.txt is empty. Before the analysis, I sorted the BAM file with 'samtools sort -@ 32 -n'. The first lines of the BAM file are:

E150019454L1C001R00100000208/1 83 ptg000001l 87754 24 90M1I59M = 100665
E150019454L1C001R00100000208/2 163 ptg000001l 100665 24 95M = 87754 52774
E150019454L1C001R00100000289/1 67 ptg000001l 4963315 18 150M = 2180360 35175
E150019454L1C001R00100000289/2 131 ptg000001l 2180360 18 30M2D50M = 4963315
E150019454L1C001R00100000566/1 99 ptg000002l 1039843 31 66M3D11M ptg000007l
E150019454L1C001R00100000566/2 147 ptg000007l 2744974 31 3M1I90M ptg000002l 1039843
E150019454L1C001R00100000611/1 67 ptg000007l 2816266 25 150M ptg000001l 4506733
E150019454L1C001R00100000611/2 131 ptg000001l 4506733 25 136M ptg000007l 2816266
E150019454L1C001R00100000735/1 99 ptg000001l 2399516 24 150M = 2404376 4891
E150019454L1C001R00100000735/2 147 ptg000001l 2404376 24 31M = 2399516 -4891
E150019454L1C001R00100000830/1 67 ptg000003l 1451348 26 136M ptg000007l 446094
E150019454L1C001R00100000830/2 131 ptg000007l 446094 26 87M ptg000003l 1451348
E150019454L1C001R00100000877/1 83 ptg000003l 3123519 24 150M = 3124686 -64519
E150019454L1C001R00100000877/2 163 ptg000003l 3124686 24 31M = 3123519 64519
E150019454L1C001R00100000967/1 67 ptg000007l 1281769 25 99M1I1M ptg000004l 2131059
E150019454L1C001R00100000967/2 131 ptg000004l 2131059 25 150M ptg000007l 1281769
E150019454L1C001R00100000999/1 115 ptg000003l 1677661 22 150M = 1681471 -61876
E150019454L1C001R00100000999/2 179 ptg000003l 1681471 22 111M = 1677661 -61876
E150019454L1C001R00100001084/1 115 ptg000008l 638702 18 64M3I4M = 641928 -62378
E150019454L1C001R00100001084/2 179 ptg000008l 641928 18 3M4D4M1I142M = 638702
E150019454L1C001R00100001154/1 99 ptg000001l 3684346 25 150M = 3685413 1114
E150019454L1C001R00100001154/2 147 ptg000001l 3685413 25 47M = 3684346 -1114
E150019454L1C001R00100001205/1 99 ptg000007l 756227 27 77M ptg000008l 4137589
E150019454L1C001R00100001205/2 147 ptg000008l 4137589 27 5M1I119M ptg000007l
E150019454L1C001R00100001221/1 67 ptg000001l 2838662 22 50M ptg000002l 961962
E150019454L1C001R00100001221/2 131 ptg000002l 961962 22 150M ptg000001l 2838662
E150019454L1C001R00100001386/1 99 ptg000005l 5002459 22 150M = 5002493 184
E150019454L1C001R00100001386/2 147 ptg000005l 5002493 22 150M = 5002459 -184

Any help is much appreciated!

ColinR01 commented 1 year ago

@c-zhou If you see the problem I encountered above, please give me some suggestions. I would appreciate it.

c-zhou commented 1 year ago

Hello @ColinR01,

Your BAM file is not correct. In a BAM file, the two reads of a pair should have identical names, i.e., without the /1 or /2 suffix seen in your case. Please see issue 47 for more details.

Best, Chenxi
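One possible way to act on this, sketched under assumptions: issue 47 may describe a different fix, and re-running the mapping so the suffixes are never written would avoid the problem at the source. The filter below removes a trailing `/1` or `/2` from the read name (column 1) of SAM records and passes header lines through unchanged. `strip_pair_suffix` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Hypothetical filter: read SAM text on stdin, drop a trailing /1 or /2
# from the read name in column 1, and write SAM text to stdout.
# Header lines (starting with @) are printed unchanged.
strip_pair_suffix() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^@/ { print; next }
         { sub(/\/[12]$/, "", $1); print }'
}
```

It could be placed between two `samtools view` calls, for example `samtools view -h in.bam | strip_pair_suffix | samtools view -b -o fixed.bam -`, where `in.bam` and `fixed.bam` are placeholder file names.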