Closed lijphd168866 closed 1 week ago
Hello @lijphd168866,
Which version of HapHiC are you using? If you are using an older version of HapHiC, please update to the new version, as I fixed an important bug on July 17th. Additionally, you can check the log file output when running `haphic refsort`. It will show the correspondence between scaffold (group) names and reference genome chromosomes. `haphic refsort` will regenerate the AGP file based on the order and orientation of the reference genome chromosomes, but it will not rename scaffolds. Therefore, you still need to rename them based on this correspondence. Lastly, the region you pointed out in your figure will not be processed by `haphic refsort` because it is within a scaffold. Please note that we have emphasized it in our documentation:
This function is NOT reference-based scaffolding and will NOT alter your scaffolds, it only changes the way of presentation through overall ordering and orientation of the entire scaffolds.
Best regards, Xiaofei
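The renaming step mentioned above is left to the user. A minimal sketch of one way to do it, assuming you have written the correspondence from the refsort log into a two-column, tab-separated file by hand: the function name `rename_fasta`, the file names, and the `Chr03`/`Chr01` labels below are all hypothetical and are not part of HapHiC.

```shell
# rename_fasta MAP FASTA
# Rewrites FASTA header IDs using a two-column, tab-separated map
# (old_name<TAB>new_name). IDs that are not in the map are left unchanged.
rename_fasta() {
    awk -F'\t' '
        NR == FNR { new[$1] = $2; next }      # first file: load the map
        /^>/ {
            split(substr($0, 2), f, /[ \t]/)  # f[1] is the sequence ID
            if (f[1] in new) sub(/^>[^ \t]+/, ">" new[f[1]])
        }
        { print }                             # headers and sequence lines alike
    ' "$1" "$2"
}

# Toy data; the group-to-chromosome pairs are made up for illustration.
demo=$(mktemp -d)
printf 'group1\tChr03\ngroup2\tChr01\n' > "$demo/map.tsv"
printf '>group1\nACGT\n>group2\nTTGA\n' > "$demo/scaffolds.fa"
rename_fasta "$demo/map.tsv" "$demo/scaffolds.fa"
```

The map is deliberately a separate, hand-checked file, so the numbers in group names are never assumed to match chromosome numbers.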
Hello, Teacher Zeng:
Thank you for replying so quickly. I am using HapHiC version 1.0.6 (update: 2024.09.10). I then checked the contents of the log file output by `haphic refsort`:
However, the correspondence between the scaffold (group) names and the reference genome chromosomes shown in the `haphic refsort` log is inconsistent with the one shown by MUMmer. My MUMmer result is as follows:
Looking forward to your reply very much!
I can see a clear "one group to one chromosome" pattern in the log, so I believe the result should be good. However, I am unsure how you converted the group names (e.g., group1, group2) to scaffold names (e.g., scaffold1, scaffold2). It is possible that the names do not correspond directly based on their numbers. For example, did you adjust the ordering of scaffolds in Juicebox? To determine whether the issue lies with `haphic refsort` or is caused by your adjustment in Juicebox, it is advisable to re-run the MUMmer alignment using the `scaffolds.fa` from the `04.build` directory.
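As a complementary cross-check (not a replacement for the MUMmer alignment suggested above), the scaffold-to-chromosome correspondence can also be tallied from a PAF file, for example one produced by aligning `scaffolds.fa` to the reference with minimap2. The sketch below is an assumption-laden illustration: the helper name `best_hits` and the toy PAF records are hypothetical. It relies only on the standard PAF columns (1 = query name, 6 = target name, 10 = number of matching bases).

```shell
# best_hits PAF
# For every query sequence, prints the target with the largest total of
# matching bases (PAF column 10) as: query<TAB>target<TAB>matching_bases
best_hits() {
    awk -F'\t' '
        { sum[$1 SUBSEP $6] += $10 }          # total matches per query/target pair
        END {
            for (k in sum) {
                split(k, p, SUBSEP)
                if (sum[k] > best[p[1]]) { best[p[1]] = sum[k]; hit[p[1]] = p[2] }
            }
            for (q in hit) printf "%s\t%s\t%d\n", q, hit[q], best[q]
        }
    ' "$1" | sort
}

# Toy PAF records, made up for illustration only.
demo=$(mktemp -d)
{
    printf 'group1\t1000\t0\t500\t+\tChr03\t5000\t100\t600\t480\t500\t60\n'
    printf 'group1\t1000\t500\t1000\t+\tChr03\t5000\t600\t1100\t490\t500\t60\n'
    printf 'group1\t1000\t0\t100\t-\tChr01\t4000\t0\t100\t90\t100\t10\n'
    printf 'group2\t800\t0\t800\t+\tChr01\t4000\t200\t1000\t790\t800\t60\n'
} > "$demo/toy.paf"
best_hits "$demo/toy.paf"
```

If the table produced this way agrees with the refsort log but not with the final assembly, the mismatch was most likely introduced after `haphic refsort`, for example during renaming or Juicebox editing.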
Closing this issue as there has been no response for two weeks.
Dear Teacher Zeng:
I ran the HapHiC scaffolding pipeline to build pseudomolecules. My script is as follows:
~/HapHiC/haphic pipeline \
    ~/Musa_analysis/06_polish/01_rundir/genome.nextpolish.fasta \
    ba.HiC.filtered.bam 11 \
    --correct_nrounds 2 --threads 48 --processes 48
"genome.nextpolish.fasta" is derived from the "hifiasm.asm.hic.pc_ctg.fa" file that I assembled with hifiasm: I purged duplicates from it (purge_dups) and then polished it (NextPolish).
I then wanted to use the reference genome of a closely related species to sort my genome. My script is as follows:
minimap2 -x asm20 Mch.genomes.chrrenamed.fa \
    ~/Musa_analysis/06_polish/01_rundir/genome.nextpolish.fasta \
    --secondary=no -t 48 -o ./ba.asm_to_ref.paf
~/Musa_analysis/07_haphic/04.build/scaffolds.raw.agp ./ba.asm_to_ref.paf
Then I used the generated 'ba.scaffolds.refsort.agp' file to modify juicebox.sh:
ln -s /home/lijia/Musa_analysis/06_polish/01_rundir/genome.nextpolish.fasta .
samtools faidx genome.nextpolish.fasta
/home/lijia/biosoft/HapHiC/scripts/../utils/juicer pre -a -q 1 -o out_JBAT_ref /home/lijia/Musa_analysis/07_haphic/ba.HiC.filtered.bam /home/lijia/Musa_analysis/07_haphic/05.refsort/ba.scaffolds.refsort.agp genome.nextpolish.fasta.fai >out_JBAT_ref.log 2>&1
(java -jar -Xmx32G /home/lijia/biosoft/HapHiC/scripts/../utils/juicer_tools.1.9.9_jcuda.0.8.jar pre out_JBAT_ref.txt out_JBAT_ref.hic.part <(cat out_JBAT_ref.log | grep PRE_C_SIZE | awk '{print $2" "$3}')) && (mv out_JBAT_ref.hic.part out_JBAT_ref.hic)
I then ran juicebox.sh to generate the .assembly and .hic files.
I did not adjust much in Juicebox. I then generated the final FASTA file for the scaffolds:
~/HapHiC/utils/juicer post -o ba.out_JBAT out_JBAT_ref.review.assembly out_JBAT_ref.liftover.agp genome.nextpolish.fasta;
In the end, I found that the genome was not sorted according to the reference chromosomes.
Could you give me some advice? Looking forward to your reply very much!
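One way to narrow down where the ordering is lost in a workflow like this is to list the scaffold (object) names of each AGP file in their order of appearance and compare the lists between steps, for instance the refsort AGP against whichever AGP is written after Juicebox. The sketch below is only an illustration: the helper name `agp_order` and the toy records are hypothetical. It relies only on the AGP convention that column 1 holds the object name and that comment lines start with `#`.

```shell
# agp_order AGP
# Prints each object (scaffold) name once, in order of first appearance.
agp_order() {
    awk -F'\t' '!/^#/ && NF > 0 && !seen[$1]++ { print $1 }' "$1"
}

# Toy AGP records, made up for illustration only.
demo=$(mktemp -d)
{
    printf '# toy AGP\n'
    printf 'group2\t1\t100\t1\tW\tctg5\t1\t100\t+\n'
    printf 'group2\t101\t200\t2\tU\t100\tscaffold\tyes\tproximity_ligation\n'
    printf 'group2\t201\t300\t3\tW\tctg7\t1\t100\t-\n'
    printf 'group1\t1\t50\t1\tW\tctg1\t1\t50\t+\n'
} > "$demo/toy.agp"
agp_order "$demo/toy.agp"
```

Running `diff` on the outputs for two AGP files shows at which step the scaffold order changed.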