zengxiaofei / HapHiC

HapHiC: a fast, reference-independent, allele-aware scaffolding tool based on Hi-C data
https://www.nature.com/articles/s41477-024-01755-3
BSD 3-Clause "New" or "Revised" License

Haphic plot question #48

Closed majssssa closed 2 months ago

majssssa commented 3 months ago

Hi,

When I ran "/path/to/HapHiC/haphic plot out_JBAT.FINAL.agp HiC.filtered.bam", a very unusual-looking Hi-C contact map appeared (attached: contact_map.pdf). Oddly, this is not the case in Juicebox.

zengxiaofei commented 3 months ago

In some cases, haphic plot may not be able to determine a suitable color range automatically because of a strong background of inter-scaffold Hi-C signals. In such cases, please set the color range manually using the --manual_vmax parameter (e.g., you may try 0.00001 or higher in your case).

zengxiaofei commented 3 months ago

Note: If you only change the --manual_vmax parameter, you do not need to rerun haphic plot on the BAM file. You can pass the contact_matrix.pkl file in its place, which is very fast. This lets you fine-tune the --manual_vmax parameter as many times as needed until you get the desired contact map.
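A minimal sketch of that tuning loop, assuming contact_matrix.pkl was written to the working directory by an earlier BAM-based run and that it is passed in the position where the BAM file normally goes. The paths are the placeholders used in this thread, the candidate values are only illustrative, and `plot_cmd` is a hypothetical helper. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Placeholders from this thread; adjust to your own paths.
HAPHIC=/path/to/HapHiC/haphic
AGP=out_JBAT.FINAL.agp
PKL=contact_matrix.pkl   # assumed to exist from an earlier run on the BAM file

# Hypothetical helper: build the plot command for one --manual_vmax value.
plot_cmd() {
    printf '%s plot %s %s --manual_vmax %s\n' "$HAPHIC" "$AGP" "$PKL" "$1"
}

# Dry run: print one command per candidate value (values are illustrative).
for vmax in 0.00001 0.00005 0.0001; do
    plot_cmd "$vmax"
done
```

Each run may overwrite the previous output PDF, so it can help to rename the result between runs if you want to compare color ranges side by side.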

majssssa commented 3 months ago

Thank you. That problem has been solved. Another small problem is that after modification in Juicebox, there is an extra scaffold. Can the extra scaffold be left out when using haphic plot? It is Scaffold_19 in the figure below (contact_map.pdf).

zengxiaofei commented 3 months ago

We recommend not linking these unanchored contigs into a single scaffold (chrUn) as you have done, because it is not a true scaffold and could be misleading. Instead, we prefer to list these contigs separately after the true chromosomes (chr1-18). You can then easily exclude these short unanchored sequences from the contact map using the --min_len parameter.
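To choose a --min_len threshold, it can help to list the length of every sequence in the AGP file and look for the gap between the shortest chromosome and the longest unanchored contig. The sketch below does this on a toy AGP file; all names and sizes in it are invented for illustration, and it relies only on the standard AGP layout (object name in column 1, object end coordinate in column 3).

```shell
#!/bin/sh
# Toy AGP (tab-separated); names and sizes are invented for illustration.
{
    printf 'chr1\t1\t5000000\t1\tW\tctgA\t1\t5000000\t+\n'
    printf 'chr1\t5000001\t5000100\t2\tU\t100\tscaffold\tyes\tproximity_ligation\n'
    printf 'chr1\t5000101\t8000100\t3\tW\tctgB\t1\t3000000\t+\n'
    printf 'ctgC\t1\t200000\t1\tW\tctgC\t1\t200000\t+\n'
} > toy.agp

# Length of each AGP object = largest object_end (column 3) seen for it.
# Comment lines starting with '#' are skipped; output is sorted longest first.
LENGTHS=$(awk -F'\t' '!/^#/ { if ($3 + 0 > len[$1]) len[$1] = $3 + 0 }
    END { for (s in len) printf "%s\t%d\n", s, len[s] }' toy.agp | sort -k2,2nr)

echo "$LENGTHS"
```

A threshold between the two groups of lengths keeps the chromosomes and hides the unanchored contigs. The unit expected by --min_len is not stated in this thread, so check the help output of haphic plot before passing a value.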

Currently, haphic plot does not support removing specific sequences by modifying only the AGP file. This is intentional, to prevent users from inputting mismatched AGP and BAM files. If you still decide to present the unanchored sequences as chrUn, you need to filter out scaffold_19 in both the AGP file and the BAM file.
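The AGP side of that filtering could look roughly like the sketch below. It uses a toy AGP file with invented contig names, and it assumes the BAM file was aligned to the contig-level assembly, so the sequences to drop from the BAM are the contigs (component type W) that make up scaffold_19. The output file names are hypothetical.

```shell
#!/bin/sh
# Toy AGP (tab-separated); contig names and coordinates are invented.
{
    printf 'scaffold_1\t1\t1000\t1\tW\tctgA\t1\t1000\t+\n'
    printf 'scaffold_19\t1\t500\t1\tW\tctgB\t1\t500\t+\n'
    printf 'scaffold_19\t501\t600\t2\tU\t100\tscaffold\tyes\tproximity_ligation\n'
    printf 'scaffold_19\t601\t900\t3\tW\tctgC\t1\t300\t-\n'
} > toy.agp

DROP=scaffold_19

# 1) AGP without the unwanted scaffold (all of its lines, gaps included).
awk -F'\t' -v drop="$DROP" '$1 != drop' toy.agp > toy.filtered.agp

# 2) Contigs that made up that scaffold (component type W, name in column 6).
awk -F'\t' -v drop="$DROP" '$1 == drop && $5 == "W" { print $6 }' toy.agp > drop_contigs.txt

cat drop_contigs.txt
```

Read pairs touching the contigs listed in drop_contigs.txt would then also need to be removed from the BAM file, for example with samtools. That step is not shown here because the right way to do it depends on how the BAM file was produced.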

I plan to add a feature that allows users to manually specify which sequences will be included in the contact map. However, this will take some time as I am quite busy this month.

majssssa commented 3 months ago

Thank you for your reply. My problem has been solved and I have made a heat map that I am satisfied with.

zengxiaofei commented 2 months ago

I have added a new parameter, --specified_scaffolds, to haphic plot in the latest commit. You can specify the scaffolds to visualize by using something like: --specified_scaffolds "group1,group2,group3,group4"
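When there are many scaffolds, the comma-separated list can be derived from the AGP file instead of typed by hand. The sketch below is one way to do that, shown on a toy AGP file with invented names; it collects the scaffold names in file order, drops scaffold_19, and prints the resulting command rather than running it. The paths are the placeholders used earlier in this thread.

```shell
#!/bin/sh
# Toy AGP (tab-separated); names and coordinates are invented.
{
    printf 'group1\t1\t1000\t1\tW\tctgA\t1\t1000\t+\n'
    printf 'group1\t1001\t1100\t2\tU\t100\tscaffold\tyes\tproximity_ligation\n'
    printf 'group1\t1101\t1600\t3\tW\tctgB\t1\t500\t+\n'
    printf 'group2\t1\t800\t1\tW\tctgC\t1\t800\t+\n'
    printf 'scaffold_19\t1\t300\t1\tW\tctgD\t1\t300\t+\n'
} > toy.agp

HAPHIC=/path/to/HapHiC/haphic
AGP=toy.agp              # replace with out_JBAT.FINAL.agp
BAM=HiC.filtered.bam

# Unique scaffold names in file order, minus the one to hide, comma-joined.
SCAFFOLDS=$(grep -v '^#' "$AGP" | cut -f1 | awk '!seen[$0]++' \
    | grep -vx 'scaffold_19' | paste -sd, -)

# Dry run: print the command instead of executing it.
printf '%s plot %s %s --specified_scaffolds "%s"\n' "$HAPHIC" "$AGP" "$BAM" "$SCAFFOLDS"
```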