nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

How to use HiTC to scaffold genome, general methodology after HiC-Pro #305

Closed rob123king closed 4 years ago

rob123king commented 4 years ago

I have run HiC-Pro and loaded the normalised matrix into R using HiTC. My assembly has 55 scaffolds, but there should be 24 chromosomes, so I still have a few large scaffolds to place and need to check for misassemblies. I can't see how to use the software to get anything more than some plots.

Even if it only pointed to a connection that I could inspect and join by hand, that would be fine, but I can't see where that information is. It can't just be pretty pictures.

Otherwise I can try to get Juicer working to map the "valid reads" from HiC-Pro, although I'm not sure how to extract only the valid reads. Once they are mapped with Juicer, I would go on to 3D-DNA to scaffold.

Obviously this is my first Hi-C data set. I have used Salsa2, but I assume that more advanced software will give better results.

I've performed the conversion to Juicer format and viewed the result on Windows.

rob123king commented 4 years ago

[image] This is one scaffold. Does it look like I need to break it into three? When I compare it with two others, it seems that the two big parts go elsewhere. [image] The two splits of one contig are the scaffold shown above.

nservant commented 4 years ago

HiC-Pro and HiTC were not designed to perform genome assembly, so I'm not sure they can really help you. Instead, I would move forward with a dedicated tool such as instaGRAAL. If you want to use the valid pairs, you can use the "allValidPairs" file in the data folder. Regarding your second post, I agree with all your comments, but I'm not an expert in assembly. Best
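
For anyone who wants a quick look at which scaffolds are linked before moving to a dedicated scaffolder, here is a minimal Python sketch that counts valid pairs between different scaffolds. It is not part of HiC-Pro or HiTC. It assumes the "allValidPairs" file is tab-separated with the scaffold names of the two read ends in the second and fifth columns, so please check that against your own file first. The function names and the toy records are invented for illustration only.

```python
from collections import Counter


def count_scaffold_links(lines):
    """Count valid pairs for each unordered pair of different scaffolds.

    Assumption: each line is tab-separated, with the scaffold name of
    read end 1 in column 2 and of read end 2 in column 5 (1-based).
    """
    links = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        scaf1, scaf2 = fields[1], fields[4]
        if scaf1 != scaf2:
            # Sort the names so (A, B) and (B, A) land in the same bin.
            links[tuple(sorted((scaf1, scaf2)))] += 1
    return links


def top_links(links, n=10):
    """Return the n scaffold pairs with the most supporting read pairs."""
    return links.most_common(n)


# Toy records, invented for illustration (not real data).
toy_lines = [
    "r1\tscaf1\t100\t+\tscaf2\t500\t-\t300\tf1\tf2\t40\t40\n",
    "r2\tscaf1\t200\t+\tscaf1\t900\t-\t300\tf1\tf3\t40\t40\n",
    "r3\tscaf2\t150\t-\tscaf1\t700\t+\t300\tf2\tf1\t40\t40\n",
    "r4\tscaf3\t50\t+\tscaf2\t80\t-\t300\tf4\tf2\t40\t40\n",
]
toy_links = count_scaffold_links(toy_lines)

for pair, n_pairs in top_links(toy_links):
    print(pair, n_pairs)
```

On a real file you would pass the open file handle instead of `toy_lines`, for example `count_scaffold_links(open(path))` with `path` pointing at your own "allValidPairs" file. Raw counts favour long scaffolds, so treat the ranking only as a hint about where to look, not as evidence of adjacency.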