marbl / SALSA

SALSA: A tool to scaffold long read assemblies with Hi-C data
MIT License

Bad result because low quality Hi-C library? #119

Open ptranvan opened 3 years ago

ptranvan commented 3 years ago

Hi, thanks for your software. I ran SALSA, but unfortunately the result was not satisfying.

I applied the Arima mapping pipeline and got these statistics before running SALSA:

perl $STATS $REP_DIR/$REP_LABEL.bam > $REP_DIR/$REP_LABEL.bam.stats

cat $REP_DIR/$REP_LABEL.bam.stats

All     50585700
All intra       47453287
All intra 1kb   3798461
All intra 10kb  2498813
All intra 15kb  2338248
All intra 20kb  2223609
All inter       3132413

My guess is that I don't have enough PE reads in the "All intra 20kb" category. I am not sure, though: what do you think about this library? (I have a very good contig assembly of a 1.3G species: 750 contigs, BUSCO 98%.)

ghuryejay commented 3 years ago

What's the input N50 of the assembly? Also, you will want to look at "All inter", since those are the inter-contig links that will be used for scaffolding. I do agree that the result might be an artifact of a bad library.
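
To put the counts discussed above in proportion, here is a minimal Python sketch that parses the stats output quoted in the first comment and expresses each category as a fraction of all read pairs. It assumes the "label, whitespace, count" layout shown in the post; the function names parse_stats and fractions_of_total are hypothetical helpers, not part of SALSA or the Arima pipeline. It only reports proportions and makes no claim about what fraction is sufficient for scaffolding.

```python
# Stats text copied verbatim from the issue (output of the stats script).
STATS_TEXT = """\
All     50585700
All intra       47453287
All intra 1kb   3798461
All intra 10kb  2498813
All intra 15kb  2338248
All intra 20kb  2223609
All inter       3132413
"""


def parse_stats(text):
    """Map each category label to its read-pair count.

    Assumes every non-empty line is '<label><whitespace><integer>'.
    """
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # The count is the last whitespace-separated field; the rest is the label.
        label, value = line.rsplit(None, 1)
        counts[" ".join(label.split())] = int(value)
    return counts


def fractions_of_total(counts, total_key="All"):
    """Return each category's share of the total, excluding the total itself."""
    total = counts[total_key]
    return {label: n / total for label, n in counts.items() if label != total_key}


counts = parse_stats(STATS_TEXT)
fracs = fractions_of_total(counts)

# Print a small table: label, raw count, percentage of all pairs.
for label, frac in fracs.items():
    print(f"{label:<16}{counts[label]:>12,}{frac:>9.2%}")
```

With these numbers, "All inter" comes to roughly 6% of all pairs and "All intra 20kb" to roughly 4%, which are the two categories raised in the thread.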