nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

plot_pairing_portion.R issue #396

Closed kuba11 closed 3 years ago

kuba11 commented 3 years ago

Hello,

I tried analysing my data from the Hi-C EpiTect kit, but I got stuck at the 'quality_checks' step with the error: 'make: *** [/mnt/ssd/ssd_2/home/243306/hic/HiC-Pro_3.0.0/bin/../scripts//Makefile:181: hic_qc] Error 1'. Digging deeper, I realised that the issue is in the script 'plot_pairing_portion.R', at line 57: 'stopifnot(x[un.lab]+x[reported.lab]+x[allnotreported.lab]==x["Total_pairs_processed"])', meaning that the pair counts do not add up to the total number of pairs processed.

Here are the mapping statistics I got:

Total_pairs_processed: 8717364
Unmapped_pairs: 136386
Low_qual_pairs: 2240898
Unique_paired_alignments: 5196524
Multiple_pairs_alignments: 0
Pairs_with_singleton: 1143556
Low_qual_singleton: 316874
Unique_singleton_alignments: 826682
Multiple_singleton_alignments: 0
Reported_pairs: 6023206

If we subtract the reported, unreported (low-quality), and unmapped pairs from the total number of pairs (8717364), we get 8717364 - 6023206 - 2240898 - 136386 = 316874, which is exactly the number of Low_qual_singleton (a category that is not included when the reads are summed during the check). Do you know what I may have done wrong during the analysis? I was using the latest version (3.0.0) of HiC-Pro. I managed to run the test data, but it has 0 Low_qual_singleton, so this issue didn't occur there. Let me know what other information is needed.
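The arithmetic above can be reproduced with a short Python sketch. The numbers are the statistics posted in this issue. The mapping of the three terms in the check (`un.lab`, `reported.lab`, `allnotreported.lab`) to Unmapped_pairs, Reported_pairs, and Low_qual_pairs is an assumption inferred from the subtraction in this post, not read from the R script itself.

```python
# Mapping statistics as posted in this issue.
stats = {
    "Total_pairs_processed": 8717364,
    "Unmapped_pairs": 136386,
    "Low_qual_pairs": 2240898,
    "Unique_paired_alignments": 5196524,
    "Multiple_pairs_alignments": 0,
    "Pairs_with_singleton": 1143556,
    "Low_qual_singleton": 316874,
    "Unique_singleton_alignments": 826682,
    "Multiple_singleton_alignments": 0,
    "Reported_pairs": 6023206,
}

# Assumption: the check sums these three fields (inferred from the
# arithmetic in the post, not taken from plot_pairing_portion.R).
checked = (
    stats["Unmapped_pairs"]
    + stats["Reported_pairs"]
    + stats["Low_qual_pairs"]
)

# Pairs that the summed terms do not account for.
missing = stats["Total_pairs_processed"] - checked

print("sum of checked terms:", checked)
print("unaccounted pairs:", missing)
print("matches Low_qual_singleton:", missing == stats["Low_qual_singleton"])
```

With these inputs, the unaccounted remainder equals the Low_qual_singleton count, which is consistent with that category being left out of the sum.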

Thank you, Kuba

nservant commented 3 years ago

Hi Kuba, could you show me how you set the options for singletons and multiple hits in the config, please? N

kuba11 commented 3 years ago

Hello,

Thank you for your answer, here are the options:

MIN_CIS_DIST = 1000
GET_ALL_INTERACTION_CLASSES = 1
GET_PROCESS_SAM = 0
RM_SINGLETON = 0
RM_MULTI = 0
RM_DUP = 1

nservant commented 3 years ago

Hi, I think there may be a bug with RM_SINGLETON=0. Could you set it to 1 and try again? I should think about removing this option, as it does not really make sense. In any case, singletons cannot be used to build the final contact maps because, by definition, they do not reflect any interaction. So this option just lets you remove the singletons earlier in the process (when we merge the two BAM files) or later on (when we build the map), but the final result is the same. Thanks
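As a rough pre-flight check, the relevant setting could be read from the config text before launching a run. This is only a sketch: `parse_config` and `singleton_warning` are hypothetical helper names, not part of HiC-Pro, and the parsing assumes simple `KEY = VALUE` lines like the ones posted earlier in this thread.

```python
def parse_config(text):
    """Parse simple KEY = VALUE lines, skipping blanks and '#' comments."""
    options = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def singleton_warning(options):
    """Return a warning string if RM_SINGLETON is 0, otherwise None."""
    if options.get("RM_SINGLETON") == "0":
        return ("RM_SINGLETON = 0: the quality-check step was reported to "
                "fail with this setting in this thread; consider 1.")
    return None


# Options as posted in this thread.
config_text = """
MIN_CIS_DIST = 1000
GET_ALL_INTERACTION_CLASSES = 1
GET_PROCESS_SAM = 0
RM_SINGLETON = 0
RM_MULTI = 0
RM_DUP = 1
"""

options = parse_config(config_text)
print(singleton_warning(options))
```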

kuba11 commented 3 years ago

Hi, thank you, I'll check it out and let you know how the results turn out. Kuba

kuba11 commented 3 years ago

Hi, just confirming that your suggestion solved the issue. Thank you. Kuba