deeptools / HiCExplorer

HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
https://hicexplorer.readthedocs.org
GNU General Public License v3.0

Recommended preprocessing before HiCExplorer #370

Closed: re2srm closed this issue 5 years ago

re2srm commented 5 years ago

Hello,

Some Hi-C analysis tools (like HOMER) recommend trimming each read of the pair at the restriction site, to remove sequence that might come from the proximity ligation of other regions of the genome to the DNA fragment. Is such trimming at the restriction site (or general quality trimming) recommended before processing with HiCExplorer?

Also, are standard QC steps such as adapter trimming and PCR duplicate removal recommended before using HiCExplorer?

Thanks

LeilyR commented 5 years ago

As far as I know, hicBuildMatrix takes care of them. Check out --minDistance, --maxLibraryInsertSize, --minMappingQuality and --keepSelfCircles in the output of hicBuildMatrix -h. It also generates a QC folder with several reports.
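
For illustration, the options named above could be combined into a single call roughly as follows. This is a sketch under assumptions, not a recommended configuration: the file names are hypothetical, the numeric values are placeholders, and --samFiles, --binSize, --QCfolder and --outFileName are not mentioned in this thread but are recalled from the tool's help text. Check hicBuildMatrix -h for your installed version, which may require further arguments such as restriction-site information. The snippet only assembles and prints the command; it does not run it.

```python
import shlex

# Hypothetical file names and placeholder values; none are recommendations.
cmd = [
    "hicBuildMatrix",
    "--samFiles", "mate_R1.bam", "mate_R2.bam",  # one BAM per mate (assumed names)
    "--binSize", "10000",
    "--minDistance", "300",
    "--maxLibraryInsertSize", "1000",
    "--minMappingQuality", "15",
    "--QCfolder", "hicQC",                       # where the QC reports are written
    "--outFileName", "replicate1.h5",
]

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```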

re2srm commented 5 years ago

Thanks for the reply. My confusion is that hicBuildMatrix is run after alignment, whereas, as I understand it, any trimming should ideally happen before alignment.

joachimwolff commented 5 years ago

Hi,

What we need is that the order of the reads is never changed at any point during preprocessing. Adapter trimming can be useful to improve the mapping quality; PCR duplicates, low-quality reads and self-circles are removed by hicBuildMatrix.

Best,

Joachim
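
The read-order requirement can be checked before mapping. The sketch below is a minimal illustration and not part of HiCExplorer: it assumes the two mates are stored in separate FASTQ files and that mate names match once a trailing /1 or /2 is stripped. The function names and the sample reads are hypothetical.

```python
import io
from itertools import zip_longest


def read_names(handle):
    """Yield the read name from each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 0:  # header line of a record
            name = line.strip().split()[0].lstrip("@")
            # Drop a trailing /1 or /2 mate suffix if present.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def mates_in_same_order(r1_handle, r2_handle):
    """Return True if both files list the same reads in the same order."""
    for n1, n2 in zip_longest(read_names(r1_handle), read_names(r2_handle)):
        if n1 != n2:  # also catches a read dropped from only one file
            return False
    return True


# Tiny in-memory example standing in for two trimmed FASTQ files.
r1 = io.StringIO("@read1/1\nACGT\n+\nIIII\n@read2/1\nGGCC\n+\nIIII\n")
r2 = io.StringIO("@read1/2\nTTAA\n+\nIIII\n@read2/2\nCCGG\n+\nIIII\n")
print(mates_in_same_order(r1, r2))  # prints True
```

A trimming tool that discards a read from only one mate file would make this check return False, which is the situation the advice above is meant to avoid.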

re2srm commented 5 years ago

Great. Thanks!

re2srm commented 5 years ago

Sorry for reopening the issue, but I have one more basic question about preprocessing. I have aligned my reads and will use samtools to merge the BAM files for the technical replicates. Since you mentioned that the order of the reads should not change at any point, I was wondering whether merging these files would be a problem and whether I should have concatenated the FASTQ files before alignment.

Also, instead of merging the BAM files for the technical replicates, can I build matrices for all of them and then merge them with hicSumMatrices (or is that tool only used for merging biological replicates)?

Thanks

joachimwolff commented 5 years ago

Build the matrices and then apply hicSumMatrices.
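
Why summing per-replicate matrices gives the same counts as pooling the reads first can be seen in a toy example. The sketch below is not HiCExplorer's implementation: it assumes both replicates are binned at the same resolution on the same genome, and the contact positions and the function name `count_contacts` are hypothetical.

```python
from collections import Counter


def count_contacts(pairs, bin_size):
    """Count contacts per (bin_i, bin_j), storing only the upper triangle."""
    counts = Counter()
    for pos1, pos2 in pairs:
        b1, b2 = pos1 // bin_size, pos2 // bin_size
        counts[(min(b1, b2), max(b1, b2))] += 1
    return counts


# Hypothetical contact positions (bp) on one chromosome for two technical replicates.
rep1 = [(1_000, 25_000), (12_000, 13_000), (31_000, 2_000)]
rep2 = [(500, 29_000), (11_000, 18_000)]

# Build one matrix per replicate, then add them element-wise.
summed = count_contacts(rep1, 10_000) + count_contacts(rep2, 10_000)
# Alternatively, pool the contacts first and build a single matrix.
pooled = count_contacts(rep1 + rep2, 10_000)

print(summed == pooled)  # prints True
```

Because contact counts are additive, building each matrix separately keeps every alignment file untouched and leaves the read order within each file as the mapper produced it.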

re2srm commented 5 years ago

Thanks again. I will merge both technical and biological replicates together.