XiaoTaoWang / HiC_pipeline

An easy-to-use Hi-C data processing software supporting distributed computation.
http://xiaotaowang.github.io/HiC_pipeline/index.html
GNU General Public License v3.0

Running with multiple RE's #11

Open cbarcl01 opened 1 year ago

cbarcl01 commented 1 year ago

Hi,

I was wondering how to set up the datasets.tsv for an experiment that uses multiple restriction enzymes. Can I just separate the enzyme names with commas, e.g. SRR027956 GM06990 R1 HindIII,MseI,DdeI,DpnII?

Thanks for sharing this pipeline and tutorial - super helpful!

XiaoTaoWang commented 1 year ago

Hi, thanks for your positive feedback! The current version of runHiC does not support specifying multiple restriction enzymes in the "datasets.tsv" file. However, I agree that this would be a useful feature, and I will definitely incorporate it in the next update of runHiC.

If you don't need fragment-level filtering (for example, removal of dangling reads and self-ligation products), you can simply skip that step by leaving the "--add-frag" parameter out of your command. In this case, you can set the enzyme name to any arbitrary value for your own records:

SRR027956 GM06990 R1 cocktail

cbarcl01 commented 1 year ago

Hi @XiaoTaoWang - thanks for the speedy reply!

Ideally I do need fragment-level filtering. Do you think there is a way to run this iteratively, once per enzyme, and then combine the results? The restriction enzymes used were HindIII, MseI, DdeI, and DpnII.
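
For context on the "run per enzyme and combine" idea, here is a minimal Python sketch of an in-silico digest. It is not part of runHiC and does not use any runHiC API; the names ENZYMES, cut_sites, and fragments are hypothetical and exist only for this illustration. It assumes the standard recognition sites for the four enzymes (HindIII A^AGCTT, MseI T^TAA, DdeI C^TNAG, DpnII ^GATC) and shows that a cocktail digest defines fragments by the union of all cut sites, which is generally finer than the fragment set of any single enzyme.

```python
import re

# Recognition patterns and top-strand cut offsets for the enzymes named in
# this thread (assumed standard sites). The N in DdeI's site matches any base.
ENZYMES = {
    "HindIII": ("AAGCTT", 1),      # A^AGCTT
    "MseI": ("TTAA", 1),           # T^TAA
    "DdeI": ("CT[ACGT]AG", 1),     # C^TNAG
    "DpnII": ("GATC", 0),          # ^GATC
}


def cut_sites(seq, enzymes):
    """Return the sorted union of cut positions for all given enzymes."""
    seq = seq.upper()
    sites = set()
    for name in enzymes:
        pattern, offset = ENZYMES[name]
        # A lookahead is used so overlapping recognition sites are all found.
        for match in re.finditer(f"(?=({pattern}))", seq):
            sites.add(match.start() + offset)
    return sorted(sites)


def fragments(seq, enzymes):
    """Return (start, end) intervals of the fragments from a combined digest."""
    inner = [s for s in cut_sites(seq, enzymes) if 0 < s < len(seq)]
    bounds = [0] + inner + [len(seq)]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    toy = "CCAAGCTTCCGATCCCTTAACC"  # toy sequence, not real data
    print(fragments(toy, ["HindIII"]))
    print(fragments(toy, ["HindIII", "MseI", "DdeI", "DpnII"]))
```

Because fragment-level filters depend on which fragment each read end falls in, assigning reads against one enzyme's fragments at a time would likely not reproduce the result of a digest with all four enzymes together; a merged cut-site list like the one above is closer to what the experiment produced.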