tanlongzhi / dip-c

Tools to analyze Dip-C (or other 3C/Hi-C) data

Finding m independent models for further analysis? #5

Closed tarak77 closed 6 years ago

tarak77 commented 6 years ago

Hi Tan, the models look great, and a single model is good for visualization purposes. I remember that in the paper you analyzed several replicates to get the results, and nuc_dynamics has an -m option to define the number of models to find. Along the same lines, can we specify the number of independent models to generate using dip-c or hickit? Say, for each diploid single cell, I want 100 models so that I can analyze structural positioning or pairwise distance distributions between loci.

tanlongzhi commented 6 years ago

Hi Tarak,

Currently this repo does not provide an automatic function to generate replicates.

However, replicates can be easily generated by copying both impute.con.gz (the output of dip-c impute) and clean.con.gz (the output of an earlier step, dip-c clean) to many different folders, and then running all the subsequent steps separately in each folder.

When running replicates, please make sure that nuc_dynamics (the 3D modeling tool used in this repo) uses a different random seed in each replicate folder. If using the default seed of nuc_dynamics (based on time), please make sure to start different replicates at different times. This was how we did it for the Dip-C paper. Alternatively, if using a user-specified seed, please specify a different seed for each replicate folder.
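The folder setup described above could be scripted roughly as follows. This is only a sketch, not part of dip-c: the function name `make_replicates`, the `rep_NNN` folder naming, and the `seed.txt` file are all hypothetical, and the subsequent modeling steps still have to be run in each folder by hand or by your own wrapper.

```shell
#!/bin/sh
# Sketch: create N replicate folders, each holding its own copy of the two
# input files plus a distinct seed recorded in seed.txt.
# make_replicates, rep_NNN and seed.txt are hypothetical names, not dip-c features.
make_replicates() {
    src_dir=$1   # folder containing impute.con.gz and clean.con.gz
    n=$2         # number of replicates to create
    out_dir=$3   # parent folder for the replicate folders

    for i in $(seq 1 "$n"); do
        rep=$(printf '%s/rep_%03d' "$out_dir" "$i")
        mkdir -p "$rep" || return 1
        cp "$src_dir/impute.con.gz" "$src_dir/clean.con.gz" "$rep/" || return 1
        # Use the replicate index as the seed so that no two folders share one.
        echo "$i" > "$rep/seed.txt"
    done
}

# Example: make_replicates . 100 replicates
```

Recording the seed in each folder is just one way to keep replicates distinct and reproducible; how the seed is passed on to the modeling step depends on the tool version you use.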

lh3 commented 6 years ago

For hickit, -s sets the random seed. Just use something like

seq 100 | xargs -i echo hickit -s {} ... -O out-{}.3dg

to generate 100 command lines.
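Note that `xargs -i` is a GNU extension (deprecated in favor of `-I {}`) and is not available in every xargs. A plain loop that prints the same command lines might look like the sketch below. The function name `gen_hickit_cmds` is hypothetical, and the `...` is kept as a placeholder for the remaining hickit arguments, exactly as in the one-liner above.

```shell
#!/bin/sh
# Sketch: print one hickit command line per seed, 1..N.
# gen_hickit_cmds is a hypothetical helper; "..." stands for the hickit
# options and input file that you would fill in yourself.
gen_hickit_cmds() {
    n=$1   # number of replicate command lines to print
    for i in $(seq 1 "$n"); do
        printf 'hickit -s %s ... -O out-%s.3dg\n' "$i" "$i"
    done
}

# Example: gen_hickit_cmds 100 > cmds.sh
```

The commands are only printed, not executed, so they can be inspected and edited before being run or submitted to a job scheduler.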