PouletAxel / SIP

SIP: Significant Interaction Peak caller
GNU General Public License v3.0

Explanation of the isDroso = True Parameter? #13

Closed: gdolsten closed this issue 3 years ago

gdolsten commented 3 years ago

Hi, is there an explanation of the isDroso=True parameter anywhere? I haven't been able to find it in the paper.

PouletAxel commented 3 years ago

Hi, yes, the explanation is in the wiki:

-isDroso: Default is false. Set this option to apply a specific filter due to looping characteristics in D. mel. You can use this parameter if your Hi-C map is similar to a Drosophila one, where loops do not show the same decay as in human cells. Setting this parameter removes the regional enrichment filter.

Best, Axel

gdolsten commented 3 years ago

Would you be able to provide a short description of how this works under the hood? Also, would it be possible for you to provide links to the Drosophila data used in the paper? Is it taken from Rowley 2019?

jordrow commented 3 years ago

The default action of SIP is to require that the central pixel be the highest, that the signal 1 pixel away be lower, and that the signal 2 pixels away be lower still (using the average at each Manhattan distance). Essentially, center > 1 away > 2 away. isDroso eliminates this filter and simply requires that the center be the highest pixel. I apologize if this wasn't clear. To facilitate your use, here's a link to the combined Drosophila .hic file: https://www.dropbox.com/s/4kuk1a8eu4xxc67/Kc_allcombined.hic?dl=0. Alternatively, you can find the reference for the Drosophila Hi-C maps in the supplement: "Loops called in Drosophila cells used Kc167 Hi-C maps combined from GSE80702 (Cubeñas-Potts et al. 2016) and GSE89112 (Eagen et al. 2017) genome build dm6". I believe these can also be found on the Juicebox repository.
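For readers who want to see the filter described above in concrete terms, here is a minimal Python sketch. It is not taken from SIP's source; the function names (`ring_values`, `passes_peak_filter`), the `is_droso` argument, and the choice to compare the center against every pixel within Manhattan distance 2 are all assumptions made for illustration.

```python
import numpy as np


def ring_values(matrix, row, col, distance):
    """Pixels at exactly `distance` Manhattan steps from (row, col).

    Hypothetical helper: positions falling outside the matrix are skipped.
    """
    values = []
    for dr in range(-distance, distance + 1):
        dc_abs = distance - abs(dr)
        # A set avoids visiting the same pixel twice when dc_abs == 0.
        for dc in {-dc_abs, dc_abs}:
            r, c = row + dr, col + dc
            if 0 <= r < matrix.shape[0] and 0 <= c < matrix.shape[1]:
                values.append(float(matrix[r, c]))
    return values


def passes_peak_filter(matrix, row, col, is_droso=False):
    """Sketch of the check described in the thread (assumed details).

    Both modes: the center must exceed every pixel within distance 2.
    Default mode also requires center > mean(1 away) > mean(2 away).
    """
    center = float(matrix[row, col])
    ring1 = ring_values(matrix, row, col, 1)
    ring2 = ring_values(matrix, row, col, 2)

    # Assumption: "highest pixel" is judged within Manhattan distance 2.
    if center <= max(ring1 + ring2):
        return False
    if is_droso:
        return True  # decay requirement skipped
    return bool(center > np.mean(ring1) > np.mean(ring2))


# Signal decays with distance from the center: 10 > 5 > 2.
decaying = np.array([
    [1, 1, 2, 1, 1],
    [1, 2, 5, 2, 1],
    [2, 5, 10, 5, 2],
    [1, 2, 5, 2, 1],
    [1, 1, 2, 1, 1],
])

# Center is the maximum, but the ring 2 away (4) is brighter than the ring 1 away (2).
non_decaying = np.array([
    [1, 1, 4, 1, 1],
    [1, 4, 2, 4, 1],
    [4, 2, 10, 2, 4],
    [1, 4, 2, 4, 1],
    [1, 1, 4, 1, 1],
])

print(passes_peak_filter(decaying, 2, 2))                     # True
print(passes_peak_filter(non_decaying, 2, 2))                 # False
print(passes_peak_filter(non_decaying, 2, 2, is_droso=True))  # True
```

The second matrix shows the case isDroso is meant for: a pixel that is a clear local maximum but whose surroundings do not fall off monotonically, so it is rejected by the default check and accepted once the decay requirement is dropped.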

gdolsten commented 3 years ago

Got it, thank you guys for being so helpful!

I am also working with a Drosophila data set and having a lot of difficulty calling loops, so trying with your data set first will be a good way to get some sort of baseline.

One suggestion I would make is that it would be nice if your program had a verbose option that printed out the intermediate processed images for a certain stretch of the genome. This would be very useful for parameter selection, since one could see what effect different parameters had on different loops, and why some loops were not being picked up!

PouletAxel commented 3 years ago

Thanks for the suggestion.

Maybe this can help: if you use -del false, the intermediate images used to call the loops will not be deleted. That could help you find good parameters for your data set using ImageJ/Fiji.

gdolsten commented 3 years ago

Oh wow, I must have missed that parameter description, I see it now. Thank you so much! You guys are the best! The BEST!