deeptools / HiCExplorer

HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
https://hicexplorer.readthedocs.org
GNU General Public License v3.0

Option to generate just one eigenvector and, separately, select which one you want #669

Closed kalavattam closed 3 years ago

kalavattam commented 3 years ago

Hi all,

I'd like to offer another feature request; of course, feel free to take it or leave it.

If a user wants to generate only a single eigenvector (EV), then I think it would be useful to have an option to select which EV that is. Unless I've missed something, it seems that, when using hicPCA, the only way to get EVs after EV1 is to generate multiple EVs sequentially, starting from EV1.

Here's some reasoning for this request: Matrices from DNase- and MNase-based Hi-C (or Hi-C-like) protocols carry chromatin interaction info that, in comparison to "more traditional" restriction-enzyme Hi-C, is biased towards shorter-range contacts over longer-range ones (some unpublished stuff and this here; some other papers too that escape my mind right now). So, even when using 100-kb or 250-kb bins (so-called "coarse" resolutions), the information for longer-range contacts can be relatively sparse, and thus EV1 may not capture the "compartment" structure that has come to be expected from most Hi-C experiments. Depending on the resolution, I consistently capture chromosome compartment info in certain later EVs (e.g., EV2 or EV3), so it'd be good to be able to output that specific EV instead of multiple EVs.

I guess, for example, something like this (where --whichEigenvector is the proposed new option):

hicPCA \
--matrix "some.cool" \
--numberOfEigenvectors "1" \
--whichEigenvector "2" \
--chromosomes "1" \
--pearsonMatrix "some_Pearson.cool" \
--format "bigwig" \
--outputFileName "some_EV2.bw"
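
To make the idea concrete, here is a minimal, library-free Python sketch of the kind of selection I mean. It is not hicPCA's implementation: the function names are hypothetical, and it assumes a small symmetric positive semi-definite matrix (e.g., a covariance matrix) whose leading eigenvalues are distinct and non-zero. It uses power iteration with deflation and returns only the k-th eigenvector, ranked by eigenvalue.

```python
import math


def _matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def _normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _dominant_eigenpair(m, iterations=500):
    """Power iteration: eigenpair with the largest eigenvalue of a PSD matrix.

    Assumption: the fixed start vector is not orthogonal to the dominant
    eigenvector and the dominant eigenvalue is non-zero.
    """
    v = _normalize([1.0 + 0.1 * i for i in range(len(m))])
    for _ in range(iterations):
        v = _normalize(_matvec(m, v))
    # The Rayleigh quotient of a unit vector estimates its eigenvalue.
    eigenvalue = sum(a * b for a, b in zip(v, _matvec(m, v)))
    return eigenvalue, v


def kth_eigenvector(m, k, iterations=500):
    """Return (eigenvalue, eigenvector) for the k-th largest eigenvalue (k is 1-based)."""
    n = len(m)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the matrix size")
    work = [[float(x) for x in row] for row in m]
    for _ in range(k):
        eigenvalue, v = _dominant_eigenpair(work, iterations)
        # Deflation: subtract the component just found so that the next
        # round of power iteration converges to the following eigenvector.
        work = [
            [work[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
            for i in range(n)
        ]
    return eigenvalue, v


if __name__ == "__main__":
    toy = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.5]]
    value, vector = kth_eigenvector(toy, 2)  # request only EV2
    print(value, vector)
```

In practice a full symmetric eigendecomposition (e.g., numpy.linalg.eigh) followed by sorting on the eigenvalues and taking a single column would do the same job; the point is only that the caller asks for one EV by rank and gets just that one back.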

Thanks, Kris