lzamparo / embedding

Learning semantic embeddings for TF binding preferences directly from sequence

Model visualization enhancements needed #13

Open lzamparo opened 7 years ago

lzamparo commented 7 years ago

To interpret the learned codes it would help to have the following visualizations:

  1. Distance-matrix clustering for all TFs: do probes from the same family cluster together?
  2. A 3D visualization of probes from three specific factors, ideally from distinct families: do we see separation?
  3. The k-mers nearest each factor's centre of mass, i.e. those within a very small radius of it.
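A rough sketch of how the three items above could be computed, offered as a starting point rather than the project's actual code. It assumes probe and k-mer embeddings are available as NumPy arrays in the same vector space, which this issue does not confirm. All function names, the toy data, and the choices of Euclidean distance, average linkage, and a PCA projection are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import pdist, squareform


def factor_centroids(embeddings, labels):
    """Return sorted factor names and one centre-of-mass row per factor."""
    names = sorted(set(labels))
    labels = np.asarray(labels)
    centroids = np.vstack([embeddings[labels == n].mean(axis=0) for n in names])
    return names, centroids


def clustered_distance_matrix(centroids):
    """Pairwise Euclidean distances, reordered by average-linkage clustering."""
    condensed = pdist(centroids, metric="euclidean")
    order = leaves_list(linkage(condensed, method="average"))
    dist = squareform(condensed)
    return dist[np.ix_(order, order)], order


def project_to_3d(embeddings):
    """Project onto the top three principal components for a 3D scatter plot."""
    centered = embeddings - embeddings.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T


def kmers_near_centroid(centroid, kmer_vectors, kmers, radius):
    """K-mers whose embedding lies within `radius` of a centroid, nearest first."""
    d = np.linalg.norm(kmer_vectors - centroid, axis=1)
    hits = np.flatnonzero(d <= radius)
    hits = hits[np.argsort(d[hits])]
    return [(kmers[i], float(d[i])) for i in hits]


# Toy data standing in for real embeddings (hypothetical factor names).
rng = np.random.default_rng(0)
probe_labels = ["TF_A"] * 20 + ["TF_B"] * 20 + ["TF_C"] * 20
offsets = {"TF_A": 0.0, "TF_B": 5.0, "TF_C": 5.5}
probe_vectors = np.vstack(
    [rng.normal(offsets[name], 0.1, size=8) for name in probe_labels]
)

names, centroids = factor_centroids(probe_vectors, probe_labels)
dist_sorted, order = clustered_distance_matrix(centroids)   # item 1
coords_3d = project_to_3d(probe_vectors)                    # item 2

# Item 3: one toy k-mer placed next to TF_A's centroid, two placed elsewhere.
kmers = ["ACGT", "TTTT", "GGCA"]
kmer_vectors = np.vstack([centroids[0] + 0.01, centroids[1], np.full(8, 100.0)])
near_tf_a = kmers_near_centroid(centroids[0], kmer_vectors, kmers, radius=0.5)
```

`dist_sorted` can be passed to a heatmap and `coords_3d` to a 3D scatter plot coloured by factor. The radius in `kmers_near_centroid` would need tuning against the actual scale of the embedding.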

Maybe something like exemplar-based clustering could work in the embedding space?
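One way to try this, assuming "exemplar-based clustering" is read as affinity propagation (the issue does not name a specific algorithm), is scikit-learn's `AffinityPropagation`. The sketch below runs on toy 2D points. The data and the `damping` and `max_iter` values are illustrative assumptions, not tuned settings.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Toy stand-in for embedded probes: three well-separated blobs.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [6.0, 6.0], [-6.0, 6.0]])
vectors = np.vstack([rng.normal(c, 0.3, size=(15, 2)) for c in centres])

# Affinity propagation picks actual data points as cluster exemplars.
model = AffinityPropagation(damping=0.7, max_iter=1000, random_state=0)
model.fit(vectors)

exemplar_idx = model.cluster_centers_indices_  # row indices of the exemplars
cluster_of = model.labels_                     # cluster label for every row
```

Because each exemplar is a real probe rather than an averaged point, `exemplar_idx` could be mapped back to the corresponding sequences for inspection.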

Further down the line: for a given probe, can we decode along the probe to find important k-mers (which might resemble motifs)?

lzamparo commented 6 years ago

Now that I'm primarily working with embedding ATAC-seq peaks, test (2) is not so useful, but it could perhaps be re-imagined as subpeaks within different types of elements. Possibly also between similar elements of similar cell types in the Corces data set?