jfear closed this issue 6 years ago
The paper reports ~300 samples and cites the following GEO entries:
GSE23537, GSE15292, GSE20000, GSE16245, GSE25955, GSE25964, GSE25956, GSE25957, GSE25958, GSE25959, GSE25960, GSE25961, GSE25962, GSE25963
We could pull data from these GEO entries too if need be.
Story
Yijie needs a weight matrix for ChIP and histone data. modENCODE is the largest source of this type of data for D. melanogaster. We need to identify all of the relevant datasets and use them to build several weight matrices. Ideally we will generate one conservative matrix and one more relaxed matrix.
Questions and Tasks
Definition of done
Deliver two weight matrices (genes x TFs), one conservative and one more relaxed, where each entry is a binary indicator of whether the TF had a peak in the gene.
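
To make the deliverable concrete, here is a minimal sketch of how a binary genes x TFs matrix could be built from peak calls. Everything in it is an assumption for illustration: the function name `build_weight_matrix`, the gene and TF names, the toy coordinates, and in particular the idea that "conservative" vs. "relaxed" means a stricter or looser peak score cutoff and flank size. The actual criteria still need to be decided.

```python
from typing import Dict, List, Tuple

# Hypothetical record layouts (not from modENCODE):
Gene = Tuple[str, int, int]               # (chrom, start, end), half-open
Peak = Tuple[str, str, int, int, float]   # (tf, chrom, start, end, score)


def build_weight_matrix(
    genes: Dict[str, Gene],
    peaks: List[Peak],
    min_score: float = 0.0,
    flank: int = 0,
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Return (gene_ids, tf_ids, matrix) where matrix[i][j] is 1 if TF j
    has a peak with score >= min_score overlapping gene i +/- flank bp."""
    gene_ids = sorted(genes)
    # Keep every TF as a column, even if all of its peaks get filtered,
    # so the conservative and relaxed matrices share the same shape.
    tf_ids = sorted({p[0] for p in peaks})
    tf_index = {tf: j for j, tf in enumerate(tf_ids)}
    matrix = [[0] * len(tf_ids) for _ in gene_ids]

    for i, gene_id in enumerate(gene_ids):
        g_chrom, g_start, g_end = genes[gene_id]
        lo, hi = g_start - flank, g_end + flank
        for tf, chrom, start, end, score in peaks:
            if score < min_score or chrom != g_chrom:
                continue
            # Half-open interval overlap test.
            if start < hi and end > lo:
                matrix[i][tf_index[tf]] = 1
    return gene_ids, tf_ids, matrix


# Toy data, made up for this example.
genes = {"geneA": ("chr2L", 1000, 2000), "geneB": ("chr2L", 5000, 6000)}
peaks = [
    ("tf1", "chr2L", 1500, 1600, 50.0),
    ("tf2", "chr2L", 2100, 2200, 10.0),
    ("tf1", "chr3R", 5500, 5600, 80.0),
]

# Assumed definitions: conservative = strong peaks inside the gene body;
# relaxed = any peak within 500 bp of the gene.
gene_ids, tf_ids, conservative = build_weight_matrix(genes, peaks, min_score=20.0)
_, _, relaxed = build_weight_matrix(genes, peaks, min_score=0.0, flank=500)
```

With the toy data, the relaxed matrix picks up the weak `tf2` peak just downstream of `geneA`, while the conservative matrix does not. For real data an interval tool (e.g. bedtools or an interval tree) would likely replace the nested loop, which is quadratic in genes x peaks.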