[x] Run on the whole dataset. If the accuracy is similar to that of the ~4800 subset, then we should keep using the ~4800 subset for optimization.
[ ] At this point we are isolating just the positive strand. It would be great if we could incorporate the negative strand. This is the most important data to add.
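One way to bring in the negative strand is to derive it from the positive strand by reverse complementing. The sketch below is only an illustration: the helper names `reverse_complement` and `with_both_strands` are hypothetical and not part of the existing program, and it assumes the sequences are plain DNA strings over A, C, G, T, and N.

```python
# Hypothetical helpers for illustration; not taken from the project code.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the negative-strand sequence, read 5' to 3'."""
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]


def with_both_strands(seqs):
    """Pair each positive-strand sequence with its reverse complement."""
    return [(s, reverse_complement(s)) for s in seqs]


if __name__ == "__main__":
    for plus, minus in with_both_strands(["ATGCN", "GGATCC"]):
        print(plus, minus)
```

Whether both strands are fed as separate examples or as two inputs to the same example is a design decision this sketch does not settle.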
Thomas
In relation to the new program that Thomas built here, we need to add a few more features.
[x] Make sure Jesse's output matches Thomas's input.
[ ] Somehow make the program able to decide which motifs appear and at which frequencies.
[x] Create 1,000 sets with zld, eve, cad, bcd PWMs. Ideally, these frequencies would match the scoring that occurs naturally. Make the classification 50/50.
[x] Lower priority, but in the future, incorporate neutral sequences as defined with siteout
[x] Finish full workflow with documentation to re-create this notebook.
[x] After that, I will have you try a new experiment involving a new data format. Let me know when you finish!
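The synthetic-set task above (1,000 sets, PWM-derived motifs, 50/50 classification) could be sketched roughly as follows. This is a minimal illustration under loud assumptions: `TOY_PWM` is an invented probability matrix, not a real zld, eve, cad, or bcd PWM; the background is uniform random sequence; and the function names are hypothetical rather than those of Thomas's program.

```python
import random

BASES = "ACGT"

# Invented placeholder: one row per motif position, giving P(A), P(C), P(G), P(T).
# These numbers are NOT a real zld/eve/cad/bcd matrix.
TOY_PWM = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]


def sample_site(pwm, rng):
    """Draw one binding-site instance, position by position, from the matrix."""
    return "".join(rng.choices(BASES, weights=row)[0] for row in pwm)


def random_background(length, rng):
    """Uniform random background (an assumption; real background is not uniform)."""
    return "".join(rng.choice(BASES) for _ in range(length))


def make_dataset(pwm, n_sets=1000, length=100, seed=0):
    """Return (sequence, label) pairs with an exact 50/50 label split."""
    rng = random.Random(seed)
    data = []
    for i in range(n_sets):
        seq = random_background(length, rng)
        label = i % 2  # alternate labels so the split is exactly 50/50
        if label == 1:
            site = sample_site(pwm, rng)
            start = rng.randrange(length - len(site) + 1)
            # Overwrite part of the background with the sampled site.
            seq = seq[:start] + site + seq[start + len(site):]
        data.append((seq, label))
    rng.shuffle(data)
    return data


if __name__ == "__main__":
    dataset = make_dataset(TOY_PWM)
    print(len(dataset), sum(label for _, label in dataset))
```

Note that negative examples can still contain a motif-like site by chance, and that this sketch plants exactly one site per positive sequence, so it does not yet match naturally occurring frequencies.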
Ciera
[ ] Need evolutionary information added to the data in some way. It is basically in .tre format, but other formats could be used. Ciera needs to put this data up!
[x] More TFBS to add: dorsal, twist, hb, kr, kni, gt. Ciera to get Joanne to create these.
[x] Find distributions of motif scores in sequences. Find old notebook from TFBS team.
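Until the old TFBS notebook is located, a motif score distribution can be approximated by sliding a log-odds matrix along each sequence and binning the window scores. The sketch below assumes a probability matrix with no zero entries (add pseudocounts first otherwise) and a uniform 0.25 background; `TOY_PPM` is an invented matrix and all function names are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"

# Invented placeholder matrix (rows = positions, columns = A, C, G, T).
TOY_PPM = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]


def log_odds(ppm, background=0.25):
    """Convert probabilities to log2 odds against a uniform background."""
    return [[math.log2(p / background) for p in row] for row in ppm]


def window_scores(seq, pssm):
    """Score every full-length window of seq; skip windows with non-ACGT bases."""
    index = {base: i for i, base in enumerate(BASES)}
    width = len(pssm)
    scores = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in index for base in window):
            continue
        scores.append(sum(pssm[k][index[base]] for k, base in enumerate(window)))
    return scores


def histogram(scores, bin_width=1.0):
    """Count scores per bin; each key is the lower edge of its bin."""
    return Counter(math.floor(s / bin_width) * bin_width for s in scores)


if __name__ == "__main__":
    pssm = log_odds(TOY_PPM)
    scores = window_scores("AGGTAAAACCGGTT", pssm)
    for edge, count in sorted(histogram(scores).items()):
        print(edge, count)
```

This scans the positive strand only; scanning the reverse complement as well would be needed to cover both strands.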
Leftover
These are the tasks that we still need to think about. Up for grabs for anyone.
[ ] Need evolutionary information added to the data in some way.
[x] More TFBS to add: dorsal, twist, hb, kr, kni, gt
[ ] In the future, it would be great to continue the larger "word" dictionary trials. Thomas leads at this point.
[ ] Earlier we experimented with adding padding to the front. Does this make a difference with the new bidirectional architecture?
[ ] What is the best way to save our accuracy, c-statistic, and FDR results?
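As one possible answer to the results question, each run could append a single row to a shared CSV file. The sketch below is a suggestion only: the column names, the 0.5 threshold, and the function names are assumptions, the c-statistic is computed as the fraction of positive/negative pairs ranked correctly (equivalent to ROC AUC), and FDR is taken as FP / (FP + TP).

```python
import csv
import os
from datetime import datetime, timezone


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def false_discovery_rate(y_true, y_pred):
    """FP / (FP + TP); defined as 0.0 when nothing is predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    return fp / (tp + fp) if (tp + fp) else 0.0


def c_statistic(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def append_results(path, run_name, y_true, scores, threshold=0.5):
    """Append one row of metrics to a CSV file, writing the header if the file is new."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    row = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "run": run_name,
        "accuracy": accuracy(y_true, y_pred),
        "c_statistic": c_statistic(y_true, scores),
        "fdr": false_discovery_rate(y_true, y_pred),
    }
    write_header = not os.path.exists(path)
    with open(path, "a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(row))
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row


if __name__ == "__main__":
    print(append_results("results.csv", "demo", [1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

The pairwise c-statistic is quadratic in the number of examples, which is fine for a ~4800 subset but may be slow on the whole dataset; a rank-based computation would scale better.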
Adam
Thomas
Jesse
Sean
Ciera
Leftover