j-andrews7 / VAMPIRE

Variant and Epigenetic anNotation for Underlying Significance and Regulation
MIT License

motifs.py bugs, tests & refactors #31

Open j-andrews7 opened 8 years ago

j-andrews7 commented 8 years ago

Bugs

chr1    12233913    12234377    TARDBP  CPEB1,0.4707,1.7086;BCL11A,9.8773,10.9563;EHF,9.3166,8.6945;ELF2,5.8123,6.5737;ETS2,7.5269,5.4783;IRF1,8.2849,4.4558;IRF3,6.421,2.9616;IRF4,9.367,9.1753;IRF5,7.7549,2.8007;IRF8,5.7816,4.6789;MAZ,3.8977,0.595;MNT,11.4016,9.5642;NR2F6,10.2863,10.2863;PRDM1,4.3469,-1.0544;RXRA,6.571,5.0389;SP1,5.3478,4.1506;SPI1,4.0425,4.6874;SPIB,9.6535,10.8093;SPIC,10.2752,-0.4592;WT1,6.6021,2.7608;ZNF148,6.662,4.7838;ZNF713,16.4239,16.4239
chr1    12336347    12336707    CTCF    BHLHE41,5.7013,6.0162;NHLH1,-5.6823,3.2276;MYOG,1.5368,5.9291;BCL11A,3.7113,6.9359;CTCFL,6.5574,9.8007;CTCF,7.8951,11.7162;EOMES,4.1982,7.1215;ETS2,5.962,7.5233;NHLH1,-3.3856,5.7203;IRF4,5.4594,9.1875;MYOG,3.3927,5.6769;SP3,5.7795,6.7673
chr1    14028964    14029428    TARDBP  KLF3,6.3293,4.8345;POU5F1B,2.0157,5.664;PBX1,5.5521,7.1685;POU2F2,1.6477,4.8154;POU3F4,3.4838,5.1457;RFX2,2.0759,-0.4307;RFX5,6.2369,7.0469;ZBTB7B,2.6463,0.8758;NANOG,4.8166,9.916;NR1I3,8.1888,1.1553;PAX1,4.1869,5.4813;PAX5,6.9013,6.7399;POU5F1,3.471,10.7001;SOX2,-0.9456,6.7054;SOX4,0.0444,6.1941;ZNF282,7.113,3.5338
chr1    14029021    14029705    PML KLF3,6.3293,4.8345;POU5F1B,2.0157,5.664;PBX1,5.5521,7.1685;POU2F2,1.6477,4.8154;POU3F4,3.4838,5.1457;RFX2,2.0759,-0.4307;RFX5,6.2369,7.0469;ZBTB7B,2.6463,0.8758;NANOG,4.8166,9.916;NR1I3,8.1888,1.1553;PAX1,4.1869,5.4813;PAX5,6.9013,6.7399;POU5F1,3.471,10.7001;SOX2,-0.9456,6.7054;SOX4,0.0444,6.1941;ZNF282,7.113,3.5338
chr1    14029401    14029592    PAX5    KLF3,6.3293,4.8345;POU5F1B,2.0157,5.664;PBX1,5.5521,7.1685;POU2F2,1.6477,4.8154;POU3F4,3.4838,5.1457;RFX2,2.0759,-0.4307;RFX5,6.2369,7.0469;ZBTB7B,2.6463,0.8758;NANOG,4.8166,9.916;NR1I3,8.1888,1.1553;PAX1,4.1869,5.4813;PAX5,6.9013,6.7399;POU5F1,3.471,10.7001;SOX2,-0.9456,6.7054;SOX4,0.0444,6.1941;ZNF282,7.113,3.5338
chr1    15738061    15738597    SIN3A   TFAP2D,7.0767,5.5819;EBF1,6.1442,5.687;ESRRA,7.9735,9.7136;ESR1,6.8764,5.9683;ESR1,7.5544,5.2967;ZFX,1.4692,5.4618;TFAP2A,6.9283,5.922;ESR2,6.6741,4.299;NR3C2,6.4958,6.8992;NR1H4,12.9202,4.1355;NR1I2,10.0822,1.2976;NR1I3,10.6472,2.2765;NR5A2,7.9451,-0.8719;NR6A1,4.5753,-4.3642;PURA,7.1546,8.0875;RARB,7.24,-0.4811;RARG,8.0175,-0.0647;RORC,5.3152,-0.8346;VDR,9.3969,4.2596
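
For anyone wanting to poke at rows like the ones above, here is a minimal parsing sketch. It assumes each row is whitespace-delimited with a chromosome, start, end, and a label, followed by semicolon-separated `name,number,number` entries. What the two numbers mean is not stated in this thread, so `score_a` and `score_b` are placeholder names, and `parse_record`, `PeakRecord`, and `MotifScores` are hypothetical helpers, not anything from motifs.py.

```python
from typing import List, NamedTuple


class MotifScores(NamedTuple):
    name: str
    score_a: float  # first number after the motif name (meaning assumed)
    score_b: float  # second number after the motif name (meaning assumed)


class PeakRecord(NamedTuple):
    chrom: str
    start: int
    end: int
    label: str
    motifs: List[MotifScores]


def parse_record(line: str) -> PeakRecord:
    """Parse one row of the sample output into a structured record."""
    # Split on any run of whitespace so tab- and space-delimited rows both work.
    chrom, start, end, label, motif_field = line.split(None, 4)
    motifs = []
    for entry in motif_field.strip().split(";"):
        if not entry:
            continue  # tolerate a trailing semicolon
        name, a, b = entry.split(",")
        motifs.append(MotifScores(name, float(a), float(b)))
    return PeakRecord(chrom, int(start), int(end), label, motifs)


record = parse_record(
    "chr1\t12336347\t12336707\tCTCF\tBHLHE41,5.7013,6.0162;NHLH1,-5.6823,3.2276"
)
```

Having the rows in a structured form like this should make it easier to write small regression tests against known output.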

The way this is currently implemented is, frankly, mind-bogglingly overcomplicated. It should be pretty straightforward; there's no need for the mess it is now.

Crumbs350 commented 7 years ago

I've made significant changes to motifs.py since many of these complaints were logged. Which are still valid?

j-andrews7 commented 7 years ago

I haven't actually run the code, so it's tough to say. Removing the OptionsList class is probably a good idea for clarity's sake, since the code currently pulls arguments both directly from args and from that class. The global variables need to be wrapped up, and the code is currently fairly messy in terms of unused variables, old comments, etc.
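
As a rough illustration of the single-source-of-options idea, here is a sketch that builds one immutable options object from argparse and passes it around instead of reading globals. Every flag and field name below (`--input`, `--threshold`, `--chip-peaks`, `MotifOptions`) is a hypothetical placeholder chosen for the example, not the actual motifs.py interface.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MotifOptions:
    # All field names are hypothetical placeholders for this sketch.
    input_path: str
    threshold: float = 0.0
    chip_peaks_path: Optional[str] = None


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="hypothetical options sketch")
    parser.add_argument("--input", dest="input_path", required=True)
    parser.add_argument("--threshold", type=float, default=0.0)
    parser.add_argument("--chip-peaks", dest="chip_peaks_path", default=None)
    return parser


def options_from_argv(argv: Optional[List[str]] = None) -> MotifOptions:
    """Parse the command line once and return a single read-only options object."""
    args = build_parser().parse_args(argv)
    # vars(args) maps each dest name to its parsed value, matching the dataclass fields.
    return MotifOptions(**vars(args))


opts = options_from_argv(["--input", "variants.vcf", "--threshold", "2.5"])
```

Because the dataclass is frozen, functions that receive `opts` cannot quietly mutate shared state, which is most of what wrapping up the globals is meant to achieve.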

I know we haven't really implemented any sort of stats to determine significant changes, which would be helpful in weeding out matches, as noted in the first refactor point. I assume you've handled the multiallelic calls in some fashion? If so, that could probably be checked off. I also don't know if you've run it with the ChIP peaks file or with a motif list with thresholds defined. It's probably worthwhile to test those, as they relate to the first bug listed and are just generally important.
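
For reference, one common way to deal with multiallelic calls is to split each record into one entry per alternate allele, since the VCF ALT column lists alternates separated by commas. The sketch below shows only that splitting step; `split_multiallelic` is a hypothetical helper, and this is not a claim about how motifs.py handles it.

```python
from typing import List, Tuple


def split_multiallelic(vcf_line: str) -> List[Tuple[str, int, str, str]]:
    """Return one (chrom, pos, ref, alt) tuple per ALT allele in a VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    # VCF column order: CHROM, POS, ID, REF, ALT, ...
    chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
    return [(chrom, pos, ref, allele) for allele in alt.split(",")]


calls = split_multiallelic("chr1\t100\t.\tA\tC,T\t.\t.\t.")
```

Each resulting tuple can then be scored against the motifs independently, at the cost of repeating the reference scan once per allele.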

Sorry for my lack of contributions, I have a grant due on the 5th and am devoting most of my time to generating data/writing that.

Crumbs350 commented 7 years ago

I had not looked at the bug tracker in a while. That helps; I wasn't sure what had been marked off for that bug. My perspective.

I need to go back through emails and mark off all the questions I asked to make sure those tasks have been done.

Bill
