de-Boer-Lab / MAUDE

Mean Alterations Using Discrete Expression
MIT License

2-reporter screen #7

Open kiddo18 opened 1 year ago

kiddo18 commented 1 year ago

Hi Dr De Boer,

I know MAUDE was designed for single-reporter screens (e.g., GFP-high vs. GFP-low), but I'm wondering whether I can also use MAUDE for a two-reporter screen (e.g., GFP-high/BFP-high vs. GFP-low/BFP-low)?

If yes, should I make any changes to the inputs as described in your paper?


Users provide a data.frame, with columns containing the bin counts (one column per bin, plus one for the unsorted cells), as well as columns annotating the data included in each row, including guide ID, experimental identifiers (e.g., replicate, condition, etc.), whether or not the guide is a negative control guide, and any other guide-associated data (e.g., genomic locus). Users also provide a data.frame containing the bin sizes, with one row per bin per experiment, and columns corresponding to the Z-score bounds of each bin (“binStartZ” and “binEndZ”) and the corresponding bin cumulative distribution function percentiles (“binStartQ” and “binEndQ”). Using MAUDE’s “findGuideHitsAllScreens” function, and providing the experimental design data.frame, read count data.frame, and bin bound data.frame
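
For reference, the relationship between the quantile columns and the Z-score columns in the bin-bounds table quoted above can be sketched as follows. This is only an illustration in Python (MAUDE itself is an R package), assuming the Z bounds are the standard-normal quantiles of the corresponding percentiles. The function name `bin_bounds_from_quantiles`, the `Bin` column, and the example bin fractions are hypothetical, not part of MAUDE.

```python
from statistics import NormalDist


def bin_bounds_from_quantiles(bins):
    """Build one row per bin with quantile and Z-score bounds.

    bins: dict mapping a bin name to (start quantile, end quantile).
    Quantiles must lie strictly between 0 and 1, because the inverse
    normal CDF is infinite at exactly 0 and 1.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    rows = []
    for name, (q_start, q_end) in bins.items():
        rows.append({
            "Bin": name,  # hypothetical column name
            "binStartQ": q_start,
            "binEndQ": q_end,
            # Assumption: Z bounds are standard-normal quantiles of the Q bounds
            "binStartZ": std_normal.inv_cdf(q_start),
            "binEndZ": std_normal.inv_cdf(q_end),
        })
    return rows


# Made-up example: bottom 15% and top 15% of cells are sorted
example_bins = {"low": (0.001, 0.15), "high": (0.85, 0.999)}
bin_rows = bin_bounds_from_quantiles(example_bins)
for row in bin_rows:
    print(row)
```

The outer bounds are set to 0.001 and 0.999 rather than 0 and 1 only so that the Z-scores stay finite in this sketch.
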
Carldeboer commented 1 year ago

MAUDE assumes the data are approximately normally distributed. That said, it seems to work pretty well even when they are not. Your case is a little weird because you're taking two corners of a distribution, and there is no way to rotate the data so that the ++ and -- bins correspond to the tails of the distribution. It will probably still work, but the effect sizes will be off (their relative values should still be meaningful). I think the statistics (P-values, FDR, stoufferZ) will all be fine, since they are calibrated with the negative control guides. I would be interested to know if it works for you.
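
To illustrate what calibrating against negative controls means here, the following is a minimal generic sketch in Python, not MAUDE's actual code. It assumes each guide has an estimated mean expression, converts it to a Z-score using the mean and standard deviation of the negative control guides, and then combines the guide-level Z-scores for one element with the unweighted form of Stouffer's method. The function names and the numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def calibrate_z(guide_means, control_means):
    """Z-score each guide mean against the negative-control distribution."""
    mu = mean(control_means)
    sd = stdev(control_means)  # sample standard deviation of the controls
    return [(m - mu) / sd for m in guide_means]


def stouffer_z(z_scores):
    """Unweighted Stouffer's method: sum of Z-scores over sqrt(count)."""
    return sum(z_scores) / math.sqrt(len(z_scores))


# Made-up numbers: three negative controls and two guides for one element
control_means = [-1.0, 0.0, 1.0]
guide_means = [2.0, 1.0]

guide_z = calibrate_z(guide_means, control_means)
element_z = stouffer_z(guide_z)
print(guide_z, element_z)
```

Because the guide Z-scores are expressed relative to the spread of the negative controls, a systematic distortion of the effect-size scale affects controls and targeting guides alike, which is the intuition behind the statistics remaining usable.
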

kiddo18 commented 1 year ago

Ah, interesting point.

I just wonder whether MAUDE could somehow be adapted so that it assumes a 2D Gaussian (or even an n-dimensional Gaussian) instead.

But thank you for your input! I'll try it out and follow up later

Carldeboer commented 1 year ago

Had you done 4 bins (e.g. GFP high and low, and separately BFP high and low), you could analyze them separately and then find elements that are high in both or low in both (or high in one and low in the other, etc.). With only ++ and --, I don't think you can estimate the separate effects on GFP and BFP (as in a 2D Gaussian). With all 4 corners (4 bins) there is probably a way, and that would likely be the best design, but MAUDE would need to be modified for this use case. Best of luck!
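
The "analyze separately, then combine" idea for the 4-bin design could look roughly like the sketch below. It is written in Python purely for illustration and assumes you already have one Z-score per element from a GFP-only analysis and one from a BFP-only analysis. The function name `classify_element`, the element names, the Z-scores, and the 1.96 cutoff are all hypothetical choices, not MAUDE outputs or defaults.

```python
def classify_element(gfp_z, bfp_z, threshold=1.96):
    """Label the direction of effect on each reporter independently.

    Returns a (GFP, BFP) tuple where each entry is "high", "low", or "none".
    The threshold is an arbitrary illustrative cutoff on the Z-score.
    """
    def direction(z):
        if z >= threshold:
            return "high"
        if z <= -threshold:
            return "low"
        return "none"

    return (direction(gfp_z), direction(bfp_z))


# Made-up per-element Z-scores from two separate single-reporter analyses
element_z = {
    "elementA": (3.0, 2.5),    # up in both reporters
    "elementB": (-2.2, 0.3),   # down in GFP only
    "elementC": (2.4, -2.8),   # up in GFP, down in BFP
}

labels = {name: classify_element(g, b) for name, (g, b) in element_z.items()}
for name, label in labels.items():
    print(name, label)
```

In practice you would probably threshold on FDR rather than a raw Z-score cutoff, but the combination logic would be the same.
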