Open gwaybio opened 3 years ago
Thanks for setting up this issue. I agree with the motivation and the authorship, if we include this.
I can definitely give you that output, but it will take a bit longer to build that df. Your proposed df also has a lot more information than needed for the simple histogram. Overall it's the best option, though, because I will want to put this onto cyto eval anyway.
We should also add the MOAs of both compounds to that df, since that will allow for more MOA-focused plots instead of just counting the compounds.
Sounds good! I am aiming for a July 1 submission, so in order for us to include it, I will need it before then.
Two additional points that you might want to consider:
Thanks!
Yeah, good points! I'm not very familiar with test-driven development, but I could try it :)
@gwaygenomics The data you linked up there is level3. I just require level5 data. I misspoke in the meeting!
Can you point me to which level5 data I should be using?
level 5 links: Cell Painting L1000
Thanks!
@michaelbornholdt produced the analysis here: https://github.com/broadinstitute/neural-profiling/issues/2#issuecomment-872433897
I will ingest the output files in this repo for visualization.
@michaelbornholdt presented an analysis on querying "hits" during profiling checkin today.
I'd love to be able to include this analysis in the LINCS complementarity paper. Michael estimated that his analysis would take ~2 hours.
Specification
Input data
Output
I think I need to understand the "hit" analysis better. Given a compound, you're asking if you match another replicate as the top hit? So essentially the output is a ranked list of matches per compound and whether or not they map to the same category?
If so, then can you output the following data frame?
Let's iterate on the final output specifications if my understanding above is limited in any way.
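
To make the reading of the "hit" analysis above concrete, here is a minimal sketch of a ranked list of matches per query profile, with a flag for whether each match maps to the same category. Everything in it is an assumption for illustration, not the analysis that was presented: the function names (`pearson`, `rank_matches`, `top_hit_rate`), the column names, the toy profiles, the use of Pearson correlation as the similarity metric, and the choice of "same compound" as the category.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_matches(profiles):
    """Rank every other profile against each query profile.

    profiles: list of (profile_id, compound, feature_vector) tuples.
    Returns one row (dict) per query/match pair, ranked within each query.
    """
    rows = []
    for query_id, query_cpd, query_vec in profiles:
        # Score the query against every profile except itself.
        scored = [
            (pearson(query_vec, match_vec), match_id, match_cpd)
            for match_id, match_cpd, match_vec in profiles
            if match_id != query_id
        ]
        # Most similar match first.
        scored.sort(key=lambda item: item[0], reverse=True)
        for rank, (similarity, match_id, match_cpd) in enumerate(scored, start=1):
            rows.append(
                {
                    "query_profile": query_id,
                    "query_compound": query_cpd,
                    "match_profile": match_id,
                    "match_compound": match_cpd,
                    "similarity": similarity,
                    "rank": rank,
                    # Hypothetical category: same compound as the query.
                    "same_category": query_cpd == match_cpd,
                }
            )
    return rows


def top_hit_rate(rows):
    """Fraction of queries whose rank-1 match is in the same category."""
    top = [row for row in rows if row["rank"] == 1]
    return sum(row["same_category"] for row in top) / len(top)


# Toy profiles: two replicates each of two made-up compounds.
profiles = [
    ("a1", "cpdA", [1.0, 2.0, 3.0, 4.0]),
    ("a2", "cpdA", [1.1, 2.1, 2.9, 4.2]),
    ("b1", "cpdB", [4.0, 3.0, 2.0, 1.0]),
    ("b2", "cpdB", [3.9, 3.2, 1.8, 1.1]),
]
rows = rank_matches(profiles)
print(top_hit_rate(rows))
```

The rows are plain dicts, so they can be passed to `pandas.DataFrame(rows)` if a df is preferred. Keeping the full ranked list, rather than only the top hit, leaves room for extra columns such as MOA to be joined on later.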
A couple preliminary figures and statistics would also be helpful.
Motivation