broadinstitute / lincs-cell-painting

Processed Cell Painting Data for the LINCS Drug Repurposing Project
BSD 3-Clause "New" or "Revised" License

Frozen data version 1 #63

Closed gwaybio closed 3 years ago

gwaybio commented 3 years ago

I updated pycytominer and added the associated fixes, as described in #62.

TODO

In the next PR, I will migrate from git lfs to dvc

gwaybio commented 3 years ago

@shntnu - this is good to go. Sorry for the HUGE number of files (most are just profiles).

Please pay extra attention to any updated documentation. Any code changes will require a complete rerun (which I'll only do if absolutely necessary). If necessary, I can address #65 simultaneously.

The next step will be to update to dvc!

shntnu commented 3 years ago

> Please pay extra attention to any updated documentation.

I focused only on the .py and .md files.

I didn't see any documentation changes other than those in README.md.

Did I miss any documentation?

> Any code changes will require a complete rerun (which I'll only do if absolutely necessary). If necessary, I can address #65 simultaneously.

No need to do #65 except perhaps this thing you suggested:

> I can add a prominent note to make sure these are dropped in all downstream analyses in a README in #63

gwaybio commented 3 years ago

> Did I miss any documentation?

Nope, I think you got it all. Thank you!

I can make all of these changes, and we should be good to merge soon.

gwaybio commented 3 years ago

Alright @shntnu - this is ready for your eyes again. Here is what changed:

gwaybio commented 3 years ago

> Do you want to update this as well? https://figshare.com/articles/dataset/Blacklist_Features_-_Cell_Profiler/10255811

I don't think so... although I do think we want to update this figshare document to include other version-specific CellProfiler blocklists. At the very least, much more thought needs to go into updating it (much more thought for me, at least!). As a separate but related note: I really want to do a deep dive into CellProfiler features... I think it's the first step to understanding generic morphology features, which we'll want to annotate with more interpretable biology. It'll also help us interpret DeepProfiler features in the future.
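For readers unfamiliar with feature blocklists, here is a minimal sketch of how one might be applied to the columns of a profile table before downstream analysis. The function name `drop_blocklisted` and all column names are hypothetical and for illustration only. They are not taken from the figshare list or from pycytominer.

```python
def drop_blocklisted(columns, blocklist):
    """Return the column names not in the blocklist, preserving their order."""
    blocked = set(blocklist)  # set membership keeps the lookup fast
    return [name for name in columns if name not in blocked]


# Hypothetical profile columns (illustrative names, not from the real dataset)
profile_columns = [
    "Metadata_Plate",
    "Metadata_Well",
    "Cells_AreaShape_Area",
    "Nuclei_Correlation_Manders_DNA_AGP",
]

# Hypothetical blocklist; a real one would be loaded from a versioned file
blocklist = ["Nuclei_Correlation_Manders_DNA_AGP"]

kept = drop_blocklisted(profile_columns, blocklist)
print(kept)
```

Keeping one blocklist per CellProfiler version, as suggested above, would mean choosing the list that matches the version used to generate the profiles before calling a function like this.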

shntnu commented 3 years ago

> As a separate but related note: I really want to do a deep dive into CellProfiler features... I think it's the first step to understanding generic morphology features, which we'll want to annotate with more interpretable biology. It'll also help us interpret DeepProfiler features in the future.

I can think of two resources that will be relevant for this effort:

  1. A well-documented readme https://github.com/carpenterlab/2016_bray_natprot/wiki/What-do-Cell-Painting-features-mean%3F
  2. An incomplete notebook https://github.com/cytomining/cytominergallery/blob/105be87878d13024ef283e284cf085351b498883/notebooks/empty_readouts.Rmd (knitted version here: https://rpubs.com/shantanu/cp_feature_stats)