broadinstitute / cellpainting-gallery

Cell Painting Gallery
https://broadinstitute.github.io/cellpainting-gallery/
MIT License

Allow `profile` data duplication? #72

Open ErinWeisbart opened 10 months ago

ErinWeisbart commented 10 months ago

Shantanu noted here:

> For example, s3://cellpainting-gallery/cpg0001-cellpainting-protocol/workspace/profiles is redundant with https://github.com/jump-cellpainting/pilot-data-public/tree/main/profiles.
>
> Eventually, we'd want to get rid of that redundancy (in favor of the latter).

ErinWeisbart commented 10 months ago

I'd offer the counterargument that the cellpainting-gallery should provide data from raw images through profiles for all datasets we host. I think it's confusing for some datasets to have profiles on S3 while others require going to a GitHub repository instead. That split decreases the utility of our resource.

I think we should add a public note that profiles are the data most likely to change, and that anyone using them should check the linked external repositories for updates.