jump-cellpainting / datasets

Images and other data from the JUMP Cell Painting Consortium
BSD 3-Clause "New" or "Revised" License

Provide prefix sizes #81

Open ErinWeisbart opened 10 months ago

ErinWeisbart commented 10 months ago

I think it would be helpful to provide a breakdown of data sizes by source and by data type (numerical data vs. image data), so people know what they're getting into before downloading and don't have to list the bucket themselves.

I'm not sure how much is still in flux, but our dashboard auto-calculates the sizes of these prefixes, and the numbers below are current as of right now. I'm happy to flesh this out or update it.

source images size (TB) workspace size (TB) workspace_dl size (TB) total size (TB)
1 13.2
2 7.6 10.8 21.6
3 16.6 20.6 42.5
4 17.6 17.3 39.1
5 13.1 32 7.4
6 11.7 25.8 43.7
7 14.9
8 7.2 12.1 24.4
9 9.2 17.8 7.1
10 7.5 11.3 21.6
11 10.3 21.6
13 15.8 6.8

ErinWeisbart commented 10 months ago

(This is what's in cellpainting-gallery/cpg0016-jump.) I'm planning to provide the total size in the cellpainting-gallery README, but I think a by-source breakdown belongs in this repo.

ErinWeisbart commented 5 months ago

FYI, when you're ready to add this to a new data release, these sizes can now be calculated quickly and easily with https://github.com/broadinstitute/cpg
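
For anyone curious what a per-prefix size calculation amounts to, here is a minimal Python sketch. It is not the cpg tool's API, which I'm not reproducing here; the function name `sizes_by_prefix`, the key layout, and the sizes in `listing` are all hypothetical. It assumes you already have (key, size-in-bytes) pairs from some bucket listing, for example from boto3's `list_objects_v2` paginator, and it reports decimal units, which may or may not match what the dashboard uses.

```python
from collections import defaultdict

GB = 1000 ** 3  # decimal gigabytes; an assumption about the unit convention


def sizes_by_prefix(objects, depth=3):
    """Sum object sizes in bytes, grouped by the first `depth` key components.

    `objects` is any iterable of (key, size_in_bytes) pairs. Keys with fewer
    than `depth` components are grouped under the whole key.
    """
    totals = defaultdict(int)
    for key, size in objects:
        prefix = "/".join(key.split("/")[:depth])
        totals[prefix] += size
    return dict(totals)


# Hypothetical listing: made-up keys and sizes, for illustration only.
listing = [
    ("cpg0016-jump/source_2/images/batch_a/plate_1/img_0001.tiff", 2 * GB),
    ("cpg0016-jump/source_2/images/batch_a/plate_1/img_0002.tiff", 1 * GB),
    ("cpg0016-jump/source_2/workspace/profiles/plate_1.parquet", 4 * GB),
    ("cpg0016-jump/source_3/images/batch_b/plate_9/img_0001.tiff", 5 * GB),
]

# depth=3 gives one row per source and top-level folder;
# depth=2 gives one total per source.
for prefix, size in sorted(sizes_by_prefix(listing, depth=3).items()):
    print(f"{prefix}\t{size / GB:.1f} GB")
```

Changing `depth` is the only knob: 2 gives the "total size" column per source, 3 splits it into images, workspace, and so on.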