carpenter-singh-lab / 2022_Haghighi_NatureMethods

High-Dimensional Gene Expression and Morphology Profiles of Cells across 28,000 Genetic and Chemical Perturbations
BSD 3-Clause "New" or "Revised" License

Deposit single cell profiles in S3 #1

Open shntnu opened 3 years ago

shntnu commented 3 years ago

@shntnu will do

shntnu commented 2 years ago

First, move data to s3://cellpainting-gallery

aws s3 sync \
  --profile jump-cp-role \
  --acl bucket-owner-full-control \
  s3://cellpainting-datasets/Rosetta-GE-CP/preprocessed_data \
  s3://cellpainting-gallery/rosetta/broad/workspace/preprocessed_data
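
Before running a sync like the one above, it can help to preview what would be copied. This is a minimal sketch, assuming the same profile and ACL as the command above. `preview_sync` is a hypothetical helper name, not something from this repository; `--dryrun` is the AWS CLI option that prints the planned operations without performing them.

```shell
# Hypothetical helper: show what `aws s3 sync` would copy, without copying.
# Assumes the jump-cp-role profile used elsewhere in this thread.
preview_sync() {
  src="$1"   # source S3 URI
  dst="$2"   # destination S3 URI
  aws s3 sync \
    --profile jump-cp-role \
    --acl bucket-owner-full-control \
    --dryrun \
    "$src" "$dst"
}

# Example (not executed here):
# preview_sync \
#   s3://cellpainting-datasets/Rosetta-GE-CP/preprocessed_data \
#   s3://cellpainting-gallery/rosetta/broad/workspace/preprocessed_data
```

Dropping `--dryrun` from the helper gives the real copy.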

shntnu commented 2 years ago

@MarziehHaghighi I have successfully copied the files from s3://cellpainting-datasets/Rosetta-GE-CP/preprocessed_data to s3://cellpainting-gallery/rosetta/broad/workspace/preprocessed_data. You can now delete the originals at s3://cellpainting-datasets/Rosetta-GE-CP/preprocessed_data.

Note that this issue is about single-cell data, which I have not gotten to yet. This comment only covers copying the existing data to s3://cellpainting-gallery/.

MarziehHaghighi commented 2 years ago

@shntnu How can I have access to s3://cellpainting-gallery for adding DVC and also any other modifications?

shntnu commented 2 years ago

@shntnu How can I have access to s3://cellpainting-gallery for adding DVC and also any other modifications?

I should have clarified:

  1. We won't use DVC after all; just store directly on S3
  2. For now, the best way to update any data on s3://cellpainting-gallery is for you to upload it to s3://cellpainting-datasets/ and then ask me (in this issue) to sync it. I need to do more work on my end before you can upload directly to s3://cellpainting-gallery/rosetta.
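
The staging step in item 2 could look roughly like this. It is a sketch, not a confirmed procedure: `stage_for_gallery` is a hypothetical helper, the local directory and destination prefix are placeholders to fill in, and it assumes your default AWS credentials can write to s3://cellpainting-datasets/.

```shell
# Hypothetical helper: upload a local directory to the staging bucket,
# from which it can later be synced to s3://cellpainting-gallery on request.
stage_for_gallery() {
  local_dir="$1"   # local directory holding the updated data (placeholder)
  prefix="$2"      # destination prefix inside cellpainting-datasets (placeholder)
  aws s3 sync "$local_dir" "s3://cellpainting-datasets/${prefix}"
}

# Example (not executed here; both arguments are placeholders):
# stage_for_gallery ./my_updated_data some/prefix
```

After the upload finishes, post the staged path in this issue so it can be synced.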

shntnu commented 2 years ago

I moved everything from s3://cellpainting-gallery/rosetta/ to s3://cellpainting-gallery/cpg0003-rosetta/:

aws s3 sync \
  --profile jump-cp-role \
  --acl bucket-owner-full-control \
  s3://cellpainting-gallery/rosetta/ \
  s3://cellpainting-gallery/cpg0003-rosetta/

aws s3 rm \
  --profile jump-cp-role \
  --recursive \
  s3://cellpainting-gallery/rosetta/
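
Because the `rm --recursive` step is destructive, one possible safeguard is to compare object counts between the old and new prefixes after the sync and before the removal. The sketch below is an illustration, not what was actually run here: `count_objects` and `same_object_count` are hypothetical helpers, and they rely on the "Total Objects:" line that `aws s3 ls --recursive --summarize` prints. Matching counts are only a coarse check and do not prove the contents are identical.

```shell
# Hypothetical helper: print the number of objects under an S3 prefix,
# parsed from the "Total Objects: N" summary line.
count_objects() {
  aws s3 ls \
    --profile jump-cp-role \
    --recursive \
    --summarize \
    "$1" | awk '/Total Objects:/ { print $3 }'
}

# Hypothetical helper: succeed only if both prefixes report the same,
# non-empty object count.
same_object_count() {
  src_count=$(count_objects "$1")
  dst_count=$(count_objects "$2")
  echo "source: ${src_count}, destination: ${dst_count}"
  [ -n "$src_count" ] && [ "$src_count" = "$dst_count" ]
}

# Example (not executed here): only remove the old prefix if counts match.
# same_object_count \
#   s3://cellpainting-gallery/rosetta/ \
#   s3://cellpainting-gallery/cpg0003-rosetta/ \
#   && echo "counts match; safe to consider removing the old prefix"
```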