czbiohub-sf / tabula-muris

Code and annotations for the Tabula Muris single-cell transcriptomic dataset.
https://www.nature.com/articles/s41586-018-0590-4
BSD 3-Clause "New" or "Revised" License

Download data from aws s3 #245

Closed pocession closed 2 years ago

pocession commented 2 years ago

Dear people in Tabula muris,

Thank you for your contribution in generating this beautiful data set.

I am trying to download data with the following command:

$ aws s3 cp s3://czbiohub-tabula-muris/TM_droplet_mat.h5ad .

However, I encounter the following error:

warning: Skipping file s3://czbiohub-tabula-muris/TM_droplet_mat.h5ad. Object is of storage class GLACIER. Unable to perform download operations on GLACIER objects. You must restore the object to be able to perform the operation. See aws s3 download help for additional parameter options to ignore or force these transfers.

I then tried again, adding --storage-class STANDARD --force-glacier-transfer:

$ aws s3 cp s3://czbiohub-tabula-muris/TM_droplet_mat.h5ad . --storage-class STANDARD --force-glacier-transfer

But I still get this error:

download failed: s3://czbiohub-tabula-muris/TM_droplet_mat.h5ad to ./TM_droplet_mat.h5ad An error occurred (InvalidObjectState) when calling the GetObject operation: The operation is not valid for the object's storage class
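One way to see why the copy fails is to ask S3 for the object's metadata. The sketch below assumes the AWS CLI is installed and that the credentials in use may read the object's metadata; the helper names `s3_bucket`, `s3_key`, and `describe_object` are made up for illustration. As far as the CLI documentation goes, `--storage-class` sets the class of the destination object on upload, so it should not affect a download, and `--force-glacier-transfer` does not restore an archived object, which matches the second error above. Restoring needs a RestoreObject request from someone with permission on the bucket, typically its owner.

```shell
# Split an S3 URI such as s3://bucket/path/to/key into bucket and key.
# Assumes the URI contains a key; s3://bucket alone is not handled.
s3_bucket() { rest=${1#s3://}; printf '%s\n' "${rest%%/*}"; }
s3_key()    { rest=${1#s3://}; printf '%s\n' "${rest#*/}"; }

# Print the storage class and restore status of one object.
# Needs network access and permission to read the object's metadata.
describe_object() {
  aws s3api head-object \
    --bucket "$(s3_bucket "$1")" \
    --key "$(s3_key "$1")" \
    --query '{StorageClass: StorageClass, Restore: Restore}'
}

# Usage (not run here):
#   describe_object s3://czbiohub-tabula-muris/TM_droplet_mat.h5ad
```

A `StorageClass` of GLACIER with an empty `Restore` field would suggest the object is archived and has not been restored.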

Could you provide some command lines for downloading data from your S3 buckets?

Best,

Tsunghan Hsieh (post doc in Riken)

aopisco commented 2 years ago

Hi @pocession, the data is located here -- if you update your paths that should work!

pocession commented 2 years ago

Thanks. It works now.

warrenalphonso commented 2 years ago

Hey, sorry if this is a silly question but isn't the link you posted to tabula-muris-senis and not tabula-muris data? I wanted to check out the data for the original tabula muris paper, but I'm getting this and some other S3 errors while attempting to download it. For example,

$ aws s3 cp s3://czb-tabula-muris/TM_droplet_metadata.csv .
fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden

aws s3 ls works though. I'm not so familiar with S3, but it looks like some files have been archived in the Glacier storage class and others we just don't have access to.
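The two failures in this thread look like different problems: an archived object that cannot be read until it is restored, and an object the caller is not allowed to read. A small sketch that tells them apart, using only the error strings quoted in this thread; the function name `classify_s3_error` and its labels are hypothetical.

```shell
# Map an error message from `aws s3 cp` to a short label.
classify_s3_error() {
  case "$1" in
    *InvalidObjectState*|*"storage class GLACIER"*)
      # The object is archived; it must be restored before GetObject works.
      echo "archived" ;;
    *"(403)"*|*Forbidden*|*AccessDenied*)
      # The credentials in use are not allowed to read the object.
      echo "forbidden" ;;
    *)
      echo "unknown" ;;
  esac
}

# Usage: capture stderr from the copy and classify it (not run here):
#   msg=$(aws s3 cp s3://czb-tabula-muris/TM_droplet_metadata.csv . 2>&1) \
#     || classify_s3_error "$msg"
```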

pocession commented 2 years ago

Hi, you are having the same issue I had before. To get the files with the AWS CLI, you have to sign up for an AWS account and get access keys. Follow these steps:

  1. Sign up for an AWS account (you need a credit card). You can follow the Amazon S3 manual.
  2. Get your access key ID and secret access key. You can follow AWS Account and Access Keys.
  3. Install the AWS command line interface (AWS CLI).
  4. In your terminal, type aws configure.
  5. Enter your access key ID and secret access key.
  6. Go to the Tabula muris S3 storage.
  7. Copy the S3 URL of the data you want.
  8. In your terminal, type aws s3 cp S3-URL your-destination-path
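The last steps above can be sketched as a small wrapper. This is only an illustration: `s3_download` and the `S3_ANON` switch are hypothetical names, and the `--no-sign-request` branch (which skips credentials entirely) only works if the bucket allows public reads, which this thread does not confirm.

```shell
# Copy one object from S3 to a local path (default: current directory).
# Set S3_ANON=1 to try an unsigned request instead of configured credentials;
# whether a given bucket accepts unsigned requests is an assumption.
s3_download() {
  if [ "${S3_ANON:-0}" = "1" ]; then
    aws s3 cp --no-sign-request "$1" "${2:-.}"
  else
    aws s3 cp "$1" "${2:-.}"
  fi
}

# Typical session, shown as comments because it needs real credentials:
#   aws configure                       # prompts for access key ID and secret
#   s3_download "$S3_URL" ./data/       # S3_URL is the URL copied in step 7
```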

It seems more than three people (plus you and me) were confused about the download URL and how to use Amazon S3. You may want to open a new issue so that the Tabula muris people notice this.

Best

warrenalphonso commented 2 years ago

Thanks for the reply @pocession! You linked to tabula-muris-senis, which works for me. But tabula-muris doesn't work. I had configured the CLI already, like you mentioned. And the link I posted above is the S3 URI. For example, if you click on TM_droplet_metadata.csv here and then copy S3 URI, you'd get the same URI I posted above:

$ aws s3 cp s3://czb-tabula-muris/TM_droplet_metadata.csv .
fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden

Could you please try this specific file and let me know if it works for you?

warrenalphonso commented 2 years ago

Maybe @aopisco, can you help?

pocession commented 2 years ago

@warrenalphonso No, I also could not download from czb-tabula-muris. But I guess the czb-tabula-muris-senis is the official repository?

warrenalphonso commented 2 years ago

Thanks for trying! I thought tabula-muris and tabula-muris-senis were two different datasets, since they're released in two separate papers around two years apart.

pocession commented 2 years ago

Ah I see. Maybe tabula-muris-senis is for single cell RNAseq and tabula-muris is for bulk RNAseq.

aopisco commented 2 years ago

Hi all, Tabula Muris Senis contains Tabula Muris. Both datasets are single cell. Tabula Muris is the 3 month timepoint. Tabula Muris Senis covers 1 month, 3 months (Tabula Muris), 18 months, 21 months, 24 months, and 30 months. I sent the instructions for Tabula Muris Senis because that dataset is more accessible and it contains Tabula Muris. I hope this helps.
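Since Tabula Muris corresponds to the 3 month timepoint, it could in principle be recovered from a Tabula Muris Senis metadata table by filtering on age. A rough sketch, assuming a plain CSV with no quoted commas; the function name `filter_by_age`, the column name `age`, and the value `3m` are placeholders that have not been verified against the real files.

```shell
# Keep the header plus the rows whose age column equals the wanted value.
# Arguments: CSV file, column name (default "age"), wanted value (default "3m").
# Both defaults are hypothetical; check them against the actual metadata.
filter_by_age() {
  awk -F',' -v col="${2:-age}" -v want="${3:-3m}" '
    NR == 1 {
      # Find the index of the requested column in the header row.
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
      print
      next
    }
    $idx == want
  ' "$1"
}

# Usage (file name is a placeholder):
#   filter_by_age metadata.csv age 3m > metadata_3m.csv
```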

pocession commented 2 years ago

Perfect, Thanks for the clarification!

warrenalphonso commented 2 years ago

Thanks for clarifying!