genome-in-a-bottle / giab_data_indexes

This repository contains data indexes from NIST's Genome in a Bottle project.

Locations of Oxford Nanopore fast5 data? #8

Open oneillkza opened 4 years ago

oneillkza commented 4 years ago

I see you now have PromethION data for several of the individuals, available in fastq format. However, I don't see a link to the raw fast5. Is this available somewhere?

nate-d-olson commented 4 years ago

The raw fast5s are hosted on the Human Pangenome Project's S3 bucket. Let us know if you have any questions or issues accessing the files.

HG002: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=HG002/nanopore/
HG003: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=HG003/nanopore/
HG004: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=HG004/nanopore/
HG005: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=HG005/nanopore/
HG006: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=HG006/nanopore/
HG007: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=HG007/nanopore/

vahidAK commented 3 years ago

Hi @nate-d-olson and @jzook,

It seems that these links are not working anymore?

Do you have any idea where those files can be accessed?

Thanks, Vahid

nate-d-olson commented 3 years ago

Looks like they restructured the S3 bucket. Here are the updated links.

HG002: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=NHGRI_UCSC_panel/HG002/nanopore/
HG003: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=NHGRI_UCSC_panel/HG003/nanopore/
HG004: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=NHGRI_UCSC_panel/HG004/nanopore/
HG005: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=NHGRI_UCSC_panel/HG005/nanopore/
HG006: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=NHGRI_UCSC_panel/HG006/nanopore/
HG007: https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=NHGRI_UCSC_panel/HG007/nanopore/
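For anyone scripting against these locations, the links above all follow one pattern, so they can be generated rather than copied by hand. The following is only a sketch: the function name `nanopore_index_url` and the `panel_prefix` parameter are made up for illustration, and it simply rebuilds the URLs posted in this thread without checking that they still resolve.

```python
# Base URL of the bucket's browsable index page, as posted in this thread.
BASE = "https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html"

# Samples listed in this thread.
SAMPLES = ["HG002", "HG003", "HG004", "HG005", "HG006", "HG007"]


def nanopore_index_url(sample, panel_prefix="NHGRI_UCSC_panel"):
    """Build the index URL for one sample's nanopore directory.

    `panel_prefix` is a hypothetical parameter name. The default matches the
    restructured bucket layout; passing "" reproduces the older layout,
    which was reported in this thread as no longer working.
    """
    parts = [p for p in (panel_prefix, sample, "nanopore") if p]
    return f"{BASE}?prefix={'/'.join(parts)}/"


# Mapping of sample name to its index URL under the restructured layout.
URLS = {sample: nanopore_index_url(sample) for sample in SAMPLES}
```

If the bucket is restructured again, only the default `panel_prefix` would need to change.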

vahidAK commented 3 years ago

Thanks a lot

snajder-r commented 2 years ago

If I unpack GM24385_1.fast5.tar.gz, it contains the directory GM24185_1. I assume that's just a minor error in the naming of the folder, and the data is actually for GM24385?

nate-d-olson commented 2 years ago

@snajder-r Yes, I believe it is a typo, especially considering that I am unaware of a GM24185 cell line. I am confirming that this is a typo with our collaborators at UCSC.

nate-d-olson commented 2 years ago

@snajder-r Our collaborators at UCSC confirmed that it is a typo. Thanks for using our data and letting us know about this typo. :)
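In case it helps others spot similar naming mismatches, here is a minimal sketch that compares the top-level directory names inside a tarball with the name implied by the archive file. The helper names (`expected_dir_name`, `top_level_dirs`, `mismatched_dirs`) are hypothetical, and the assumption that the archive name minus `.fast5.tar.gz` should equal the directory name is inferred from the example in this thread, not a documented convention.

```python
import tarfile
from pathlib import PurePath, PurePosixPath


def expected_dir_name(archive_path):
    """Directory name implied by the archive file name (assumed convention)."""
    name = PurePath(archive_path).name
    for suffix in (".fast5.tar.gz", ".tar.gz"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def top_level_dirs(archive_path):
    """Set of first path components of every member in the archive."""
    names = set()
    # "r:*" lets tarfile detect the compression transparently.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            parts = PurePosixPath(member.name).parts
            if parts:  # skip entries such as "." that have no components
                names.add(parts[0])
    return names


def mismatched_dirs(archive_path):
    """Top-level names that differ from the name implied by the archive."""
    expected = expected_dir_name(archive_path)
    return sorted(top_level_dirs(archive_path) - {expected})
```

Note that this walks the whole archive, which can take a while for large compressed fast5 tarballs. For the file discussed above, `mismatched_dirs("GM24385_1.fast5.tar.gz")` would be expected to report `GM24185_1`, based on the description in this thread.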

gambalab commented 1 year ago

Hi, which nanopore chemistry was used for the ultra-long sequencing with the PromethION? R9.4.1 or R10.4.1?

nate-d-olson commented 1 year ago

All the UCSC ultra-long PromethION data currently on the FTP site were generated using R9.4.1.