AllenInstitute / AllenSDK

code for reading and processing Allen Institute for Brain Science data
https://allensdk.readthedocs.io/en/latest/

Create GBM example notebook #1648

Closed wbwakeman closed 3 years ago

wbwakeman commented 4 years ago

Create a Jupyter notebook with data access and display examples for the Ivy Glioblastoma Atlas data set on AWS.

Repo: https://github.com/AllenInstitute/open_dataset_tools

JPEG image and JSON metadata files are publicly available in this bucket:
s3://allen-ivy-glioblastoma-atlas/
https://console.aws.amazon.com/s3/buckets/allen-ivy-glioblastoma-atlas/

INTRODUCTION

The Ivy Glioblastoma Atlas Project is a collection of data from glioblastoma brain tumors. The project is a collaboration between the Allen Institute for Brain Science and the Ben and Catherine Ivy Foundation. Glioblastoma is an aggressive brain cancer. Survival after diagnosis is just 12 to 15 months. An interactive atlas is available at https://glioblastoma.alleninstitute.org/ and consists of:

• Images of in situ hybridization experiments identifying where key genes are expressed in the tumor tissue (20x magnification)
• Matching histology images using Hematoxylin and Eosin (H&E) stain
• Gene expression masks for all ISH images
• Annotated mask images of tumor structures for all ISH and H&E images
• RNA sequencing data for 270 samples from 44 tumors
• Companion clinical database hosted at https://ivygap.swedish.org (registration required)
• Accompanying MRI/CT scan data for the patients is available at the Cancer Imaging Archive.

The project has been published in the May 11, 2018 edition of Science.

DATA SECTION

The image data for the project are being made available as an AWS public dataset to give computational scientists easy access to a rich, well-annotated data set for training and validating machine learning and classification models, and for developing computer vision applications such as image segmentation. The data are publicly available from the s3://allen-ivy-glioblastoma-atlas/ bucket. There is a directory for each donor patient. Each donor directory contains a directory for each tissue sample from that donor. Each directory also contains a JSON file with relevant metadata.
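As a starting point for the notebook, the directory layout described above could be explored roughly as follows. This is a minimal sketch, not a tested recipe: it assumes boto3 is installed, that the bucket allows anonymous (unsigned) listing, and that keys follow a `<donor>/<specimen>/<file>` pattern. The helper names `list_keys` and `group_by_donor` are made up for illustration and have not been checked against the live bucket.

```python
from collections import defaultdict

BUCKET = "allen-ivy-glioblastoma-atlas"  # bucket name taken from the issue text


def list_keys(bucket=BUCKET, prefix=""):
    """Yield every object key under `prefix` using anonymous (unsigned) requests."""
    # boto3 is imported lazily so group_by_donor() below works without it.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            yield obj["Key"]


def group_by_donor(keys):
    """Group keys into {donor: {specimen: [file names]}}.

    Assumes a <donor>/<specimen>/<file> layout (an assumption, not confirmed).
    Files that sit directly inside a donor directory are stored under "".
    """
    tree = defaultdict(lambda: defaultdict(list))
    for key in keys:
        parts = key.split("/")
        if len(parts) < 2 or not parts[-1]:
            continue  # skip top-level objects and "directory" placeholder keys
        donor = parts[0]
        specimen = parts[1] if len(parts) > 2 else ""
        tree[donor][specimen].append(parts[-1])
    return {donor: dict(specimens) for donor, specimens in tree.items()}
```

In a notebook, `group_by_donor(list_keys())` would then give a nested dictionary to browse. Keeping the grouping separate from the network call means the layout assumption can be checked on a handful of keys before listing the whole bucket.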

DONOR SECTION -blurb-

SPECIMEN SECTION -blurb-

SECTION DATA SETS -blurb-

IMAGES -blurb-

wbwakeman commented 3 years ago

After feedback from our AWS contact, we need to make it clear that the RNA-Seq data are NOT available from the AWS bucket at this time. My proposal for a revised Introduction section:

INTRODUCTION

The Ivy Glioblastoma Atlas Project is a collection of data from glioblastoma brain tumors. The project is a collaboration between the Allen Institute for Brain Science and the Ben and Catherine Ivy Foundation. Glioblastoma is an aggressive brain cancer. Survival after diagnosis is just 12 to 15 months. An interactive atlas is available at https://glioblastoma.alleninstitute.org/ and consists of:

• Images of in situ hybridization experiments identifying where key genes are expressed in the tumor tissue (20x magnification)
• Matching histology images using Hematoxylin and Eosin (H&E) stain
• Gene expression masks for all ISH images
• Annotated mask images of tumor structures for all ISH and H&E images

The project also comprises additional data modalities that are not currently available from the AWS bucket. These are:

• RNA sequencing data for 270 samples from 44 tumors
• Companion clinical database hosted at https://ivygap.org (registration required)
• Accompanying MRI/CT scan data for the patients is available at The Cancer Imaging Archive.

The project has been published in the May 11, 2018 edition of Science.