gigascience / paper-bray2017

Source code from Bray et al., (2017) A dataset of images and morphological profiles of 30,000 small-molecule treatments using the Cell Painting assay.

Metadata #11

Closed ethancohen123 closed 2 years ago

ethancohen123 commented 2 years ago

Hi, how can we access the per-plate metadata? It is supposed to be available under certain extensions (as indicated on the webpage), but I don't see how to download it. Can you please explain the process here? Thank you

only1chunts commented 2 years ago

All the data that we host is available from the dataset file store. You can access it directly at https://ftp.cngb.org/pub/gigadb/pub/10.5524/100001_101000/100351/ or via the dataset files tab at http://dx.doi.org/10.5524/100351 (each file name is a hyperlink that starts the download, and there is also a download button in the right-hand column).

There is a file called "chemical_annotations.csv" which has the description "Table containing metadata for many of the compounds from Broad Institute’s Chemical Biology Informatics Platform (CBIP), including (where applicable) compound names, simplified molecular-input line-entry system annotations (SMILES), MLSMR sample identifiers, and PubChem compound identifiers (CID) and substance identifiers (SID). The latter two items are useful for querying the PubChem Compound Database (http://www.ncbi.nlm.nih.gov/pccompound)." Maybe that is the metadata you are looking for?

Or maybe it's the files that are stored in the directory called "profiles.dir"? https://ftp.cngb.org/pub/gigadb/pub/10.5524/100001_101000/100351/profiles.dir/

Please note that we only host the data. We have no expertise in it and had nothing to do with its generation, so I don't understand which "metadata" you are looking for. If you require something that we do not have, you will need to contact the dataset submitter using the "contact submitter" button on the dataset page.

ethancohen123 commented 2 years ago

@only1chunts

[screenshot: metadat]

I'm referring to the metadata shown above, and I guess it should also be on GigaDB. It is just that I can't find a way to download it.

Thanks for your help, Ethan

pli888 commented 2 years ago

@ethancohen123 Try reaching out to the authors - they have been responsive in the past.

only1chunts commented 2 years ago

@ethancohen123 the files you are looking for are included in each of the plate tarballs, e.g. Plate_26569.tar.gz (available from https://ftp.cngb.org/pub/gigadb/pub/10.5524/100001_101000/100351/Plate_26569.tar.gz ) contains the subdirectories listed in your screenshot: [image]

I hope that is what you are looking for?
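For anyone who wants to check what a plate tarball contains before unpacking it, here is a minimal Python sketch using the standard-library tarfile module. The function name list_archive_prefixes and the depth parameter are illustrative, not part of the dataset or any GigaDB tooling, and the sketch assumes the archive has already been downloaded locally as a gzip-compressed tar file.

```python
import tarfile
from pathlib import PurePosixPath


def list_archive_prefixes(archive_path, depth=2):
    """Return the sorted, unique leading path prefixes found in a .tar.gz file.

    `depth` is the number of leading path components to keep, so depth=2
    turns "root/sub/file.csv" into "root/sub". (Hypothetical helper.)
    """
    prefixes = set()
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            # Drop "." and "/" components so "./root/sub" and "root/sub" match.
            parts = [p for p in PurePosixPath(member.name).parts
                     if p not in (".", "/")]
            if parts:
                prefixes.add("/".join(parts[:depth]))
    return sorted(prefixes)


if __name__ == "__main__":
    # Assumes Plate_26569.tar.gz was downloaded into the current directory.
    for prefix in list_archive_prefixes("Plate_26569.tar.gz"):
        print(prefix)
```

Increasing depth shows more of the folder hierarchy; nothing is written to disk, so this is a cheap way to confirm the archive holds the folders you expect.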

ethancohen123 commented 2 years ago

Yes, that is what I am looking for, thanks! So if I download plate number x, I should have some folders included, right?

only1chunts commented 2 years ago

Correct. Each plate "file", e.g. https://ftp.cngb.org/pub/gigadb/pub/10.5524/100001_101000/100351/Plate_26569.tar.gz, is in fact an archive of multiple files. Once it is downloaded, you will need to extract the tarball using a suitable tool. If you are familiar with the command line, then tar -zxf is recommended. If you are not using the command line, then you may need to search for the most appropriate method for your operating system (on Windows I use a tool called 7-Zip).
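As a cross-platform alternative to tar -zxf, the extraction can also be sketched in Python with the standard-library tarfile module. This is only an illustration: the function name extract_plate and the destination folder are assumptions, and the path check is a simple guard against archive members that would land outside the destination.

```python
import tarfile
from pathlib import Path


def extract_plate(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the extracted files.

    Hypothetical helper; refuses to extract if any member would be written
    outside dest_dir.
    """
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    # Report what was unpacked, as POSIX-style paths relative to dest_dir.
    return sorted(p.relative_to(dest).as_posix()
                  for p in dest.rglob("*") if p.is_file())


if __name__ == "__main__":
    # Assumes Plate_26569.tar.gz was downloaded into the current directory.
    for name in extract_plate("Plate_26569.tar.gz", "Plate_26569_extracted"):
        print(name)
```

The returned list makes it easy to confirm that the expected subdirectories are present after extraction.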

ethancohen123 commented 2 years ago

Everything works fine. I've uploaded one as an example and all is good, thank you!