Open tfindley15 opened 1 year ago
Hi Reese,
cpg0004 and cpg0016 use different metadata formats.
For cpg0004, the metadata files are in the lincs-cell-painting repo. If you have questions about the metadata, please feel free to create an issue in that repo.
For cpg0016, the metadata files are in this repo. Instead of the typical plate map and metadata files, we now have three types of files that contain all the metadata information:
plate.csv.gz: contains the plate name to type of perturbation mapping.
well.csv.gz: contains the plate name, well name to perturbation ID mapping.
compound.csv.gz, orf.csv.gz and crispr.csv.gz: contain the perturbation ID to other external metadata mapping.
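As a rough sketch of how these three file types could be combined into one per-well annotation table: the snippet below reads gzip-compressed CSVs with the Python standard library and joins them on plate name and perturbation ID. The column names (Metadata_Plate, Metadata_PlateType, Metadata_Well, Metadata_JCP2022, Metadata_InChIKey) and all row values are assumptions used for illustration, as are the helper names read_csv_gz and annotate_wells; please check the headers of the actual files before relying on them.

```python
import csv
import gzip
import io


def read_csv_gz(source):
    """Read a gzip-compressed CSV (path or binary file object) into a list of dict rows."""
    with gzip.open(source, mode="rt", newline="") as handle:
        return list(csv.DictReader(handle))


def annotate_wells(plates, wells, perturbations,
                   plate_key="Metadata_Plate", pert_key="Metadata_JCP2022"):
    """Attach plate-level and perturbation-level metadata to each well row.

    The key column names are assumptions; pass the real ones if they differ.
    Wells without a matching plate or perturbation row are kept unchanged.
    """
    plate_by_name = {row[plate_key]: row for row in plates}
    pert_by_id = {row[pert_key]: row for row in perturbations}
    annotated = []
    for well in wells:
        merged = dict(well)
        merged.update(plate_by_name.get(well[plate_key], {}))
        merged.update(pert_by_id.get(well[pert_key], {}))
        annotated.append(merged)
    return annotated


def _gz(text):
    """Build an in-memory gzip file so the example runs without any downloads."""
    return io.BytesIO(gzip.compress(text.encode("utf-8")))


# Placeholder data standing in for plate.csv.gz, well.csv.gz and compound.csv.gz.
plates = read_csv_gz(_gz("Metadata_Plate,Metadata_PlateType\nP1,COMPOUND\n"))
wells = read_csv_gz(_gz(
    "Metadata_Plate,Metadata_Well,Metadata_JCP2022\n"
    "P1,A01,ID_1\n"
    "P1,A02,ID_2\n"
))
compounds = read_csv_gz(_gz("Metadata_JCP2022,Metadata_InChIKey\nID_1,KEY_1\n"))

annotated = annotate_wells(plates, wells, compounds)
for row in annotated:
    print(row)
```

With the real files, read_csv_gz can be given a local path such as "well.csv.gz" instead of the in-memory placeholder. The same join should apply to orf.csv.gz or crispr.csv.gz in place of compound.csv.gz, assuming they share the perturbation ID column.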
Please let us know if you have other questions!
@tfindley15
I'm moving your Q from https://github.com/broadinstitute/cellpainting-gallery/issues/42 to here
My name is Reese and I am a data scientist at ViQi Inc. working to do AI analysis on the jump datasets. I am currently looking at the jump pilot data and I had some questions regarding the JUMP-Target-1_compound_metadata_targets.tsv. What specifically do the genes in the target and target_list columns in this file represent? What source did it come from (pubchem, etc.)? Thank you so much for your time and continued efforts!
All these annotations come from https://s3.amazonaws.com/data.clue.io/repurposing/downloads/repurposing_drugs_20200324.txt
Read more about that resource here: https://clue.io/repurposing#about
You might find this preprint useful to read: https://www.biorxiv.org/content/10.1101/2022.01.05.475090v1
Hi!
Our team at ViQi Inc. has been doing ML analysis on the cpg0012 dataset with a lot of success. We would now like to move on to the JUMP dataset cpg0016 and possibly cpg0004. However, I cannot find the metadata folders inside 'workspace' for either of these datasets, for any source. Is the compound and dose information for these datasets stored elsewhere?
Thank you for your time and efforts!
Cheers, Reese