carpenter-singh-lab / 2022_Haghighi_NatureMethods

High-Dimensional Gene Expression and Morphology Profiles of Cells across 28,000 Genetic and Chemical Perturbations
BSD 3-Clause "New" or "Revised" License

Preprocess Datasets L1000 MAT file #17

Open swaraj84 opened 9 months ago

swaraj84 commented 9 months ago

@MarziehHaghighi

Could you please help with a portion of 0-preprocess-datasets.ipynb, specifically the following lines:

from scipy.io import loadmat
x = loadmat(cdrp_dataDir+'cdrp.all.prof.mat')
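For context, once a file like this is located, a first step would probably be to check which variables it contains before reproducing the notebook's processing. The sketch below is only an illustration: the helper names `summarize_mat` and `load_mat_variables`, the stand-in file `demo.prof.mat`, and its `profiles` variable are all hypothetical, and nothing here reflects the actual layout of `cdrp.all.prof.mat`. It also assumes the file is in a pre-v7.3 MAT format, since `scipy.io.loadmat` does not read v7.3 (HDF5-based) files.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat, whosmat


def summarize_mat(path):
    """Return {variable_name: shape} without loading the full arrays."""
    return {name: shape for name, shape, _dtype in whosmat(path)}


def load_mat_variables(path):
    """Load a MAT file and drop scipy's metadata keys (__header__, etc.)."""
    data = loadmat(path)
    return {key: value for key, value in data.items() if not key.startswith("__")}


# Stand-in file so the sketch runs on its own; this is NOT the real dataset.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.prof.mat")
savemat(demo_path, {"profiles": np.arange(6.0).reshape(2, 3)})

demo_summary = summarize_mat(demo_path)
demo_vars = load_mat_variables(demo_path)
print(demo_summary)
```

Pointing `summarize_mat` at the real file (for example `cdrp_dataDir+'cdrp.all.prof.mat'`) would list its variable names and shapes, which should make it easier to match them against what the notebook expects.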

The source MAT file is said to have been generated using https://github.com/broadinstitute/2014_wawer_pnas, which is currently not available.

Any pointers to the source file and the operations to be performed on it would give me a good head start.

Thank you! Swaraj