broadinstitute / cmQTL

High-dimensional phenotyping to define the genetic basis of cellular morphology
BSD 3-Clause "New" or "Revised" License

Recreate variable selected profiles #26

Closed shntnu closed 4 years ago

shntnu commented 4 years ago

Variable selection of profiles is currently done per batch. Instead, do variable selection across the whole dataset and create a single data frame for the whole experiment.
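A minimal sketch of what selecting variables across the whole dataset could look like, assuming the per-batch profiles are pandas data frames with `Metadata_`-prefixed annotation columns. The column names, the thresholds, and the variance-plus-correlation filter are illustrative assumptions, not the project's actual selection procedure.

```python
import numpy as np
import pandas as pd


def select_variables(profiles, feature_cols, var_threshold=1e-8, corr_threshold=0.9):
    """Return the feature columns kept after variance and correlation filtering."""
    features = profiles[feature_cols]
    # Drop features that are (nearly) constant across the whole dataset.
    variances = features.var(axis=0, ddof=0)
    kept = [c for c in feature_cols if variances[c] > var_threshold]
    # Greedily drop any feature highly correlated with one already selected.
    corr = features[kept].corr().abs()
    selected = []
    for col in kept:
        if all(corr.loc[col, other] <= corr_threshold for other in selected):
            selected.append(col)
    return selected


# Hypothetical per-batch profile tables; real ones would be read from disk.
rng = np.random.default_rng(0)


def make_batch(name, n=20):
    x = rng.normal(size=n)
    return pd.DataFrame({
        "Metadata_Batch": name,
        "Cells_Area": x,
        "Cells_Area_Copy": x * 2.0 + 1.0,  # perfectly correlated with Cells_Area
        "Cells_Constant": 1.0,             # zero variance
        "Nuclei_Intensity": rng.normal(size=n),
    })


batches = [make_batch("batch_1"), make_batch("batch_2")]

# Stack all batches first, then select once on the combined table.
all_profiles = pd.concat(batches, ignore_index=True)
feature_cols = [c for c in all_profiles.columns if not c.startswith("Metadata_")]
selected = select_variables(all_profiles, feature_cols)
selected_profiles = all_profiles[["Metadata_Batch"] + selected]
```

Selecting once on the concatenated table guarantees that every batch ends up with the same set of columns, which per-batch selection does not.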

shntnu commented 4 years ago

@jatinarora-upmc Above, "per batch" refers to groups (pairs, more specifically) of plates that were processed together; the grouping info is here. Look under each folder for a file named barcode_platemap.csv, which has the mapping per batch.
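As a rough sketch, the batch-to-plate grouping could be collected by walking the batch folders and reading each barcode_platemap.csv. The folder layout, the batch and plate names, and the column name `Assay_Plate_Barcode` are assumptions made for illustration; check them against the actual files.

```python
import csv
import tempfile
from pathlib import Path


def plates_by_batch(root):
    """Map each batch folder name to the plate barcodes listed in its platemap file."""
    mapping = {}
    for path in sorted(Path(root).glob("*/barcode_platemap.csv")):
        with path.open(newline="") as handle:
            rows = list(csv.DictReader(handle))
        # "Assay_Plate_Barcode" is an assumed column name.
        mapping[path.parent.name] = [row["Assay_Plate_Barcode"] for row in rows]
    return mapping


# Build a small hypothetical directory tree so the example runs on its own.
with tempfile.TemporaryDirectory() as root:
    example = {"batch_A": ["P1", "P2"], "batch_B": ["P3", "P4"]}
    for batch, plates in example.items():
        folder = Path(root) / batch
        folder.mkdir()
        with (folder / "barcode_platemap.csv").open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["Assay_Plate_Barcode", "Plate_Map_Name"])
            writer.writerows([[plate, "platemap_1"] for plate in plates])
    batch_to_plates = plates_by_batch(root)
```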

shntnu commented 4 years ago

@jatinarora-upmc said:

I am on the data at single cell level. It seems the features are not normalized

That's correct – the single-cell data in the SQLite file have not been normalized. We normalize at the aggregate level: https://github.com/broadinstitute/cmQTL/blob/beb198c9e111a4f1f27f08dd6154636dee8be5a0/1.profile-cell-lines/0.generate-profiles.sh#L126-L145
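To illustrate the order of operations (aggregate first, then normalize), here is a small pandas sketch. It is not the linked script: the column names, the per-well mean, and the per-plate z-score are assumptions chosen to keep the example short.

```python
import pandas as pd


def aggregate_then_normalize(cells, feature_cols):
    """Average single cells per well, then z-score each feature within each plate."""
    # Step 1: aggregate single-cell rows to one profile per well.
    wells = cells.groupby(
        ["Metadata_Plate", "Metadata_Well"], as_index=False
    )[feature_cols].mean()
    # Step 2: normalize the well-level profiles plate by plate.
    by_plate = wells.groupby("Metadata_Plate")[feature_cols]
    wells[feature_cols] = (
        wells[feature_cols] - by_plate.transform("mean")
    ) / by_plate.transform("std")
    return wells


# Hypothetical single-cell table: one plate, three wells, two cells per well.
cells = pd.DataFrame({
    "Metadata_Plate": ["plate_1"] * 6,
    "Metadata_Well": ["A01", "A01", "A02", "A02", "A03", "A03"],
    "Cells_Area": [0.5, 1.5, 1.5, 2.5, 2.5, 3.5],
})
normalized = aggregate_then_normalize(cells, ["Cells_Area"])
```

In this toy table the well means are 1, 2, and 3, so the normalized well profiles come out as -1, 0, and 1, while the single-cell rows themselves are left untouched.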

shntnu commented 4 years ago

Abandoned because all the analysis is being done at the single-cell level.