broadinstitute / lincs-cell-painting

Processed Cell Painting Data for the LINCS Drug Repurposing Project
BSD 3-Clause "New" or "Revised" License

Get per-plate evaluation metrics #52

Open gwaybio opened 4 years ago

gwaybio commented 4 years ago

Use the cytominer-eval library to compute the metrics. An example (from https://github.com/jump-cellpainting/develop-computational-pipeline/issues/4#issuecomment-693006903) is pasted below:

After installing with:

pip install git+https://github.com/cytomining/cytominer-eval@56bd9e545d4ce5dea8c2d3897024a4eb241d06db

This now works:

import pandas as pd
from cytominer_eval import evaluate
from pycytominer.cyto_utils import infer_cp_features

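# Load the normalized, feature-selected profiles for a single plate
file = "https://github.com/broadinstitute/lincs-cell-painting/raw/master/profiles/2016_04_01_a549_48hr_batch1/SQ00014813/SQ00014813_normalized_feature_select_dmso.csv.gz"
df = pd.read_csv(file)

# Split the columns into CellProfiler feature columns and Metadata_ columns
features = infer_cp_features(df)
meta_features = infer_cp_features(df, metadata=True)

# Profiles that share the same values in these columns are treated as replicates
replicate_groups = ["Metadata_broad_sample", "Metadata_mg_per_ml"]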

evaluate(
    profiles=df,
    features=features,
    meta_features=meta_features,
    replicate_groups=replicate_groups,
    operation="percent_strong",
    percent_strong_quantile=0.95
)

# Output: 0.32598039215686275

operation="grit" and operation="precision_recall" are also implemented.

(see https://github.com/cytomining/cytominer-eval/blob/master/cytominer_eval/evaluate.py for details)
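
Since the goal of this issue is per-plate metrics, the single-plate snippet above could be wrapped in a loop. The following is only a sketch, not part of cytominer-eval or this repository: `build_profile_url`, `per_plate_metrics`, and `percent_strong_for_plate` are hypothetical helper names, and the URL pattern is an assumption generalized from the one example file above. The `evaluate` call reuses exactly the arguments shown earlier, so it assumes the same pinned cytominer-eval commit.

```python
from typing import Callable, Dict, Iterable

# Assumption: every plate follows the same path layout as the example file above.
BASE_URL = "https://github.com/broadinstitute/lincs-cell-painting/raw/master/profiles"


def build_profile_url(batch: str, plate: str,
                      suffix: str = "normalized_feature_select_dmso") -> str:
    """Hypothetical helper: build the profile URL for one plate."""
    return f"{BASE_URL}/{batch}/{plate}/{plate}_{suffix}.csv.gz"


def percent_strong_for_plate(url: str) -> float:
    """Compute percent_strong for one plate, mirroring the snippet above."""
    # Imported lazily so the helpers can be defined without these packages.
    import pandas as pd
    from cytominer_eval import evaluate
    from pycytominer.cyto_utils import infer_cp_features

    df = pd.read_csv(url)
    return evaluate(
        profiles=df,
        features=infer_cp_features(df),
        meta_features=infer_cp_features(df, metadata=True),
        replicate_groups=["Metadata_broad_sample", "Metadata_mg_per_ml"],
        operation="percent_strong",
        percent_strong_quantile=0.95,
    )


def per_plate_metrics(batch: str, plates: Iterable[str],
                      metric_fn: Callable[[str], float]) -> Dict[str, float]:
    """Hypothetical helper: apply metric_fn to each plate's profile URL."""
    return {plate: metric_fn(build_profile_url(batch, plate)) for plate in plates}


# Example call (requires network access and the pinned cytominer-eval install):
# per_plate_metrics("2016_04_01_a549_48hr_batch1", ["SQ00014813"],
#                   percent_strong_for_plate)
```

Passing the metric as a function keeps the loop independent of the operation, so a `grit` or `precision_recall` variant could be swapped in without touching the plate iteration.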