In this project, we aim to use human U2OS cell images (GigaScience dataset) to predict the activities of a large number of compounds against different protein targets.
Investigations and key findings:
To learn more, please check out our Jupyter Notebooks below and the Python scripts in `./scripts`.
Notebook | Description |
---|---|
image_processing.ipynb | Visualize the raw images and their features |
meta_data.ipynb | Explore the metadata that comes with the image dataset, such as compound chemical annotations |
feature_visualization.ipynb | Visualize the single-cell images, CNN-extracted features, and clusterings of the extracted features |
normalization.ipynb | Experiment with batch-effect normalization methods such as ComBat and z-score normalization |
explore_excape_db.ipynb | Align U2OS image data with ExCAPE-DB assay data using chemical annotations |
positive_control.ipynb | Find compounds from the CCLE database that have been tested on the U2OS cell line |
assay_selection.ipynb | Aggregate cell-level CellProfiler features to assay-level |
assay_prediction.ipynb | Predict assay activity using U2OS images with random forest and logistic regression models |
simple_cnn.ipynb | Predict assay activity using U2OS images by training a LeNet CNN model |
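
As a rough illustration of the z-score normalization listed for normalization.ipynb, the sketch below standardizes each feature within each batch so that every batch ends up with zero mean and unit variance per feature. It is not the notebook's code: the function name `zscore_per_batch`, the plate labels, and the toy values are all hypothetical, and it assumes that a "batch" corresponds to something like a plate.

```python
import numpy as np


def zscore_per_batch(features, batch_ids, eps=1e-8):
    """Standardize each feature column within each batch.

    Hypothetical helper for illustration only; `eps` guards against
    division by zero when a feature is constant within a batch.
    """
    features = np.asarray(features, dtype=float)
    batch_ids = np.asarray(batch_ids)
    normalized = np.empty_like(features)
    for batch in np.unique(batch_ids):
        mask = batch_ids == batch
        mean = features[mask].mean(axis=0)
        std = features[mask].std(axis=0)  # population standard deviation
        normalized[mask] = (features[mask] - mean) / (std + eps)
    return normalized


# Toy data (made up): 4 cells x 2 features from two hypothetical plates.
features = np.array([[1.0, 10.0], [3.0, 30.0], [100.0, 5.0], [300.0, 15.0]])
batch_ids = np.array(["plate_A", "plate_A", "plate_B", "plate_B"])
normalized = zscore_per_batch(features, batch_ids)
```

After this step the two plates are on the same scale even though their raw intensities differ by orders of magnitude. ComBat additionally models batch effects with an empirical Bayes approach and is not shown here.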