Code to generate UMAP coordinates and figure in partial response to reviewer comment:
Reviewer 2 - Major Comment 1:
A fascinating bit of the manuscript is the description of the feature selection from the screen is done systematically, considering the technical and biological variability and technical artifacts and modeling covariates using linear models seems a very appropriate way of doing so and could serve as another proof of concept that this is indeed the most robust way of modeling and removing signal of technical covariates from the data. Yet, I wondered why the authors do not discuss other means of feature selection or dimensionality reduction; further, they need to show how the features cluster the cell lines or why impact (information content) different features deliver. For an audience interested in the technical aspects of cell painting analysis and machine learning based on the data, that would, IMHO, be the most exciting questions.