WayScience / Benchmarking_NF1_data

Benchmarking data processing strategies for Cell Painting data of NF1 Schwann cells. See analysis repository (https://github.com/WayScience/NF1_SchwannCell_data_analysis) for information on how the data was interpreted.
Creative Commons Zero v1.0 Universal

deepprofiler project processing features #23

Closed jenna-tomkinson closed 1 year ago

jenna-tomkinson commented 1 year ago

@gwaybio @d33bs

Here is the PR for normalization and feature selection of DeepProfiler features.

Please feel free to review this and I will perform analysis after :smile:

jenna-tomkinson commented 1 year ago

@gwaybio

This separation (also seen in cell-health) occurs because the Cellpose center x,y coords are different for each object. Also, since DeepProfiler creates a box around these coords, the box size parameters differ between the objects. This is why we create a separate DeepProfiler project for each object.

As for making them a single dataset, we could write a merging function (like the one being made in idrstream) that merges the features from each project into single cells based on the center x,y coordinates.
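
A rough sketch of what such a merging function might look like, just to make the idea concrete. This is not the idrstream implementation. It assumes each project's output can be read as a list of rows with hypothetical keys `site`, `x`, and `y` (plus feature columns), and that a nucleus is linked to the nearest unused cytoplasm center in the same site within a hypothetical `max_dist` cutoff:

```python
import math


def link_by_nearest_center(nuc_rows, cyto_rows, max_dist=50.0):
    """Pair each nucleus row with the closest cytoplasm row from the same site.

    Rows are dicts with the hypothetical keys "site", "x", "y" plus features.
    Matching is greedy: each cytoplasm row is used at most once, in the order
    the nuclei are visited. max_dist (in pixels) is an assumed cutoff.
    """
    merged = []
    used = set()
    for nuc in nuc_rows:
        best_idx, best_dist = None, max_dist
        for idx, cyto in enumerate(cyto_rows):
            if idx in used or cyto["site"] != nuc["site"]:
                continue
            dist = math.dist((nuc["x"], nuc["y"]), (cyto["x"], cyto["y"]))
            if dist <= best_dist:
                best_idx, best_dist = idx, dist
        if best_idx is None:
            continue  # no cytoplasm close enough; this nucleus stays unlinked
        used.add(best_idx)
        # Prefix columns so features from the two projects do not collide.
        row = {"site": nuc["site"]}
        row.update({f"nuc_{k}": v for k, v in nuc.items() if k != "site"})
        row.update(
            {f"cyto_{k}": v for k, v in cyto_rows[best_idx].items() if k != "site"}
        )
        merged.append(row)
    return merged


# Toy rows with a made-up feature name, only to show the call.
nuc = [{"site": "A1", "x": 10.0, "y": 10.0, "feat_0": 0.5}]
cyto = [
    {"site": "A1", "x": 12.0, "y": 11.0, "feat_0": 0.9},
    {"site": "A1", "x": 200.0, "y": 200.0, "feat_0": 0.1},
]
linked = link_by_nearest_center(nuc, cyto)
```

Greedy nearest-center matching is only one option. If the centers turn out to be too far apart or ambiguous in crowded fields, we would probably need a stricter rule.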

What are your thoughts on this?

gwaybio commented 1 year ago

Right! Glad to know the merging function is going to be necessary in multiple projects - we're definitely going to want modularity!

Our univariate tests should be fine with this split, but our multivariate tests will be impacted.

Can you create an issue describing the need to link them, so that we don't forget in the future?

For now, we can perform the univariate tests separately and apply the multivariate tests on each individual data set (nuc and cyto) separately. We'll perform a combined multivariate test once we have a linking function.

jenna-tomkinson commented 1 year ago

@gwaybio

Sounds good! I will make that issue now! Are we okay to merge this PR through after that change to feature selection?