Closed jenna-tomkinson closed 1 year ago
@gwaybio
This separation (also seen in cell-health) arises because the Cellpose center x,y coords are different for each object. Also, since DeepProfiler creates a box around these coords, the box size parameters differ between the objects. This is what leads us to create a separate DeepProfiler project for each object.
As for making them a single dataset, we could write a merging function (like the one being made in idrstream) that merges the features from each project into single cells based on the center x,y coordinates.
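To make the idea concrete, here is a rough sketch of what such a linking step could look like. It is not the idrstream implementation: the function name `link_by_nearest_center`, the row keys `x` and `y`, and the `max_dist` cutoff are all hypothetical, and it assumes every row comes from the same image. Since the nuc and cyto centers are not identical, it pairs each nucleus with the nearest unused cytoplasm center within a distance cutoff instead of doing an exact join.

```python
import math


def link_by_nearest_center(nuc_rows, cyto_rows, max_dist=20.0):
    """Pair each nucleus row with the closest unused cytoplasm row.

    Rows are dicts with hypothetical keys "x" and "y" (object center)
    plus any number of feature columns. All rows are assumed to come
    from the same image. max_dist is an illustrative cutoff in pixels.
    """
    merged = []
    used = set()  # indices of cytoplasm rows that are already linked
    for nuc in nuc_rows:
        best_idx, best_dist = None, max_dist
        for idx, cyto in enumerate(cyto_rows):
            if idx in used:
                continue
            dist = math.hypot(nuc["x"] - cyto["x"], nuc["y"] - cyto["y"])
            if dist <= best_dist:
                best_idx, best_dist = idx, dist
        if best_idx is None:
            continue  # no cytoplasm close enough; leave this nucleus unlinked
        used.add(best_idx)
        cyto = cyto_rows[best_idx]
        # Keep the nucleus center as the cell location and prefix features
        # so that columns from the two projects cannot collide.
        row = {"x": nuc["x"], "y": nuc["y"]}
        row.update({f"nuc_{k}": v for k, v in nuc.items() if k not in ("x", "y")})
        row.update({f"cyto_{k}": v for k, v in cyto.items() if k not in ("x", "y")})
        merged.append(row)
    return merged


nuc = [{"x": 10.0, "y": 10.0, "f1": 0.5}, {"x": 100.0, "y": 100.0, "f1": 0.7}]
cyto = [{"x": 12.0, "y": 9.0, "f1": 1.5}, {"x": 300.0, "y": 300.0, "f1": 2.5}]
merged = link_by_nearest_center(nuc, cyto)
```

This greedy matching is only a starting point. A real version would probably need to group by image or site first and handle cells where two nuclei compete for the same cytoplasm.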
What are your thoughts on this?
Right! Glad to know the merging function is going to be necessary in multiple projects - we're definitely going to want modularity!
Our univariate tests should be fine with this split, but our multivariate tests will be impacted.
Can you create an issue describing the need to link them, so that we don't forget in the future?
For now, we can perform the univariate tests separately and apply the multivariate tests on each individual data set (nuc and cyto) separately. We'll perform a combined multivariate test once we have a linking function.
@gwaybio
Sounds good! I will make that issue now! Are we okay to merge this PR through after that change to feature selection?
@gwaybio @d33bs
Here is the PR for normalizing and feature selection of DeepProfiler features.
Please feel free to review this and I will perform analysis after :smile: