Closed axiomcura closed 1 month ago
@axiomcura - integration tests failed, please fix. Looks like the tests expect a specific error message which also needs updating
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 94.72%. Comparing base (
c6ad2a0
) to head (5f4ac6e
).
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@gwaybio All tests have passed! Please review when you have a moment to confirm if it's safe to merge.
Thanks Erik! I see that the tests are passing now.
Re-reading the reviewer comment, it seemed that they have issues with the metadata parameter.
I'm thinking that in addition to clarifying the features
exception, we should also do something to the metadata
parameter here: https://github.com/cytomining/pycytominer/blob/main/pycytominer/cyto_utils/features.py#L113C1-L116C19
Perhaps we can do two more things in this PR to update error message in the metadata inference.
metadata_features > 0
metadata="infer"
actually expects/doesI'll also tag @d33bs - Dave, after Erik addresses these two points, perhaps you can take a final look to see if both Erik and I overlooked something in making this change. Thanks!
@gwaybio - This is certainly doable, but I have a few clarifying questions. Could you please explain why the exception condition needs to be metadata_features > 0
?
From my understanding, Pycytominer’s main functions default to metadata_features="infer"
, which always expects metadata features from CellProfiler. As a result, adding an exception for metadata_features > 0
in infer_cp_features()
seems to introduce a contradictory logic. My guess would have been metadata_features == 0
.
I might be misunderstanding something where the metadata_features > 0
exception should be implemented. I just want to make sure if I am thinking of this correctly before I implement! Thanks!
Great catch Erik! We should not check metadata features in the way I suggested above. We ask users to provide metadata features within each function (e.g., normalize()
) call.
Perhaps the only change should be whereever we see meta_features="infer"
in the docstring (e.g., here: https://github.com/cytomining/pycytominer/blob/main/pycytominer/normalize.py#L41)
We add some specification to clarify that the cases in which meta_features="infer"
works as expected is not only when metadata features are prefixed by Metadata_
but that this occurs when using CellProfiler-derived input.
I think this is a relatively minor and straightforward fix.
meta_features="infer"
is used.@gwaybio - I have updated the docstrings.
Ahh shucks... there's going to be a lot of merge conflicts between this branch and #396. I was going to finish that one off this weekend. Not sure which order we want to merge / resolve merge conflicts in.
Bummer! My sense is:
We'll likely do 1 and 2 before the weekend FWIW
Sounds good! Let's do that!
@d33bs - ready for you to take a final look, thanks!
@d33bs @gwaybio I’ve finished applying the requested changes. I’m currently waiting for the tests to pass, and once they do, it’s ready for merging. ( I am unable to merge)
merged!
Description
Thank you for your contribution to pycytominer! Please succinctly summarize your proposed change. What motivated you to make this change?
Summary
This PR addresses one of the reviewers’ comments:
" When I used the function pycytominer.normalize() on a dataset originating from my own work, I encountered an error message saying
“No CP features found. Are you sure this dataframe is from CellProfiler?”.
I could easily fix the problem because I know the properties of a CellProfiler dataset, and the error message was hinting at some “Metadata_” missing.
The fix was very easy, I just needed to add the prefix “Metadata_” to my metadata columns. My dataset comes from another commonly used proprietary software called Harmony. I understand that the authors are in favor of using open-source resources, but I think that the pycytominer tool could be useful in a much larger context, and take as input different types of datasets. In the readme file, that I followed to perform the normalization, I couldn’t find anywhere that the dataset has to come from CellProfiler. ..."
I replaced the ‘assert’ statement with an exception, as it’s considered bad practice to use assertions during runtime. Assertions can be disabled with the
-o
parameter, which means they aren’t reliable for handling errors in production code. Here are some useful links:Additionally, I updated the error message to clarify that pycytominer infers CellProfiler features by default and provided guidance on how users can manually specify their feature space when working with non-CellProfiler data.
Please also link to any relevant issues that your code is associated with.
This code is also an addition to #430
What is the nature of your change?
Checklist
Please ensure that all boxes are checked before indicating that a pull request is ready for review.