cytomining / pycytominer

Python package for processing image-based profiling data
https://pycytominer.readthedocs.io
BSD 3-Clause "New" or "Revised" License
78 stars, 35 forks

FeatureRequest: Automatically generated dataframe schemas to catch errors #384

Open kenibrewer opened 7 months ago

kenibrewer commented 7 months ago

Feature type

General description of the proposed functionality

Story: As a pycytominer user, I would like to receive more descriptive error messages about problems with my data. Pycytominer could automatically generate a DataframeSchema that checks that the columns I named in function arguments exist, and that there are no NaN or inf values in operations where they would cause errors. An error message that identifies the specific column and row containing a problematic value would make it much easier to work with large distributed datasets.
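
To make the request concrete, here is a minimal sketch of the kind of check described above, written with plain pandas and NumPy rather than any particular schema library. The function name `find_schema_problems`, its parameters, and the column names in the example are all hypothetical; none of this is existing pycytominer API.

```python
import numpy as np
import pandas as pd


def find_schema_problems(df, required_columns, finite_columns):
    """Hypothetical helper: list problems with df, naming column and row.

    required_columns: columns the caller named in arguments; each must exist.
    finite_columns: columns that must not contain NaN or inf.
    """
    problems = []

    # Check that every column the caller referred to is present.
    expected = list(dict.fromkeys([*required_columns, *finite_columns]))
    for col in expected:
        if col not in df.columns:
            problems.append(f"missing column: {col}")

    # Check the numeric columns for NaN / inf and report each offending row.
    for col in finite_columns:
        if col not in df.columns:
            continue  # reported as missing by the loop above
        # Values that cannot be parsed as numbers are coerced to NaN,
        # so they are reported as non-finite as well.
        values = pd.to_numeric(df[col], errors="coerce").to_numpy(
            dtype=float, na_value=np.nan
        )
        for pos in np.flatnonzero(~np.isfinite(values)):
            problems.append(
                f"non-finite value ({values[pos]}) in column {col} "
                f"at row {df.index[pos]}"
            )

    return problems


# Illustrative data only; the column names are made up for this example.
example = pd.DataFrame(
    {
        "Metadata_Well": ["A01", "A02", "A03"],
        "Cells_AreaShape_Area": [1.0, np.nan, np.inf],
    }
)
example_problems = find_schema_problems(
    example,
    required_columns=["Metadata_Well", "Metadata_Plate"],
    finite_columns=["Cells_AreaShape_Area"],
)
```

A caller could raise a single `ValueError` that joins all of the messages when the returned list is non-empty, so every problem is reported at once instead of failing on the first one.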

Feature example

Coming

Alternative Solutions

No response

Additional information

No response