sibyl-dev / pyreal

An easier approach to using and understanding ML models
MIT License

Add a "no format" option to realapps #519

Closed zyteka closed 1 year ago

zyteka commented 1 year ago

Profiling has revealed that formatting outputs can be a major source of slowdown for RealApp objects, especially for LFC and SE. Formatting is only needed when users look at individual rows, so it is unnecessary when generating explanations for large datasets. This PR makes it possible to skip formatting for a produce call and instead return the simplest output format, without any row_ids. This is useful when identifying individual rows is not necessary and users just need large numbers of explanations.
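To illustrate the idea, here is a minimal sketch of a produce-style function with an optional formatting step. The function name `produce_contributions` and the `format_output` flag are hypothetical, chosen for illustration only; they are not the actual RealApp API or the parameter names used in this PR.

```python
from typing import Dict, List, Sequence, Union

Row = List[float]
Formatted = Dict[str, List[Dict[str, Union[str, float]]]]


def produce_contributions(
    contributions: Sequence[Row],
    feature_names: Sequence[str],
    row_ids: Sequence[str],
    format_output: bool = True,
) -> Union[Sequence[Row], Formatted]:
    """Hypothetical stand-in for a produce call (not the real RealApp API)."""
    if not format_output:
        # Fast path: return the raw per-row values, with no row_ids attached.
        return contributions
    # Slow path: build one labeled record per feature, keyed by row_id.
    return {
        row_id: [
            {"feature": name, "contribution": value}
            for name, value in zip(feature_names, row)
        ]
        for row_id, row in zip(row_ids, contributions)
    }


raw = [[0.1, -0.2], [0.3, 0.0]]
formatted = produce_contributions(raw, ["age", "income"], ["row_a", "row_b"])
unformatted = produce_contributions(
    raw, ["age", "income"], ["row_a", "row_b"], format_output=False
)
```

The unformatted path does no per-row work at all, which is where the savings would come from on large inputs; the trade-off is that callers can no longer look up an explanation by row ID.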

For this PR, I've added this functionality for LFC and GFI. For SE, the original explainer format is already fairly slow and complex, so adding the option there will require adjustments to other parts of the code.

As part of this PR, I'm also including the basic profiling code used to identify the issue. Future PRs will extend this profiling functionality.
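For readers unfamiliar with this kind of profiling, the sketch below shows one generic way to find a slow step using Python's standard `cProfile` and `pstats` modules. It is not the profiling code included in this PR; `profile_call` and `slow_format` are hypothetical names used only for illustration.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def slow_format(n):
    # Hypothetical stand-in for an expensive output-formatting step.
    return [f"row_{i}: {i * 0.5:.3f}" for i in range(n)]


rows, report = profile_call(slow_format, 10_000)
print(report)
```

Sorting by cumulative time makes it easier to spot a function such as a formatter that is cheap per call but dominates the total because it runs once per row.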