statrs-dev / statrs

Statistical computation library for Rust
https://docs.rs/statrs/latest/statrs/
MIT License

Consider improving error handling and `StatsError` type #221

Open FreezyLemon opened 2 months ago

FreezyLemon commented 2 months ago

According to its documentation, `enum StatsError` is supposed to be an "Enumeration of possible errors thrown within the statrs library."

This indicates that it should not try to be a generic error type for any kind of statistics calculations, but instead only concern itself with the errors produced in statrs.

With that basic assumption, there are currently some inconsistencies and outdated practices (API guidelines for reference):

I realize that most of these are breaking changes, but seeing that the crate is pre-1.0, I don't think there's a big problem doing this.

Other things that could be improved:

`StatsError` is big: 40 bytes on a Linux x64 target. This is because some variants contain two `&str` fields (2 × 16 = 32 bytes, plus discriminant and padding). Is it really necessary to have strings in the error type? The implementation could mostly be replaced 1:1 with some sort of `ArgName` enum, but there might be an even better solution that does not need this argument-name wrapping at all.
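The size argument can be checked with `std::mem::size_of`. The sketch below is not the actual `StatsError` definition: `StringyError`, `CompactError`, `ArgName` (including its variants), and `error_sizes` are hypothetical stand-ins that only mirror the shape described above.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for an error enum whose variants carry
/// argument names as strings (not the real `StatsError`).
#[allow(dead_code)]
#[derive(Debug)]
pub enum StringyError {
    BadParams,
    ArgMustBePositive(&'static str),
    ArgGtArg(&'static str, &'static str),
}

/// Hypothetical argument-name enum; the variants are illustrative only.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArgName {
    Mean,
    StdDev,
    Shape,
    Rate,
}

/// Same shape as `StringyError`, but arguments are named via `ArgName`.
#[allow(dead_code)]
#[derive(Debug)]
pub enum CompactError {
    BadParams,
    ArgMustBePositive(ArgName),
    ArgGtArg(ArgName, ArgName),
}

/// Returns `(size_of::<StringyError>(), size_of::<CompactError>())` in bytes.
pub fn error_sizes() -> (usize, usize) {
    (size_of::<StringyError>(), size_of::<CompactError>())
}
```

The string-carrying enum has to be at least as large as two `&str` values, while the `ArgName`-based one only needs a few bytes. The exact numbers depend on the target and on the compiler's layout decisions, so printing the result of `error_sizes()` on the target of interest is the reliable way to compare.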

All `new` functions seem to just return `StatsError::BadParams` whenever the params are outside of the defined/allowed range. Is there a good reason for these to be so vague compared to the more specific errors returned by other functions? The more specific errors already exist, so why not return more exact error information? There might even be value in providing multiple error types, so that errors are more specific to the exact API being used.
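As a rough illustration of what a per-API error type could look like, here is a minimal sketch. `NormalSketch` and `NormalParamError` are hypothetical names, not part of statrs, and the validation rules (rejecting NaN and a non-positive standard deviation) are assumptions made for the example.

```rust
use std::fmt;

/// Hypothetical error type specific to one constructor (not statrs API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NormalParamError {
    /// The mean was NaN.
    MeanInvalid,
    /// The standard deviation was NaN, zero, or negative.
    StdDevInvalid,
}

impl fmt::Display for NormalParamError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MeanInvalid => write!(f, "mean is NaN"),
            Self::StdDevInvalid => {
                write!(f, "standard deviation is NaN, zero, or negative")
            }
        }
    }
}

impl std::error::Error for NormalParamError {}

/// Hypothetical distribution type used only to demonstrate the constructor.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NormalSketch {
    mean: f64,
    std_dev: f64,
}

impl NormalSketch {
    /// Reports which parameter was rejected instead of a generic error.
    pub fn new(mean: f64, std_dev: f64) -> Result<Self, NormalParamError> {
        if mean.is_nan() {
            return Err(NormalParamError::MeanInvalid);
        }
        if std_dev.is_nan() || std_dev <= 0.0 {
            return Err(NormalParamError::StdDevInvalid);
        }
        Ok(Self { mean, std_dev })
    }
}
```

With this shape, a caller can match on the exact failure (for example `Err(NormalParamError::StdDevInvalid)`) without any strings stored in the error, at the cost of one small error enum per API.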

YeungOnion commented 2 months ago

I do see good reasons for all of these, and I'd be open to making changes for all of them except the infallible `new` no longer returning a `Result`.

All other public structs defined in the distribution module have a `new` method that returns `Result`, so it does provide consistency. Perhaps if `Empirical` were in a different module, or its `new` had a different name?

FreezyLemon commented 2 months ago

> All other public structs defined in the distribution module have a `new` method that returns `Result`, so it does provide consistency. Perhaps if `Empirical` were in a different module, or its `new` had a different name?

I see your point about consistency. I would personally value the expressiveness ("this call cannot fail") over the consistency ("all constructors return a Result and might need error handling"), but it doesn't matter much tbh.

Hmm, I'm not sure about renaming the `new` function: it's a widespread naming convention in the Rust ecosystem and what most users would expect. Maybe a similar name like `new_<something>`, so people can quickly find it in their IDEs.

YeungOnion commented 2 months ago

Perhaps an `impl Default` instead of having `Empirical::new`?
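To make the two options concrete, here is a minimal sketch of a constructor that cannot fail, expressed both as `Default` and as an infallible `new`. `EmpiricalSketch` and its methods are hypothetical and do not reflect the real `Empirical` type's fields or API.

```rust
/// Hypothetical stand-in for a type whose construction cannot fail.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct EmpiricalSketch {
    samples: Vec<f64>,
}

impl EmpiricalSketch {
    /// Infallible constructor: returns `Self` directly, with no `Result`
    /// to unwrap, and simply delegates to `Default`.
    pub fn new() -> Self {
        Self::default()
    }

    /// Records one observation.
    pub fn add(&mut self, value: f64) {
        self.samples.push(value);
    }

    /// Number of recorded observations.
    pub fn len(&self) -> usize {
        self.samples.len()
    }

    /// True when no observations have been recorded.
    pub fn is_empty(&self) -> bool {
        self.samples.is_empty()
    }
}
```

Either spelling tells the caller through the type signature that construction cannot fail. Keeping `new` alongside `Default` preserves the conventional name that users search for.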

Regardless, the overall discussion you bring up on the error type is valid. I'd merge an effort that does any of

YeungOnion commented 1 week ago

During these changes, we should also work on removing support for degenerate distributions (#102) and returning errors instead.