opendp / opendp

The core library of differential privacy algorithms powering the OpenDP Project.
https://opendp.org
MIT License

Check error message if int type mismatch between polars and domain #1504

Open mccalluc opened 4 weeks ago

mccalluc commented 4 weeks ago

Michael writes:

An annoying aspect is that Polars defaults to i64, but OpenDP's default inferred type for int is i32. So if you don't specify the types exactly, the domains will each expect i32 while the data (a few lines below) is actually i64.

We should have a test that emulates this mistake, and makes sure users can get back on the right path.
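A rough sketch of what such a test might check, in plain Python so it runs without either library. Everything here is hypothetical: `check_column_dtypes`, the dtype strings, and the hint wording are illustrations, not OpenDP or Polars APIs. The idea is only that the error should name the column, show both dtypes, and point to a fix.

```python
# Hypothetical sketch: none of these names come from OpenDP or Polars.
DEFAULT_INT_DTYPE = "i32"  # default inferred for int, per the discussion above
POLARS_INT_DTYPE = "i64"   # what Polars infers for Python ints


def check_column_dtypes(expected, actual):
    """Raise TypeError naming each column whose dtype differs from the domain's."""
    mismatches = [
        (name, expected[name], actual[name])
        for name in expected
        if name in actual and expected[name] != actual[name]
    ]
    if not mismatches:
        return None
    lines = [
        f"column {name!r}: domain expects {want}, data is {got}"
        for name, want, got in mismatches
    ]
    # The hint is the part that gets users "back on the right path".
    hint = "Hint: declare the domain's dtype explicitly, or cast the column to match."
    raise TypeError("\n".join(lines + [hint]))


def test_int_mismatch_message():
    """Emulate the mistake: domain left at the i32 default, data inferred as i64."""
    expected = {"age": DEFAULT_INT_DTYPE}
    actual = {"age": POLARS_INT_DTYPE}
    try:
        check_column_dtypes(expected, actual)
    except TypeError as err:
        message = str(err)
    else:
        raise AssertionError("expected a TypeError")
    assert "i32" in message and "i64" in message
    assert "Hint" in message
    return message
```

A real test would build the mismatch with an actual Polars frame and an OpenDP domain, then assert on the message OpenDP raises.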

Shoeboxam commented 3 weeks ago

We should reconsider having our default int dtype be i32. I think R generally considers this to be a mistake they are stuck with: https://www.r-bloggers.com/2015/06/r-in-a-64-bit-world/. NumPy 2.0 changed the default integer dtype on Windows to 64-bit: https://numpy.org/devdocs/release/2.0.0-notes.html. And of course Polars defaults to i64.

The motivation for the i32 default was to be consistent with Rust's default integer type, but I think data science generally prefers i64.