ENH: Back pd.BooleanArray with nanoarrow

Feature Type

[X] Adding new functionality to pandas
[X] Changing existing functionality in pandas
[X] Removing existing functionality in pandas

Problem Description

The existing pd.arrays.BooleanArray serves a good purpose by allowing True/False values alongside missing values, but the current implementation is horribly inefficient. Compared to a plain NumPy boolean array, it uses twice as much memory, since it stores one byte per value plus one byte per mask entry. Compared to PyArrow, which bit-packs both the values and the validity mask, the memory usage is 8x as much, and computational algorithms can be up to 64x slower.
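As a rough illustration of where the 2x and 8x figures come from, here is a minimal sketch of the arithmetic. The function name and dictionary keys are hypothetical, and the calculation ignores any buffer padding or alignment that a real Arrow implementation may add.

```python
def masked_bool_nbytes(n: int) -> dict[str, int]:
    """Approximate buffer sizes in bytes for n boolean values (illustrative only)."""
    packed = (n + 7) // 8  # a bit-packed buffer rounds up to whole bytes
    return {
        # Plain NumPy bool array: one byte per value, no missing-value support.
        "numpy_bool": n,
        # Bytemask layout: one byte per value plus one byte per mask entry.
        "bytemask": 2 * n,
        # Bitmask layout: one bit per value plus one bit per validity entry.
        "bitpacked": 2 * packed,
    }


sizes = masked_bool_nbytes(1_000_000)
print(sizes["bytemask"] / sizes["numpy_bool"])  # 2.0
print(sizes["bytemask"] / sizes["bitpacked"])   # 8.0
```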

Feature Description

The pd.arrays.BooleanArray could use nanoarrow behind the scenes for its implementation, rather than the existing NumPy approach.

I think the main technical challenges for this would be:

Build system integration. nanoarrow is already available in the Meson WrapDB, and progress is underway with nanobind. It is probably worth waiting for the latter, but once that is complete this is less of a concern.
2D support, if ever needed. You could try to simulate 2D indexing operations with a bitmask, but something like transposition (which is trivial with a bytemask) does not translate well when moving from bytes to bits. I don't know that this is a huge issue since the existing BooleanArray does not support 2D, but @jbrockmendel probably knows best about any plans for that.
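To make the 2D concern concrete, here is a minimal pure-Python sketch, not nanoarrow's API. All helper names are hypothetical, and the buffer uses least-significant-bit-first ordering as in the Arrow format. Indexing a row-major 2D layout on top of a bitmask is simple, but transposing means rebuilding the buffer bit by bit, whereas a bytemask held in a NumPy array can be transposed by swapping strides.

```python
def get_bit(buf: bytes, i: int) -> bool:
    """Read bit i from an LSB-first packed buffer."""
    return bool((buf[i >> 3] >> (i & 7)) & 1)


def set_bit(buf: bytearray, i: int) -> None:
    """Set bit i in an LSB-first packed buffer."""
    buf[i >> 3] |= 1 << (i & 7)


def pack(values: list[bool]) -> bytearray:
    """Pack a flat list of booleans into a bit buffer."""
    buf = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v:
            set_bit(buf, i)
    return buf


def get_2d(buf: bytes, ncols: int, row: int, col: int) -> bool:
    """Simulated 2D lookup: map (row, col) to a flat row-major bit index."""
    return get_bit(buf, row * ncols + col)


def transpose(buf: bytes, nrows: int, ncols: int) -> bytearray:
    """Transpose by copying every bit; there is no stride to swap at bit level."""
    out = bytearray((nrows * ncols + 7) // 8)
    for r in range(nrows):
        for c in range(ncols):
            if get_bit(buf, r * ncols + c):
                set_bit(out, c * nrows + r)
    return out


# 2x3 mask: [[True, False, True], [False, False, True]]
packed = pack([True, False, True, False, False, True])
transposed = transpose(packed, nrows=2, ncols=3)
print(get_2d(packed, 3, 1, 2))      # True
print(get_2d(transposed, 2, 2, 1))  # True (same element after transposition)
```

The per-bit loop in `transpose` is the part that has no cheap equivalent to a bytemask's stride swap, which is why 2D support would need separate thought.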

Alternative Solutions

status quo

Additional Context

No response
