BUG: `DataFrame.sparse.from_spmatrix` hard codes an invalid ``fill_value`` for certain subtypes (pandas-dev/pandas #59064)

christopher-titchen commented 1 week ago

- [x] Closes #59063.
- [x] Tests added and passed if fixing a bug or adding a new feature.
- [x] All code checks passed.
- [x] Added type annotations to new arguments/methods/functions.
- [x] Added an entry in the latest `doc/source/whatsnew/vX.X.X.rst` file if fixing a bug or adding a new feature.

mroeschke commented 5 days ago

Thanks @christopher-titchen

bmreiniger commented 8 hours ago

This seems to be responsible for a breaking change in one of my workflows. We consume the output of a sklearn `OneHotEncoder`, which is a sparse matrix of float dtype, and instantiate a sparse pandas DataFrame from it. That used to produce values of 1.0 and 0.0; it now produces 1.0 and `np.nan` instead.

It doesn't look like `from_spmatrix` accepts a `fill_value` argument; is there another easy way to adjust to the new behavior? (Casting to integers would be fine in this particular case, but our code handles more than just `OneHotEncoder` results, so I'm not sure that generalizes.)
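
One possible workaround, sketched here only as an illustration, is to skip `from_spmatrix` and build each column as a `pd.arrays.SparseArray` with an explicit `fill_value`. The helper name `spmatrix_to_sparse_frame` is hypothetical and not part of pandas. The sketch assumes that densifying one column at a time is acceptable for the data at hand, and the small matrix below merely stands in for `OneHotEncoder` output.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def spmatrix_to_sparse_frame(mat, fill_value=0.0):
    """Hypothetical helper (not a pandas API): build a sparse DataFrame
    whose columns use an explicit fill_value."""
    # CSC layout makes column slicing cheap.
    csc = sparse.csc_matrix(mat)
    columns = {}
    for j in range(csc.shape[1]):
        # Densify a single column, so peak extra memory is O(n_rows).
        dense_col = csc[:, [j]].toarray().ravel()
        columns[j] = pd.arrays.SparseArray(dense_col, fill_value=fill_value)
    return pd.DataFrame(columns)


# Stand-in for a one-hot encoded float matrix.
one_hot = sparse.csr_matrix(np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]))
frame = spmatrix_to_sparse_frame(one_hot)
dense = frame.sparse.to_dense()
print(frame.dtypes)
print(dense)
```

Because the fill value is chosen by the caller, the unstored entries come back as 0.0 when the frame is densified, independent of whatever default `from_spmatrix` picks for the subtype. The cost is a Python-level loop over columns, which may be slow for very wide matrices.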
