AdrienDart opened this issue 1 month ago
No, this looks like a bug: marimo should detect whether the object is serializable in the way it expects, and this exception is thrown when there's a discrepancy. There's some dataframe-checking logic under the hood, so this might be solved by moving that logic to narwhals.
Thanks for the easily reproducible code. In the meantime, you may be able to work around this by defining df in a separate cell.
Also, a quick question: I noticed the cached dataframe is saved as a pickle. Could it be saved as parquet for better performance/memory usage? Thanks for your help!
Sure. I don't think any single file format should replace pickle, but maybe we'll expose a setting to choose a "loader" type.
Here's the pickle loader for reference; I don't think it would be too tricky to implement a loader for any given storage type:
https://github.com/marimo-team/marimo/blob/main/marimo/_save/loaders/pickle.py
A couple of other thoughts were npz, dill, and a remote cache.
If you do want to play with this, the undocumented keyword argument `_loader` would let you inject a loader instance. You can see how we do this in testing: https://github.com/marimo-team/marimo/blob/45056be4c37ed79e28370222f2e7bd89c017050c/tests/_save/test_cache.py#L49
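As a generic illustration of that injection pattern (the names `cached`, `InMemoryLoader`, and the loader interface here are hypothetical, not marimo's API), a cache decorator can accept a loader instance through a private keyword argument:

```python
import functools


class InMemoryLoader:
    """Hypothetical loader for illustration: keeps cached values in a dict."""

    def __init__(self) -> None:
        self.store = {}

    def has(self, key: str) -> bool:
        return key in self.store

    def save(self, key: str, value) -> None:
        self.store[key] = value

    def load(self, key: str):
        return self.store[key]


def cached(_loader=None):
    """Decorator factory; `_loader` lets tests inject a custom loader."""
    loader = _loader if _loader is not None else InMemoryLoader()

    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            key = repr(args)  # naive cache key for illustration
            if loader.has(key):
                return loader.load(key)
            result = fn(*args)
            loader.save(key, result)
            return result

        return wrapper

    return decorate
```

Injecting a loader this way keeps the storage backend swappable without changing the decorated function.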
Describe the bug
Hi,
I'm trying to cache a polars dataframe using the following operation.
I get `TypeError("Cannot change data-type for object array.")` (sorry, I can't post the whole traceback; the issue is at line 217 in `data_to_buffer` in `hash.py`). Is that expected?
A monkey patch that works is:
Thanks,
Adrien
Environment
Marimo 0.9.10
Code to reproduce
See above.