data-apis / dataframe-interchange-tests

Test suite for the dataframe interchange protocol
MIT License

Add tests with sliced dataframes #16

Open jorisvandenbossche opened 1 year ago

jorisvandenbossche commented 1 year ago

The spec defines an offset property on the Column, which a consumer has to take into account when interpreting the buffer (e.g. for slices). It would be good to add tests for this.

(for example, pandas currently ignores this)
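To illustrate why the offset matters, here is a minimal sketch that does not use the real protocol classes. `ToyColumn`, `read_values` and `read_values_ignoring_offset` are hypothetical names made up for this example, and the offset is assumed to be counted in elements. A sliced column shares the buffer of the full column, so a consumer that ignores the offset reads the wrong rows.

```python
from array import array
from dataclasses import dataclass


@dataclass
class ToyColumn:
    """Hypothetical stand-in for an interchange column (not the real protocol)."""

    buffer: array  # shared data buffer; a slice reuses it instead of copying
    offset: int    # index of the first element that belongs to this column
    size: int      # number of elements in this column


def read_values(col: ToyColumn) -> list:
    """Read the column's values, starting at `offset`."""
    return list(col.buffer[col.offset : col.offset + col.size])


def read_values_ignoring_offset(col: ToyColumn) -> list:
    """Buggy reader: always starts at the beginning of the buffer."""
    return list(col.buffer[: col.size])


data = array("q", [10, 20, 30, 40, 50])
full = ToyColumn(buffer=data, offset=0, size=5)
sliced = ToyColumn(buffer=data, offset=2, size=2)  # logical rows 2 and 3

print(read_values(sliced))                  # honours the offset
print(read_values_ignoring_offset(sliced))  # silently returns the first rows
```

With an offset of 0 both readers agree, which is presumably why a test suite that never slices would not notice the problem.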

honno commented 1 year ago

Yeah, I'd hope we could at least manipulate columns with different kinds of offset values for smoke testing. It's just that manipulating offsets seems pretty tricky, as it requires a thorough understanding of how a given protocol adopter makes interchange dataframes/columns in the first place. My initial impression is that I won't get around to this given the effort-to-reward ratio.

jorisvandenbossche commented 1 year ago

I am not sure we should be manipulating offsets of column objects ourselves (i.e. directly in the tests), as that would indeed get complicated. But this could also be covered in roundtrip tests. Basically, what I would expect is something like:

orig_df = ..
orig_df_sliced = slice_method(orig_df, 0, 2)
dest_df = dest_libinfo.from_dataframe(orig_df_sliced)
roundtrip_df = orig_libinfo.from_dataframe(dest_df)
assert orig_libinfo.frame_equal(roundtrip_df, orig_df_sliced)

so we probably only need a method on the LibraryInfo wrapper in wrapper.py to slice a dataframe, and then run the roundtrip tests with that?
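A rough, self-contained sketch of how such a sliced roundtrip test could look. Everything here is an assumption made for illustration: `ToyLibraryInfo` is a simplified stand-in and not the suite's actual LibraryInfo, `slice_frame` is a hypothetical name for the proposed slicing method, and the two "libraries" are toy dataframe representations (a dict of columns and a list of row dicts). The roundtripped frame is compared against the sliced frame, since that is what was exchanged.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToyLibraryInfo:
    """Hypothetical, simplified stand-in for a library wrapper."""

    name: str
    from_dataframe: Callable[[Any], Any]
    frame_equal: Callable[[Any, Any], bool]
    slice_frame: Callable[[Any, int, int], Any]  # the proposed new method


def to_columns(df):
    """Convert either toy representation to a dict of column lists."""
    if isinstance(df, dict):
        return {name: list(values) for name, values in df.items()}
    cols = {}
    for row in df:  # otherwise: a list of row dicts
        for name, value in row.items():
            cols.setdefault(name, []).append(value)
    return cols


def to_rows(df):
    """Convert either toy representation to a list of row dicts."""
    cols = to_columns(df)
    names = list(cols)
    n_rows = len(cols[names[0]]) if names else 0
    return [{name: cols[name][i] for name in names} for i in range(n_rows)]


columnar = ToyLibraryInfo(
    name="columnar",
    from_dataframe=to_columns,
    frame_equal=lambda a, b: a == b,
    slice_frame=lambda df, start, stop: {k: v[start:stop] for k, v in df.items()},
)
rowwise = ToyLibraryInfo(
    name="rowwise",
    from_dataframe=to_rows,
    frame_equal=lambda a, b: a == b,
    slice_frame=lambda df, start, stop: df[start:stop],
)


def roundtrip_sliced(orig_df, orig_libinfo, dest_libinfo, start, stop):
    """Slice in the origin library, convert there and back, compare to the slice."""
    orig_df_sliced = orig_libinfo.slice_frame(orig_df, start, stop)
    dest_df = dest_libinfo.from_dataframe(orig_df_sliced)
    roundtrip_df = orig_libinfo.from_dataframe(dest_df)
    return orig_libinfo.frame_equal(roundtrip_df, orig_df_sliced)


orig_df = {"a": [1, 2, 3], "b": ["x", "y", "z"]}
assert roundtrip_sliced(orig_df, columnar, rowwise, 0, 2)
```

A slice that starts at a non-zero row (for example 1 to 3) is probably the more interesting case, since that is where a real library would be expected to produce a non-zero offset.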