data-engineering-collective / plateau

Flat files, flat land.
MIT License

Do not test pickle in shuffle #79

Closed: fjetter closed this 1 year ago

fjetter commented 1 year ago

Pickling graphs is not the best idea. The pickle protocol is not properly implemented for many graph subtypes, which sometimes causes annotation loss or other problems.

This will change with https://github.com/dask/distributed/pull/7564, which is expected to go in before the next release. Still, I don't think it's necessary to test this here.

If somebody feels strongly about this, I can add a dask version guard for this assertion.
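For reference, such a guard could look roughly like the following. This is only a sketch: the cutoff `ASSUMED_FIXED_IN` is a placeholder, since the release that will contain the linked PR is not known here, and the helper names `release_tuple` and `should_check_graph_pickle` are hypothetical, not part of plateau or dask.

```python
import re

# Placeholder cutoff (assumption): the first dask release expected to pickle
# graphs correctly. Replace it with the real version once that is known.
ASSUMED_FIXED_IN = (2023, 3, 0)


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric components of a version string.

    Suffixes such as "+4.gabc" or "rc1" are ignored, and the result is
    padded with zeros to at least three components.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    parts = [int(part) for part in match.group().split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def should_check_graph_pickle(installed_version: str) -> bool:
    """Return True if the pickle assertion should run for this dask version."""
    return release_tuple(installed_version) >= ASSUMED_FIXED_IN
```

In the test, the pickle assertion would then only run when `should_check_graph_pickle(dask.__version__)` is true, and be skipped otherwise.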

fjetter commented 1 year ago

There appears to be a second failure connected to P2P. Somehow dask is picking P2P even though it shouldn't.

fjetter commented 1 year ago

OK, the tests are using distributed after all, so the set_index tests are picking P2P. The assert_eq, however, is picking a sync scheduler, which does not implement P2P, and that caused the last exception. This should now work.

fjetter commented 1 year ago

There are some test failures in test_index_store_roundtrip_ts around index roundtrips. These must be unrelated.