Open mhagel-theorem opened 1 month ago
Thank you for opening your first issue here!
I am happy to investigate this a bit further/deeper when I have the time @eapolinario
@mhagel-theorem, thank you. Could you share more details about your setup: other package versions, the shape of your dataframe, how you're actually using FlyteFile to read/write, which cloud provider, etc.?
I tried to repro but couldn't yet, so I'm more than happy to collaborate on this one.
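In case it helps with gathering the details requested above, here is a minimal sketch that reports the installed versions of the relevant packages. The package list is an assumption (the filesystem backends in particular depend on which cloud provider is in use), and `collect_versions` is a hypothetical helper name, not part of flytekit.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Assumed list of packages worth reporting; adjust to the actual environment.
DEFAULT_PACKAGES = ("flytekit", "flyteidl", "fsspec", "s3fs", "gcsfs", "adlfs", "lightgbm")


def collect_versions(packages: Iterable[str] = DEFAULT_PACKAGES) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if it is not installed."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the task container image (rather than on a laptop) should show the versions the tasks actually use.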
Describe the bug
Versions:
flytekit: 1.13.11
flyteidl: 1.13.5
flytecore chart: 1.12.0
When using inter-task file IO in Flyte for both pickles and LightGBM binary datasets, fsspec version 2024.10.0 results in incomplete/truncated writes. We expect this to affect all file-based IO. Downstream tasks then fail with a corresponding incomplete read error.
Reverting to fsspec version 2024.9.0 resolves all issues we have had with incomplete writes and Flyte inter-task IO. If this can be reproduced, we recommend pinning Flyte's fsspec dependency to <=2024.9.0.
Expected behavior
File-based IO completes every write, with no truncated or partially written files.
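One way to check this expectation between tasks is to record a file's size and checksum in the upstream task and compare them in the downstream task after the file has been fetched. The following is only a sketch under that assumption: `file_fingerprint` and `is_complete` are hypothetical helpers using the Python standard library, not flytekit or fsspec APIs, and they operate on a local path.

```python
import hashlib
from typing import Tuple


def file_fingerprint(path: str, chunk_size: int = 1 << 20) -> Tuple[int, str]:
    """Return (size in bytes, SHA-256 hex digest) of the local file at path."""
    sha = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        # Read in chunks so large pickles or datasets are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha.update(chunk)
            size += len(chunk)
    return size, sha.hexdigest()


def is_complete(path: str, expected_size: int, expected_sha256: str) -> bool:
    """True if the file at path matches the size and digest recorded by the writer."""
    return file_fingerprint(path) == (expected_size, expected_sha256)
```

The upstream task would call `file_fingerprint` on the file before returning it and pass the two values along as extra outputs; the downstream task would call `is_complete` on its downloaded copy. A mismatch in size alone would point to a truncated write rather than corrupted content.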
Additional context to reproduce
No response
Screenshots
No response
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?
Edit: Added versions for reproducibility