Open marwan116 opened 3 weeks ago
I noticed a similar issue, and I think the bottleneck is _concat_same_type in the Arrow block: https://github.com/ray-project/ray/pull/45075. I also have a PR for this, but it requires upgrading the pyarrow version. I think this is a long-standing issue, and we need to decide whether to upgrade pyarrow so that we can have a better implementation of this conversion step.
What happened + What you expected to happen
I am running into a regression that looks like a bug to me. From what I can tell, it was introduced by this code change in Ray 2.33.0.
The behavior of write_json has regressed: it used to properly respect Python objects such as lists, and it now truncates a list to its first element only. See the reproduction script below.
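For anyone who wants to check whether their own output is affected, here is a rough stdlib-only sketch that scans the written files for truncated lists. It is not part of Ray. It assumes the output directory contains JSON Lines files (one record per line) ending in .json, and that the list column is named "embeddings". The helper name find_truncated_lists and the expected_length parameter are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def find_truncated_lists(output_dir, expected_length, column="embeddings"):
    """Return (file name, line number, actual length) for every record whose
    `column` value is not a list of `expected_length` items.

    The actual length is None when the value is missing or is not a list
    (for example, when a list was collapsed into a single scalar).
    """
    problems = []
    for path in sorted(Path(output_dir).glob("*.json")):
        with path.open() as f:
            for line_no, line in enumerate(f, start=1):
                if not line.strip():
                    continue  # skip blank lines
                value = json.loads(line).get(column)
                actual = len(value) if isinstance(value, list) else None
                if actual != expected_length:
                    problems.append((path.name, line_no, actual))
    return problems


# Demo on hand-written sample data (not real Ray output):
# the first record is intact, the second has a truncated list.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "part-0.json"
    sample.write_text(
        json.dumps({"id": 0, "embeddings": [0.1, 0.2, 0.3]}) + "\n"
        + json.dumps({"id": 1, "embeddings": [0.4]}) + "\n"
    )
    demo_problems = find_truncated_lists(tmp, expected_length=3)

print(demo_problems)  # [('part-0.json', 2, 1)]
```

Pointing the helper at the directory passed to write_json, with the expected embedding size, should return an empty list on an unaffected version, provided the assumptions above hold for your output.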
Note the warning:
And note how the embeddings get truncated:
Versions / Dependencies
Reproduction script
With Ray <= 2.32.0, by contrast, here is what I get:
Issue Severity
Medium: It is a significant difficulty but I can work around it.