apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Update arrow/parquet to arrow/parquet `53.3.0` #13508

Closed alamb closed 1 day ago

alamb commented 1 day ago

Which issue does this PR close?

Rationale for this change

There are some other fixes waiting on this arrow release so let's get them integrated into DataFusion

What changes are included in this PR?

Update to latest arrow/parquet

Are these changes tested?

By CI

Are there any user-facing changes?

jayzhan211 commented 1 day ago

Why is CI forced to run in arrow 53.3 🤔 ?

jayzhan211 commented 1 day ago

@findepi It seems https://github.com/apache/datafusion/issues/13304 is not yet resolved, but we have a related test to prevent regressions. Unfortunately, that test is also blocking CI from passing.

I think we should either find a way to keep CI running on arrow 53.2 or remove the related test.

jayzhan211 commented 1 day ago

I pushed the fix since this is blocking CI.

Since https://github.com/apache/datafusion/issues/13291 still appears to be under discussion, I updated the LIKE test so that its result is consistent with Postgres. We can revert to the old result if there is consensus.

jonahgao commented 1 day ago

Thanks @alamb @jayzhan211

jonahgao commented 1 day ago

> Why is CI forced to run in arrow 53.3 🤔 ?

It seems that CI uses the latest semver-compatible versions of dependencies.

alamb commented 18 hours ago

> Why is CI forced to run in arrow 53.3 🤔 ?

My thinking was that explicitly using arrow 53.3 will ensure that any bugs we fix that rely on arrow 53.3 features will work correctly.

If we don't update DataFusion to explicitly require 53.3.0 in Cargo.toml, CI will still run (and use 53.3.0), but other projects that depend on DataFusion could potentially resolve to 53.2.0. If we then add code to DataFusion that relies on 53.3.0 (like some of the LIKE fixes), it would be broken for those projects.
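
To make the reasoning above concrete, here is a minimal sketch of how a default (caret) version requirement in Cargo.toml behaves. It is not Cargo's resolver and does not use the `semver` crate. The helper names `parse_version`, `caret_matches`, and `newest_match` are hypothetical, the sketch only handles plain `MAJOR.MINOR.PATCH` strings with a major version of 1 or higher, and the requirement `53.2.0` is an assumed example of an older requirement, not a value taken from DataFusion's manifest.

```rust
/// Parse a plain "MAJOR.MINOR.PATCH" string. Returns None for anything else
/// (pre-release tags, build metadata, and partial versions are not handled).
fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let mut parts = s.trim().split('.').map(|p| p.parse::<u64>());
    let major = parts.next()?.ok()?;
    let minor = parts.next()?.ok()?;
    let patch = parts.next()?.ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Hypothetical helper: does `candidate` satisfy a caret requirement on
/// `requirement`? For major >= 1 that means: same major version, and
/// (minor, patch) at least as large as the requirement's.
/// Returns None on malformed input or a 0.x requirement (out of scope here).
fn caret_matches(requirement: &str, candidate: &str) -> Option<bool> {
    let (req_major, req_minor, req_patch) = parse_version(requirement)?;
    let (cand_major, cand_minor, cand_patch) = parse_version(candidate)?;
    if req_major == 0 {
        return None; // 0.x versions follow stricter rules; not covered
    }
    Some(cand_major == req_major && (cand_minor, cand_patch) >= (req_minor, req_patch))
}

/// Hypothetical helper: the newest available version that satisfies the
/// requirement, which is what a fresh resolution without a lock file picks.
fn newest_match<'a>(requirement: &str, available: &[&'a str]) -> Option<&'a str> {
    available
        .iter()
        .copied()
        .filter(|v| caret_matches(requirement, *v) == Some(true))
        .max_by_key(|v| parse_version(*v))
}
```

Under these assumptions, a requirement of `53.2.0` accepts both 53.2.0 and 53.3.0, so a fresh CI run picks 53.3.0 while a downstream project with an older lock file can stay on 53.2.0. A requirement of `53.3.0` rejects 53.2.0, which is the guarantee the explicit bump provides.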

alamb commented 18 hours ago

Thanks @jayzhan211