kuzudb / kuzu

Embeddable property graph database management system built for query speed and scalability. Implements Cypher.
https://kuzudb.com/
MIT License
1.38k stars 97 forks source link

Rust API test_arrow test fails due to Utf8Error on macos-11 GitHub-hosted runner #2005

Closed russell-liu closed 1 year ago

russell-liu commented 1 year ago

The test has failed 2 out of 2 times so far. Logs and workflow files can be found here: https://github.com/kuzudb/kuzu/actions/runs/6111754214/job/16587694174 https://github.com/kuzudb/kuzu/actions/runs/6113068279/job/16591808577

russell-liu commented 1 year ago

I think I've figured out why it is failing -- the macos-11 runner uses Clang/LLVM 13.0.0 by default (https://github.com/actions/runner-images/blob/main/images/macos/macos-11-Readme.md), whereas we inexplicitly require 14. This is demonstrated by these two runs, one of which failed and one of which succeeded, which differ only by Clang/LLVM version: https://github.com/kuzudb/kuzu/actions/runs/6122821358/job/16619417254 https://github.com/kuzudb/kuzu/actions/runs/6122418640/job/16618126207

russell-liu commented 1 year ago

Never mind, it can also fail with Clang/LLVM 14: https://github.com/kuzudb/kuzu/actions/runs/6122975487/job/16619878211

benjaminwinger commented 1 year ago

As some context for the error, it's being raised by this line in arrow-rs, where it's accessing the name field of the schema.

That name field should either be what's set here as a hardcoded ascii string, or one of the property names as set here (the rust test failing has a string column called name and an int64 column called age).

I think the problem is that the property names are owned by the DataTypeInfo struct, rather than being copied into the ArrowSchema like the rest of the information, and the DataTypeInfo used in getArrowSchema is temporary. Struct field names are copied into the ArrowSchemaHolder, but property names are not.

We should also add a tests of getArrowSchema and getNextArrowChunk, as issues like this could be caught by the address sanitizer (it may not be needed for the rust code, but it is needed for C++ interfaces used by the rust code).

russell-liu commented 1 year ago

It also happens on Windows as well: https://github.com/kuzudb/kuzu/actions/runs/6189154608/job/16802666644

benjaminwinger commented 1 year ago

Yes, it can occur on all platforms, but I think the last time you updated the branch you're testing on was the commit before #2019 was merged.

russell-liu commented 1 year ago

Oops, I had this page open and forgot to refresh it, so I wasn't aware that it has been fixed. Please ignore my previous comment.