Open h-vetinari opened 1 year ago
Hmm, arrow-flight-test and arrow-flight-transport-ucx-test also assume CUDA when UCX is enabled; we could/should perhaps try to detect when we have CUDA but no device and skip those tests.
macOS seems broken in terms of Flight.
Windows flight tests seem to only crash in the CUDA-enabled pipeline. Possibly the same thing as linux?
Update for v13.0.0; looks pretty much the same as before, slightly better on windows.
linux-64:
The following tests FAILED:
35 - arrow-utility-test (Failed)
54 - arrow-cuda-test (Failed)
69 - arrow-gcsfs-test (Failed)
70 - arrow-s3fs-test (Failed)
73 - arrow-flight-test (Failed)
74 - arrow-flight-transport-ucx-test (Failed)
osx-64:
The following tests FAILED:
68 - arrow-gcsfs-test (Failed)
69 - arrow-s3fs-test (Failed)
72 - arrow-flight-test (Failed)
73 - arrow-flight-sql-test (Failed)
win-64:
The following tests FAILED:
19 - arrow-compute-scalar-cast-test (Failed)
22 - arrow-compute-scalar-temporal-test (Failed)
41 - arrow-substrait-substrait-test (Failed)
65 - arrow-dataset-file-orc-test (Failed)
68 - arrow-gcsfs-test (Failed)
69 - arrow-s3fs-test (Failed)
80 - arrow-orc-adapter-test (Failed)
90 - gandiva-internals-test (Failed)
99 - gandiva-date-time-test (Failed)
Hmm, I'm only seeing this issue now, but when reporting test failures, it's more helpful to include the detailed log. You should always enable --output-on-failure with ctest.
It's enabled in https://github.com/conda-forge/arrow-cpp-feedstock/pull/1058, but I prefer not to spam 1000s of lines before people ask for more details.
From here, ctest --progress --output-on-failure on linux-64:
[----------] 1 test from TestMinioServer
[ RUN ] TestMinioServer.Connect
/home/conda/feedstock_root/build_artifacts/apache-arrow_1692867254866/work/cpp/src/arrow/filesystem/s3fs_test.cc:182: Failure
Failed
'InitServerAndClient()' failed with IOError: Failed to find minio executable ('minio') in PATH
[ FAILED ] TestMinioServer.Connect (1 ms)
Well...
Sure, as I said, this is from a test before minio was available. Which is the reason we're not regularly running the C++ tests in our CI (i.e. I'm talking about this issue and not https://github.com/apache/arrow/issues/37692)
Looking at the output of gcsfs_test.cc, it all ends with
Value of: Testbench()->running()
Actual: false
Expected: true
Looking at that file, it seems this needs the testbench binary:
https://github.com/apache/arrow/blob/c49e24273160ac1ce195f02dbd14acd7d0f6945e/cpp/src/arrow/filesystem/gcsfs_test.cc#L110-L111
which we don't have in conda-forge yet. I think it would make sense to skip those tests if testbench cannot be found?
The problem is that it's too easy to end up not testing GCS by mistake. Perhaps an opt-out setting somewhere?
(but otherwise, perhaps you can pip install it from the test script? Is that forbidden?)
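The opt-out idea above could look roughly like the following sketch for a feedstock test script. It assumes testbench is discoverable as an executable named testbench on PATH (which may not match how Arrow actually launches it), and the environment variable ARROW_TEST_ALLOW_MISSING_TESTBENCH and the helper ctest_command are hypothetical names invented for this illustration. The GCS tests are excluded only when someone has explicitly opted out; otherwise a missing testbench is a hard error, so GCS coverage cannot be lost by accident.

```python
import os
import shutil


def ctest_command(env=None, which=shutil.which):
    """Build a ctest invocation; drop the GCS tests only on explicit opt-out."""
    env = os.environ if env is None else env
    cmd = ["ctest", "--progress", "--output-on-failure"]

    # Assumption: testbench is an executable on PATH.
    if which("testbench") is not None:
        return cmd

    # Hypothetical opt-out switch; not an existing Arrow or conda-forge setting.
    if env.get("ARROW_TEST_ALLOW_MISSING_TESTBENCH") == "1":
        return cmd + ["--exclude-regex", "arrow-gcsfs-test"]

    raise RuntimeError(
        "testbench not found; set ARROW_TEST_ALLOW_MISSING_TESTBENCH=1 "
        "to skip arrow-gcsfs-test deliberately"
    )
```

The returned list can be passed to subprocess.run from the test script.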
Rebased https://github.com/conda-forge/arrow-cpp-feedstock/pull/1058 on arrow 14 and added minio. Still failing a couple tests, here for linux:
The following tests FAILED:
34 - arrow-utility-test (Failed)
68 - arrow-gcsfs-test (Failed)
69 - arrow-s3fs-test (Failed)
arrow-gcsfs-test is expected due to not having testbench (see above), so that's less of an issue ATM. For arrow-utility-test, we have a single failure:
[ RUN ] TimestampParser.StrptimeZoneOffset
$SRC_DIR/cpp/src/arrow/util/value_parsing_test.cc:811: Failure
Expected equality of these values:
expected
Which is: 1514769420
converted
Which is: 1514769408
Google Test trace:
$SRC_DIR/cpp/src/arrow/util/value_parsing_test.cc:806: 2018-01-01 00:00:00-0117
[ FAILED ] TimestampParser.StrptimeZoneOffset (0 ms)
which -- based on the values -- kinda looks like it's a question of leap seconds being applied or not (though the -0117 "time zone" looks very weird too). Not sure if this needs a newer linux distro than we're running...?
While arrow-s3fs-test can now call minio, it still fails pretty much all tests that call it AFAICT, for example:
[ RUN ] TestS3FS.CreateDir
$SRC_DIR/cpp/src/arrow/filesystem/s3fs_test.cc:919: Failure
Failed
Expected 'fs_->CreateDir("bucket/somefile")' to fail with IOError, but got OK
[ FAILED ] TestS3FS.CreateDir (84 ms)
[ RUN ] TestS3FSGeneric.DeleteDir
$SRC_DIR/cpp/src/arrow/filesystem/test_util.cc:77: Failure
Expected equality of these values:
paths
Which is: { "AB", "AB/CD", "AB/CD/EF", "AB/GH" }
expected_paths
Which is: { "AB", "AB/GH" }
[ FAILED ] TestS3FSGeneric.DeleteDir (78 ms)
This might be related to the fact that we only have a newer minio in conda-forge than what arrow uses (see https://github.com/apache/arrow/issues/37692). Though I also see
WARNING: MINIO_ACCESS_KEY and MINIO_SECRET_KEY are deprecated.
Please use MINIO_ROOT_USER and MINIO_ROOT_PASSWORD
so it could be something related to not having some AWS dummy account somewhere to perform operations on?
There are more errors on the CUDA builds (which presumably try to look for a GPU and fail), but that's not the first priority now.
On OSX, we get:
The following tests FAILED:
68 - arrow-gcsfs-test (Failed)
69 - arrow-s3fs-test (Failed)
In particular, the arrow-utility-test does not fail like on linux.
On windows, we have more failures:
The following tests FAILED:
18 - arrow-compute-scalar-cast-test (Failed)
21 - arrow-compute-scalar-temporal-test (Failed)
64 - arrow-dataset-file-orc-test (Failed)
68 - arrow-gcsfs-test (Failed)
69 - arrow-s3fs-test (Failed)
80 - arrow-orc-adapter-test (Failed)
81 - arrow-substrait-substrait-test (Failed)
91 - gandiva-internals-test (Failed)
100 - gandiva-date-time-test (Failed)
Here's a link to the CI run.
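To compare platforms without eyeballing the lists, the ctest summaries can be reduced to sets of test names. This is only a sketch: the helper failed_tests and the regular expression are written for this illustration and assume the "N - name (Failed)" line format shown in the summaries above. The sample input is the linux and osx output from this update.

```python
import re

# Matches summary lines such as "34 - arrow-utility-test (Failed)".
FAILED_LINE = re.compile(r"^\s*\d+\s+-\s+(\S+)\s+\(Failed\)\s*$")


def failed_tests(summary):
    """Return the set of test names marked (Failed) in a ctest summary."""
    names = set()
    for line in summary.splitlines():
        match = FAILED_LINE.match(line)
        if match:
            names.add(match.group(1))
    return names


linux = failed_tests("""The following tests FAILED:
34 - arrow-utility-test (Failed)
68 - arrow-gcsfs-test (Failed)
69 - arrow-s3fs-test (Failed)
""")
osx = failed_tests("""The following tests FAILED:
68 - arrow-gcsfs-test (Failed)
69 - arrow-s3fs-test (Failed)
""")

# Failures that only show up on linux.
print(sorted(linux - osx))
```

Comparing by name rather than by number matters here because the test numbering shifts between platforms and releases.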
Update for 16.1 (logs), I was quite stunned that the test suite passes on windows. 🥳
On linux:
The following tests FAILED:
36 - arrow-utility-test (Failed)
71 - arrow-gcsfs-test (Failed)
72 - arrow-azurefs-test (Failed)
73 - arrow-s3fs-test (Failed)
76 - arrow-flight-test (Failed)
On osx:
The following tests FAILED:
71 - arrow-gcsfs-test (Failed)
72 - arrow-azurefs-test (Failed)
73 - arrow-s3fs-test (Failed)
The arrow-gcsfs-test failure can be ignored (no testbench module); the azure tests all fail with:
$SRC_DIR/cpp/src/arrow/filesystem/azurefs_test.cc:425: Failure
Failed
'_error_or_value27.status()' failed with Invalid: Could not find Azurite emulator.
and the s3fs tests fail because they expect failure which doesn't happen:
$SRC_DIR/cpp/src/arrow/filesystem/test_util.cc:244: Failure
Failed
Expected 'fs->CreateDir("AB/def/EF/GH", true )' to fail with IOError, but got OK
The utility failure is - as above - related to some timestamp difference (missing leapseconds?) and the flight error is:
76/98 Test #76: arrow-flight-test ............................***Failed 0.28 sec
Running arrow-flight-test, redirecting output into $SRC_DIR/cpp/build/build/test-logs/arrow-flight-test.txt (attempt 1/1)
$SRC_DIR/cpp/build-support/run-test.sh: line 88: 28938 Aborted (core dumped) $TEST_EXECUTABLE "$@" > $LOGFILE.raw 2>&1
Running main() from $SRC_DIR/googletest/src/gtest_main.cc
The Azure test failures can be ignored too because the Azure tests require https://github.com/Azure/Azurite .
Update for arrow 17.0:
On linux:
The following tests FAILED:
36 - arrow-utility-test (Failed)
71 - arrow-gcsfs-test (Failed)
73 - arrow-s3fs-test (Failed)
On osx:
The following tests FAILED:
71 - arrow-gcsfs-test (Failed)
73 - arrow-s3fs-test (Failed)
Windows passes 🥳
I got rid of the failures in azurefs by installing azurite before running the test suite; aside from that, the remaining failures are pretty much the same as before: gcsfs-test fails because we don't have testbench, and s3fs expects more failures than it's getting, e.g.
Expected 'fs->CopyFile("AB/abc", "def/mno")' to fail with IOError, but got OK
specifically
[ FAILED ] 4 tests, listed below:
[ FAILED ] TestS3FS.CreateDir
[ FAILED ] TestS3FSGeneric.CreateDir
[ FAILED ] TestS3FSGeneric.MoveFile
[ FAILED ] TestS3FSGeneric.CopyFile
The remaining error (linux-only) in utility-test is
[ RUN ] TimestampParser.StrptimeZoneOffset
$SRC_DIR/cpp/src/arrow/util/value_parsing_test.cc:852: Failure
Expected equality of these values:
expected
Which is: 1514769420
converted
Which is: 1514769408
Google Test trace:
$SRC_DIR/cpp/src/arrow/util/value_parsing_test.cc:847: 2018-01-01 00:00:00-0117
[ FAILED ] TimestampParser.StrptimeZoneOffset (0 ms)
which feels like a corner case as well.
All in all, I think we're getting close to activating this in the feedstock. Any thoughts on that? @kou @pitrou @raulcd @assignUser @jorisvandenbossche
Does it use musl as libc?
No, we don't support musl as a target, everything is glibc (though relatively old by default; 2.17).
Hmm. Could you share the CI log URL for the TimestampParser.StrptimeZoneOffset failure?
Sure: logs
Regarding "s3fs expects more failures than it's getting": I just saw the following comment, which confirms that this is due to conda-forge using a newer minio version than arrow (because we don't have one as old as the one arrow is using, cf. #37692 #41922).
Thanks. It seems that the -0117 timezone didn't work.
Can we try newer glibc to check whether this is related to glibc version or not?
The test passes on an alma8 image with glibc 2.28 (logs)
Thanks. It seems that -0117 isn't supported with old glibc.
I'll add a GTEST_SKIP() later.
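For reference, the arithmetic behind the two values in the failing test can be checked directly. The first part below only uses the timestamp string from the test log. The second part is a hypothesis that is not confirmed anywhere in this thread: it assumes an older strptime converts the minutes of a %z offset to hundredths of an hour using integer division, and the helper offset_via_hundredths is written purely to illustrate that assumption. Under it, the 12-second gap comes out exactly, which would point to rounding rather than leap seconds.

```python
from datetime import datetime, timedelta, timezone

# "2018-01-01 00:00:00-0117": local midnight, 1h17m behind UTC.
tz = timezone(-timedelta(hours=1, minutes=17))
expected = int(datetime(2018, 1, 1, tzinfo=tz).timestamp())

midnight_utc = int(datetime(2018, 1, 1, tzinfo=timezone.utc).timestamp())


def offset_via_hundredths(hhmm):
    """Hypothetical lossy conversion: HHMM -> hundredths of an hour -> seconds."""
    hours, minutes = divmod(hhmm, 100)
    hundredths = hours * 100 + (minutes * 50) // 30  # 17 min -> 28, not 28.33
    return hundredths * 3600 // 100


converted = midnight_utc + offset_via_hundredths(117)

print(expected)              # value the test expects
print(converted)             # value under the assumed rounding
print(expected - converted)  # size of the gap in seconds
```

An exact conversion of 1h17m gives 4620 seconds, while the assumed rounding gives 4608.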
I updated https://github.com/conda-forge/arrow-cpp-feedstock/pull/1058 for v18, opened a PR for the failure of TimestampParser.StrptimeZoneOffset with old glibc (and backported that), and skipped the gcsfs tests (which cannot pass without testbench). With all that, windows and osx are passing (🥳), while on linux there are two failures:
The following tests FAILED:
43 - arrow-csv-test (Failed)
76 - arrow-flight-test (Failed)
In more detail:
[ RUN ] TimestampConversion.UserDefinedParsersWithZone
$SRC_DIR/cpp/src/arrow/csv/converter_test.cc:169: Failure
Failed
Expected 'converter->Convert(*parser, i)' to fail with Invalid, but got OK
$SRC_DIR/cpp/src/arrow/csv/converter_test.cc:169: Failure
Failed
Expected 'converter->Convert(*parser, i)' to fail with Invalid, but got OK
[ FAILED ] TimestampConversion.UserDefinedParsersWithZone (0 ms)
(this is potentially due to https://github.com/apache/arrow/pull/44621, as I haven't seen it before)
The more serious one is the flight test, which just crashes/aborts
[----------] 3 tests from GrpcAsyncClientTest
[ RUN ] GrpcAsyncClientTest.TestGetFlightInfo
[ OK ] GrpcAsyncClientTest.TestGetFlightInfo (3 ms)
[ RUN ] GrpcAsyncClientTest.TestGetFlightInfoFuture
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1730526628.609332 29678 thd.cc:184] pthread_join failed: Resource deadlock avoided
~/feedstock_root/build_artifacts/apache-arrow_1730522866415/work/cpp/build/src/arrow/flight
Describe the bug, including details regarding any error messages, version, and platform.
As I explain in https://github.com/conda-forge/arrow-cpp-feedstock/pull/1058, I wanted to enable the gtest suite for libarrow. The result is as follows:
linux-64:
osx-64:
win-64:
The arrow-cuda-test failure is probably unavoidable (as we have no CUDA drivers in CI), but the others shouldn't have to fail AFAICT.
More detailed logs can be found here
Component(s)
C++, Packaging