apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[R] Some errors in tests on Darwin PPC due to locale and datetime: [ FAIL 11 | WARN 16 | SKIP 111 | PASS 6586 ] #35083

Open barracuda156 opened 1 year ago

barracuda156 commented 1 year ago

Describe the bug, including details regarding any error messages, version, and platform.

The test suite reports these errors (the locale error occurs in several tests):

Error in `compute.arrow_dplyr_query(x)`: Invalid: Cannot find locale 'en_US.UTF-8': locale::facet::_S_create_c_locale name not valid

── Failure ('test-dplyr-funcs-datetime.R:2079:3'): parse_date_time() works with year, month, and date components ──
`object` (`actual`) not equal to `expected` (`expected`).

── Error ('test-dplyr-funcs-datetime.R:3552:3'): timestamp rounding takes place in local time ──
<dplyr:::mutate_error/rlang_error/error/condition>
Caused by error in `C_time_floor()`:
! CCTZ: Invalid timezone of the input vector: "Asia/Kathmandu"

── Error ('test-s3-minio.R:18:1'): (code run outside of `test_that()`) ─────────
Error: invalid version specification '10.0.0d2'

P.S. The version is correct, of course; it is the test that is wrong here.

Complete output:

R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: powerpc-apple-darwin10.0.0d2 (32-bit)

> library(testthat)
> library(arrow)
Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.

Attaching package: 'arrow'

The following object is masked from 'package:testthat':

    matches

The following object is masked from 'package:utils':

    timestamp

> library(tibble)
>
> verbose_test_output <- identical(tolower(Sys.getenv("ARROW_R_DEV", "false")), "true") ||
+   identical(tolower(Sys.getenv("ARROW_R_VERBOSE_TEST", "false")), "true")
>
> if (verbose_test_output) {
+   arrow_reporter <- MultiReporter$new(list(CheckReporter$new(), LocationReporter$new()))
+ } else {
+   arrow_reporter <- check_reporter()
+ }
> test_check("arrow", reporter = arrow_reporter)
sh: /bin/ps: Operation not permitted
[ FAIL 11 | WARN 16 | SKIP 111 | PASS 6586 ]

══ Skipped tests ═══════════════════════════════════════════════════════════════
• ARROW-12632: ExecuteScalarExpression cannot Execute non-scalar expression (1)
• ARROW-13364 (1)
• ARROW-14045 (1)
• ARROW-17043 (date/datetime arithmetic with integers) (1)
• ARROW-18101 (1)
• Arrow C++ not built with dataset (31)
• Arrow C++ not built with gcs (1)
• Arrow C++ not built with json (1)
• Arrow C++ not built with parquet (16)
• Arrow C++ not built with substrait (1)
• Flight server is not running (1)
• Ingest_POSIXct only implemented for REALSXP (1)
• Need halffloat support: https://issues.apache.org/jira/browse/ARROW-3802 (1)
• On CRAN (41)
• RE2 does not support backreferences in pattern (https://github.com/google/re2/issues/101) (1)
• TODO (ARROW-16630): make sure BottomK can handle NA ordering (1)
• TODO: (if anyone uses RangeEquals) (1)
• TODO: ARROW-14071 (1)
• Table with 0 cols doesn't know how many rows it should have (1)
• This OS either does not support changing languages to fr or it caches translations (2)
• Work around masking of data type functions (ARROW-12322) (1)
• environment variable ARROW_LARGE_MEMORY_TESTS (1)
• floor_date(as.Date(NA), '1 day') is no longer NA on latest R-devel (1)
• pyarrow not available for testing (1)
• tolower(Sys.info()[["sysname"]]) != "windows" is TRUE (1)

══ Failed tests ════════════════════════════════════════════════════════════════
── Error ('test-dplyr-funcs-datetime.R:375:3'): strftime ───────────────────────
Error in `compute.arrow_dplyr_query(x)`: Invalid: Cannot find locale 'en_US.UTF-8': locale::facet::_S_create_c_locale name not valid
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1174  GetLocale(options.locale)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1181  Make(ctx, *in.type)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec.cc:857  kernel_->exec(kernel_ctx_, input, &output)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:607  executor->Execute( ExecBatch(std::move(arguments), all_scalar ? 1 : input.length), &listener)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/project_node.cc:91  ExecuteScalarExpression(simplified_expr, target, plan()->query_context()->exec_context())
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:334  ReadNext(&batch)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:348  ToRecordBatches()
Backtrace:
     ▆
  1. ├─arrow:::compare_dplyr_binding(...) at test-dplyr-funcs-datetime.R:375:2
  2. │ ├─testthat::expect_warning(...) at tests/testthat/helper-expectation.R:94:2
  3. │ │ └─testthat:::expect_condition_matching(...)
  4. │ │   └─testthat:::quasi_capture(...)
  5. │ │     ├─testthat (local) .capture(...)
  6. │ │     │ └─base::withCallingHandlers(...)
  7. │ │     └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
  8. │ └─rlang::eval_tidy(expr, rlang::new_data_mask(rlang::env(.input = arrow_table(tbl))))
  9. ├─... %>% collect()
 10. ├─dplyr::collect(.)
 11. └─arrow:::collect.arrow_dplyr_query(.)
 12.   └─arrow:::compute.arrow_dplyr_query(x)
 13.     └─base::tryCatch(...)
 14.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 15.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 16.           └─value[[3L]](cond)
 17.             └─arrow:::augment_io_error_msg(e, call, schema = schema())
 18.               └─rlang::abort(msg, call = call)
── Error ('test-dplyr-funcs-datetime.R:631:3'): extract month from timestamp ───
Error in `compute.arrow_dplyr_query(x)`: Invalid: Cannot find locale 'en_US.UTF-8': locale::facet::_S_create_c_locale name not valid
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1174  GetLocale(options.locale)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1181  Make(ctx, *in.type)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec.cc:857  kernel_->exec(kernel_ctx_, input, &output)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:607  executor->Execute( ExecBatch(std::move(arguments), all_scalar ? 1 : input.length), &listener)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:589  ExecuteScalarExpression(call->arguments[i], input, exec_context)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/project_node.cc:91  ExecuteScalarExpression(simplified_expr, target, plan()->query_context()->exec_context())
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:334  ReadNext(&batch)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:348  ToRecordBatches()
Backtrace:
     ▆
  1. ├─arrow:::compare_dplyr_binding(...) at test-dplyr-funcs-datetime.R:631:2
  2. │ ├─testthat::expect_warning(...) at tests/testthat/helper-expectation.R:94:2
  3. │ │ └─testthat:::expect_condition_matching(...)
  4. │ │   └─testthat:::quasi_capture(...)
  5. │ │     ├─testthat (local) .capture(...)
  6. │ │     │ └─base::withCallingHandlers(...)
  7. │ │     └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
  8. │ └─rlang::eval_tidy(expr, rlang::new_data_mask(rlang::env(.input = arrow_table(tbl))))
  9. ├─... %>% collect()
 10. ├─dplyr::collect(.)
 11. └─arrow:::collect.arrow_dplyr_query(.)
 12.   └─arrow:::compute.arrow_dplyr_query(x)
 13.     └─base::tryCatch(...)
 14.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 15.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 16.           └─value[[3L]](cond)
 17.             └─arrow:::augment_io_error_msg(e, call, schema = schema())
 18.               └─rlang::abort(msg, call = call)
── Error ('test-dplyr-funcs-datetime.R:713:3'): extract wday from timestamp ────
Error in `compute.arrow_dplyr_query(x)`: Invalid: Cannot find locale 'en_US.UTF-8': locale::facet::_S_create_c_locale name not valid
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1174  GetLocale(options.locale)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1181  Make(ctx, *in.type)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec.cc:857  kernel_->exec(kernel_ctx_, input, &output)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:607  executor->Execute( ExecBatch(std::move(arguments), all_scalar ? 1 : input.length), &listener)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:589  ExecuteScalarExpression(call->arguments[i], input, exec_context)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/project_node.cc:91  ExecuteScalarExpression(simplified_expr, target, plan()->query_context()->exec_context())
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:334  ReadNext(&batch)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:348  ToRecordBatches()
Backtrace:
     ▆
  1. ├─arrow:::compare_dplyr_binding(...) at test-dplyr-funcs-datetime.R:713:2
  2. │ ├─testthat::expect_warning(...) at tests/testthat/helper-expectation.R:94:2
  3. │ │ └─testthat:::expect_condition_matching(...)
  4. │ │   └─testthat:::quasi_capture(...)
  5. │ │     ├─testthat (local) .capture(...)
  6. │ │     │ └─base::withCallingHandlers(...)
  7. │ │     └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
  8. │ └─rlang::eval_tidy(expr, rlang::new_data_mask(rlang::env(.input = arrow_table(tbl))))
  9. ├─... %>% collect()
 10. ├─dplyr::collect(.)
 11. └─arrow:::collect.arrow_dplyr_query(.)
 12.   └─arrow:::compute.arrow_dplyr_query(x)
 13.     └─base::tryCatch(...)
 14.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 15.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 16.           └─value[[3L]](cond)
 17.             └─arrow:::augment_io_error_msg(e, call, schema = schema())
 18.               └─rlang::abort(msg, call = call)
── Error ('test-dplyr-funcs-datetime.R:894:3'): extract month from date ────────
Error in `compute.arrow_dplyr_query(x)`: Invalid: Cannot find locale 'en_US.UTF-8': locale::facet::_S_create_c_locale name not valid
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1174  GetLocale(options.locale)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1181  Make(ctx, *in.type)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec.cc:857  kernel_->exec(kernel_ctx_, input, &output)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:607  executor->Execute( ExecBatch(std::move(arguments), all_scalar ? 1 : input.length), &listener)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:589  ExecuteScalarExpression(call->arguments[i], input, exec_context)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/project_node.cc:91  ExecuteScalarExpression(simplified_expr, target, plan()->query_context()->exec_context())
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:334  ReadNext(&batch)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:348  ToRecordBatches()
Backtrace:
     ▆
  1. ├─arrow:::compare_dplyr_binding(...) at test-dplyr-funcs-datetime.R:894:2
  2. │ ├─testthat::expect_warning(...) at tests/testthat/helper-expectation.R:94:2
  3. │ │ └─testthat:::expect_condition_matching(...)
  4. │ │   └─testthat:::quasi_capture(...)
  5. │ │     ├─testthat (local) .capture(...)
  6. │ │     │ └─base::withCallingHandlers(...)
  7. │ │     └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
  8. │ └─rlang::eval_tidy(expr, rlang::new_data_mask(rlang::env(.input = arrow_table(tbl))))
  9. ├─... %>% collect()
 10. ├─dplyr::collect(.)
 11. └─arrow:::collect.arrow_dplyr_query(.)
 12.   └─arrow:::compute.arrow_dplyr_query(x)
 13.     └─base::tryCatch(...)
 14.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 15.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 16.           └─value[[3L]](cond)
 17.             └─arrow:::augment_io_error_msg(e, call, schema = schema())
 18.               └─rlang::abort(msg, call = call)
── Error ('test-dplyr-funcs-datetime.R:949:3'): extract wday from date ─────────
Error in `compute.arrow_dplyr_query(x)`: Invalid: Cannot find locale 'en_US.UTF-8': locale::facet::_S_create_c_locale name not valid
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1174  GetLocale(options.locale)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1181  Make(ctx, *in.type)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec.cc:857  kernel_->exec(kernel_ctx_, input, &output)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:607  executor->Execute( ExecBatch(std::move(arguments), all_scalar ? 1 : input.length), &listener)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:589  ExecuteScalarExpression(call->arguments[i], input, exec_context)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/project_node.cc:91  ExecuteScalarExpression(simplified_expr, target, plan()->query_context()->exec_context())
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:334  ReadNext(&batch)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:348  ToRecordBatches()
Backtrace:
     ▆
  1. ├─arrow:::compare_dplyr_binding(...) at test-dplyr-funcs-datetime.R:949:2
  2. │ ├─testthat::expect_warning(...) at tests/testthat/helper-expectation.R:94:2
  3. │ │ └─testthat:::expect_condition_matching(...)
  4. │ │   └─testthat:::quasi_capture(...)
  5. │ │     ├─testthat (local) .capture(...)
  6. │ │     │ └─base::withCallingHandlers(...)
  7. │ │     └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
  8. │ └─rlang::eval_tidy(expr, rlang::new_data_mask(rlang::env(.input = arrow_table(tbl))))
  9. ├─... %>% collect()
 10. ├─dplyr::collect(.)
 11. └─arrow:::collect.arrow_dplyr_query(.)
 12.   └─arrow:::compute.arrow_dplyr_query(x)
 13.     └─base::tryCatch(...)
 14.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 15.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 16.           └─value[[3L]](cond)
 17.             └─arrow:::augment_io_error_msg(e, call, schema = schema())
 18.               └─rlang::abort(msg, call = call)
── Error ('test-dplyr-funcs-datetime.R:1168:3'): month() supports integer input ──
Error in `compute.arrow_dplyr_query(x)`: Invalid: Cannot find locale 'en_US.UTF-8': locale::facet::_S_create_c_locale name not valid
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1174  GetLocale(options.locale)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1181  Make(ctx, *in.type)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec.cc:857  kernel_->exec(kernel_ctx_, input, &output)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:607  executor->Execute( ExecBatch(std::move(arguments), all_scalar ? 1 : input.length), &listener)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:589  ExecuteScalarExpression(call->arguments[i], input, exec_context)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/project_node.cc:91  ExecuteScalarExpression(simplified_expr, target, plan()->query_context()->exec_context())
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:334  ReadNext(&batch)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:348  ToRecordBatches()
Backtrace:
     ▆
  1. ├─arrow:::compare_dplyr_binding(...) at test-dplyr-funcs-datetime.R:1168:2
  2. │ ├─testthat::expect_warning(...) at tests/testthat/helper-expectation.R:94:2
  3. │ │ └─testthat:::expect_condition_matching(...)
  4. │ │   └─testthat:::quasi_capture(...)
  5. │ │     ├─testthat (local) .capture(...)
  6. │ │     │ └─base::withCallingHandlers(...)
  7. │ │     └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
  8. │ └─rlang::eval_tidy(expr, rlang::new_data_mask(rlang::env(.input = arrow_table(tbl))))
  9. ├─... %>% collect()
 10. ├─dplyr::collect(.)
 11. └─arrow:::collect.arrow_dplyr_query(.)
 12.   └─arrow:::compute.arrow_dplyr_query(x)
 13.     └─base::tryCatch(...)
 14.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 15.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 16.           └─value[[3L]](cond)
 17.             └─arrow:::augment_io_error_msg(e, call, schema = schema())
 18.               └─rlang::abort(msg, call = call)
── Failure ('test-dplyr-funcs-datetime.R:2079:3'): parse_date_time() works with year, month, and date components ──
`object` (`actual`) not equal to `expected` (`expected`).

actual vs expected
                 parsed_date_ymd parsed_date_ymd2
  actual[9, ]         2021-09-09       2021-09-09
  actual[10, ]        2021-09-10       2021-09-10
  actual[11, ]        2021-09-11       2021-09-11
- actual[12, ]                NA               NA
+ expected[12, ]      2021-09-12       2021-09-12
  actual[13, ]        2021-09-13       2021-09-13
  actual[14, ]                NA               NA

     actual$parsed_date_ymd | expected$parsed_date_ymd     
 [9] "2021-09-09"           | "2021-09-09"             [9]
[10] "2021-09-10"           | "2021-09-10"             [10]
[11] "2021-09-11"           | "2021-09-11"             [11]
[12] NA                     - "2021-09-12"             [12]
[13] "2021-09-13"           | "2021-09-13"             [13]
[14] NA                     | NA                       [14]

     actual$parsed_date_ymd2 | expected$parsed_date_ymd2     
 [9] "2021-09-09"            | "2021-09-09"              [9]
[10] "2021-09-10"            | "2021-09-10"              [10]
[11] "2021-09-11"            | "2021-09-11"              [11]
[12] NA                      - "2021-09-12"              [12]
[13] "2021-09-13"            | "2021-09-13"              [13]
[14] NA                      | NA                        [14]
Backtrace:
    ▆
 1. └─arrow:::compare_dplyr_binding(...) at test-dplyr-funcs-datetime.R:2079:2
 2.   └─arrow:::expect_equal(via_table, expected, ...) at tests/testthat/helper-expectation.R:101:2
 3.     └─testthat::expect_equal(...) at tests/testthat/helper-expectation.R:42:4
── Error ('test-dplyr-funcs-datetime.R:3552:3'): timestamp rounding takes place in local time ──
<dplyr:::mutate_error/rlang_error/error/condition>
Error in `mutate(., utc_floored = floor_date(utc_time, unit = unit), utc_rounded = round_date(utc_time,
    unit = unit), utc_ceiling = ceiling_date(utc_time, unit = unit),
    syd_floored = floor_date(syd_time, unit = unit), syd_rounded = round_date(syd_time,
        unit = unit), syd_ceiling = ceiling_date(syd_time, unit = unit),
    adl_floored = floor_date(adl_time, unit = unit), adl_rounded = round_date(adl_time,
        unit = unit), adl_ceiling = ceiling_date(adl_time, unit = unit),
    mar_floored = floor_date(mar_time, unit = unit), mar_rounded = round_date(mar_time,
        unit = unit), mar_ceiling = ceiling_date(mar_time, unit = unit),
    kat_floored = floor_date(kat_time, unit = unit), kat_rounded = round_date(kat_time,
        unit = unit), kat_ceiling = ceiling_date(kat_time, unit = unit))`: i In argument: `kat_floored = floor_date(kat_time, unit = unit)`.
Caused by error in `C_time_floor()`:
! CCTZ: Invalid timezone of the input vector: "Asia/Kathmandu"
Backtrace:
     ▆
  1. ├─tz_times %>% ... at test-dplyr-funcs-datetime.R:3552:2
  2. ├─arrow (local) check_timezone_rounding_vs_lubridate(., ".001 second")
  3. │ └─arrow:::compare_dplyr_binding(...) at test-dplyr-funcs-datetime.R:3471:2
  4. │   └─rlang::eval_tidy(expr, rlang::new_data_mask(rlang::env(.input = tbl))) at tests/testthat/helper-expectation.R:86:2
  5. ├─... %>% collect()
  6. ├─dplyr::collect(.)
  7. ├─dplyr::mutate(...)
  8. ├─dplyr:::mutate.data.frame(...)
  9. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
 10. │   ├─base::withCallingHandlers(...)
 11. │   └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
 12. │     └─mask$eval_all_mutate(quo)
 13. │       └─dplyr (local) eval()
 14. ├─lubridate::floor_date(kat_time, unit = unit)
 15. │ └─timechange::time_floor(x, unit = unit, week_start = as_week_start(week_start))
 16. │   ├─timechange:::from_posixct(...)
 17. │   └─timechange:::C_time_floor(...)
 18. └─base::.handleSimpleError(...)
 19.   └─dplyr (local) h(simpleError(msg, call))
 20.     └─rlang::abort(message, class = error_class, parent = parent, call = error_call)
── Error ('test-dplyr-funcs-type.R:859:3'): format date/time ───────────────────
Error in `compute.arrow_dplyr_query(x)`: Invalid: Cannot find locale 'en_US.UTF-8': locale::facet::_S_create_c_locale name not valid
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1174  GetLocale(options.locale)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc:1181  Make(ctx, *in.type)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec.cc:857  kernel_->exec(kernel_ctx_, input, &output)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/expression.cc:607  executor->Execute( ExecBatch(std::move(arguments), all_scalar ? 1 : input.length), &listener)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/compute/exec/project_node.cc:91  ExecuteScalarExpression(simplified_expr, target, plan()->query_context()->exec_context())
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:334  ReadNext(&batch)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_apache-arrow/apache-arrow/work/arrow-11.0.0/cpp/src/arrow/record_batch.cc:348  ToRecordBatches()
Backtrace:
     ▆
  1. ├─arrow:::compare_dplyr_binding(...) at test-dplyr-funcs-type.R:859:2
  2. │ ├─testthat::expect_warning(...) at tests/testthat/helper-expectation.R:94:2
  3. │ │ └─testthat:::expect_condition_matching(...)
  4. │ │   └─testthat:::quasi_capture(...)
  5. │ │     ├─testthat (local) .capture(...)
  6. │ │     │ └─base::withCallingHandlers(...)
  7. │ │     └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
  8. │ └─rlang::eval_tidy(expr, rlang::new_data_mask(rlang::env(.input = arrow_table(tbl))))
  9. ├─... %>% collect()
 10. ├─dplyr::collect(.)
 11. └─arrow:::collect.arrow_dplyr_query(.)
 12.   └─arrow:::compute.arrow_dplyr_query(x)
 13.     └─base::tryCatch(...)
 14.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 15.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 16.           └─value[[3L]](cond)
 17.             └─arrow:::augment_io_error_msg(e, call, schema = schema())
 18.               └─rlang::abort(msg, call = call)
── Error ('test-s3-minio.R:18:1'): (code run outside of `test_that()`) ─────────
Error: invalid version specification '10.0.0d2'
Backtrace:
    ▆
 1. └─arrow:::skip_if_not_available("s3") at test-s3-minio.R:18:0
 2.   └─arrow:::on_macos_10_13_or_lower() at tests/testthat/helper-skip.R:44:4
 3.     └─base::package_version(unname(Sys.info()["release"]))
 4.       └─base::.make_numeric_version(...)
── Error ('test-s3.R:18:1'): (code run outside of `test_that()`) ───────────────
Error: invalid version specification '10.0.0d2'
Backtrace:
    ▆
 1. └─arrow:::skip_if_not_available("s3") at test-s3.R:18:0
 2.   └─arrow:::on_macos_10_13_or_lower() at tests/testthat/helper-skip.R:44:4
 3.     └─base::package_version(unname(Sys.info()["release"]))
 4.       └─base::.make_numeric_version(...)

[ FAIL 11 | WARN 16 | SKIP 111 | PASS 6586 ]
Error: Test failures
Execution halted

arrow_info():

> library("arrow")
Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> arrow_info()
Arrow package version: 11.0.0.3

Capabilities:

dataset   FALSE
substrait FALSE
parquet   FALSE
json      FALSE
s3        FALSE
gcs       FALSE
utf8proc   TRUE
re2        TRUE
snappy     TRUE
gzip       TRUE
brotli     TRUE
zstd       TRUE
lz4        TRUE
lz4_frame  TRUE
lzo       FALSE
bz2        TRUE
jemalloc   TRUE
mimalloc  FALSE

Memory:

Allocator jemalloc
Current    0 bytes
Max        0 bytes

Runtime:

SIMD Level          none
Detected SIMD Level none

Build:

C++ Library Version  11.0.0
C++ Compiler            GNU
C++ Compiler Version 11.3.0

Component(s)

R

nealrichardson commented 1 year ago

cc @rok. I'm not sure how locale is handled in the C++ library but looks like something isn't happy on PPC?
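
A quick way to see whether the locale named in the error exists on a given machine is to try switching to it from R. This is only a sketch: `locale_available()` is a hypothetical helper, not part of arrow, and it asks the C library through `Sys.setlocale()`, which may not accept exactly the same names as the C++ `std::locale` constructor that the Arrow kernel calls.

```r
# Hypothetical helper (not part of arrow): TRUE if the C library accepts
# the given locale name for the given category.
locale_available <- function(name, category = "LC_TIME") {
  old <- Sys.getlocale(category)
  # Restore the previous setting no matter what happens below.
  on.exit(Sys.setlocale(category, old), add = TRUE)
  # Sys.setlocale() returns "" (with a warning) when the request fails.
  result <- suppressWarnings(Sys.setlocale(category, name))
  nzchar(result)
}

locale_available("C")
locale_available("en_US.UTF-8")
```

If the second call returns FALSE, the locale is missing at the OS level; if it returns TRUE and Arrow still fails, the C++ standard library is the more likely culprit.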

barracuda156 commented 1 year ago

P.S. I'm not sure why the R arrow package reports these as disabled:

parquet   FALSE
json      FALSE

Arrow itself was built with support for both. The complete configuration is here: https://github.com/barracuda156/macports-ports/blob/f34cdb879bd40e6d4b5247a19ab4cb630738a443/devel/apache-arrow/Portfile#L120-L153

rok commented 1 year ago

Hey. It seems like there are two problems here:

  1. The locale error is probably thrown here: https://github.com/apache/arrow/blob/49631057e9cdbf991e11e0be4b9aa0dadf616850/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc#L1157 This seems like a PPC issue. Perhaps try export LC_ALL="C" to see what happens?
  2. Timezone not found (CCTZ: Invalid timezone of the input vector: "Asia/Kathmandu"), which you might be able to resolve with the instructions here: https://howardhinnant.github.io/date/tz.html#Installation
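
The timezone half of this can be checked from R before changing anything in the installation. A minimal sketch, assuming base R's `OlsonNames()` reads the same tz database that lubridate and timechange consult (CCTZ may look in a different directory, so treat the result as a hint); `tz_known()` is a hypothetical helper:

```r
# Hypothetical helper: is the zone name present in the tz database
# that this R installation can see?
tz_known <- function(tz) {
  # OlsonNames() warns if no zone files are found; the result is then empty.
  tz %in% suppressWarnings(OlsonNames())
}

tz_known("Asia/Kathmandu")
# Older tz databases may only carry the previous spelling of this zone.
tz_known("Asia/Katmandu")
```

If only the older spelling is found, the system's zone files are probably out of date rather than missing.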
barracuda156 commented 1 year ago

@rok Thank you, I will check these.

Could we also fix the detection of older macOS versions, to get rid of this error?

Error: invalid version specification '10.0.0d2'

rok commented 1 year ago

@barracuda156 I'm not sure where the invalid version specification error is thrown. It looks like a generic R error (https://github.com/rstudio/reticulate/issues/204#issuecomment-377403623). @nealrichardson, any ideas?

barracuda156 commented 1 year ago

@rok Well, here it is actually the OS version, and R itself recognizes it:

svacchanda$ r

R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: powerpc-apple-darwin10.0.0d2 (32-bit)
> R.Version()
$platform
[1] "powerpc-apple-darwin10.0.0d2"

$arch
[1] "powerpc"

$os
[1] "darwin10.0.0d2"

$system
[1] "powerpc, darwin10.0.0d2"

$status
[1] ""

$major
[1] "4"

$minor
[1] "2.3"

$year
[1] "2023"

$month
[1] "03"

$day
[1] "15"

$`svn rev`
[1] "83980"

$language
[1] "R"

$version.string
[1] "R version 4.2.3 (2023-03-15)"

$nickname
[1] "Shortstop Beagle"

nealrichardson commented 1 year ago

The version error comes from a test helper we have in the R package. It's an easy fix and is not connected to the locale issues in C++.

nealrichardson commented 1 year ago

The invalid version specification error comes from here: https://github.com/apache/arrow/blob/main/r/R/arrow-package.R#L210 because it seems Sys.info()["release"] on your setup is not parseable as a numeric version. This should work around it:

diff --git a/r/R/arrow-package.R b/r/R/arrow-package.R
index a3c860a51c..9fc89b46ea 100644
--- a/r/R/arrow-package.R
+++ b/r/R/arrow-package.R
@@ -207,7 +207,8 @@ on_linux_dev <- function() {

 on_macos_10_13_or_lower <- function() {
   identical(unname(Sys.info()["sysname"]), "Darwin") &&
-    package_version(unname(Sys.info()["release"])) < "18.0.0"
+    # wrap in isTRUE because package_version can return NA
+    isTRUE(package_version(unname(Sys.info()["release"]), strict = FALSE) < "18.0.0")
 }

 option_use_threads <- function() {
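
The behaviour the patch relies on can be tried in isolation. The sketch below is illustrative only: `release_is_below()` and `release_is_below_trimmed()` are hypothetical names, and the second function is an alternative that is not part of the proposed diff.

```r
# Hypothetical helper mirroring the patched comparison: an unparseable
# release string becomes NA under strict = FALSE, and isTRUE() maps the
# resulting NA comparison to FALSE instead of raising an error.
release_is_below <- function(release, threshold) {
  isTRUE(package_version(release, strict = FALSE) < threshold)
}

# Alternative sketch (an assumption, not in the patch): drop everything
# from the first character that is neither a digit nor a dot, so that
# "10.0.0d2" is compared as "10.0.0".
release_is_below_trimmed <- function(release, threshold) {
  trimmed <- sub("[^0-9.].*$", "", release)
  isTRUE(package_version(trimmed, strict = FALSE) < threshold)
}

release_is_below("10.0.0d2", "18.0.0")
release_is_below_trimmed("10.0.0d2", "18.0.0")
```

With the patched comparison an unparseable release is treated as "not below the threshold", which avoids the error but also means such a system is not recognised as an old macOS; the trimmed variant shows one way to keep the comparison meaningful.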