apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[C++] R Session Aborted, R encountered a fatal error after write_dataset command #34689

Open EmilySChoy opened 1 year ago

EmilySChoy commented 1 year ago

Describe the bug, including details regarding any error messages, version, and platform.

I am trying to save GPS data from an online biologging database to a local Arrow dataset using RStudio with R version 4.2.3. After I run the following code, I get the error "R Session Aborted. R encountered a fatal error. The session was terminated."

gps %>%
  filter(
    dep_id %in% dd, # Only dep_ids from our list dd
    deployed == 1 # only data collected on the bird
  ) %>%
  collect() %>%
  group_by(site, subsite, species, year, metal_band, dep_id) %>%
  arrow::write_dataset('raw_data/gps', format = "csv")

The problem seems to be with the write_dataset function from the arrow package, but I'm not sure why and what I could try to prevent R from Aborting.

Any troubleshooting advice would be greatly appreciated!

Thank you in advance!

Component(s)

R

thisisnic commented 1 year ago

Hi @EmilySChoy, thanks for reporting this!

I'm afraid that's not quite enough information for us to be able to help easily. If the dataset is publicly available, can you show us all of the code you ran to access it? Also, which version of Arrow are you using? You can get that info by running arrow::arrow_info().

It sounds like there is a segfault when you run the code, which causes R to abort. This makes it trickier to get an error message out for reporting issues. One other thing to try is running the code from a session with the debugger attached, to get a more complete error message: https://arrow.apache.org/docs/r/articles/developers/debugging.html

Just to check; if you just run

gps %>%
  filter(
    dep_id %in% dd, # Only dep_ids from our list dd
    deployed == 1 # only data collected on the bird
  ) %>%
  collect()

does that work without any issues?
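
On the point above about the session aborting: a crash in native code takes the whole R process down, so one option is to run the suspect code in a separate R process and look at its exit status and output from a session that stays alive. The sketch below uses only base R; `run_in_fresh_session()` is a hypothetical helper name, not something provided by arrow, and the snippet passed to it is a harmless stand-in.

```r
# Hypothetical helper (not part of arrow): run R code in a child process so
# that a crash there cannot abort the calling session.
run_in_fresh_session <- function(code) {
  script <- tempfile(fileext = ".R")
  writeLines(code, script)
  on.exit(unlink(script))

  rscript <- file.path(R.home("bin"), "Rscript")
  # With stdout = TRUE, system2() returns the captured output and attaches a
  # "status" attribute only when the exit status is non-zero.
  output <- suppressWarnings(
    system2(rscript, c("--vanilla", shQuote(script)),
            stdout = TRUE, stderr = TRUE)
  )
  status <- attr(output, "status")
  list(
    status = if (is.null(status)) 0L else as.integer(status),
    output = as.character(output)
  )
}

# A harmless stand-in for the code under suspicion.
result <- run_in_fresh_session('cat("still alive\\n")')
result$status
result$output
```

To use it on the real problem, pass the failing pipeline as the `code` string; a non-zero `status` together with the captured `output` is the useful part to paste into a report.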

EmilySChoy commented 1 year ago

Thank you @thisisnic

The data is on a private online AWS cloud for my lab. The arrow version I have is 11.0.0.3

The code you have provided works without any issues. I get:

FileSystemDataset (query)
time: timestamp[us, tz=UTC]
lon: double
lat: double
altitude_m: int32
satellites: int32
hdop: double
inrange: bool
site: string
subsite: string
species: string
year: int32
metal_band: int32
dep_id: string
deployed: int32

thisisnic commented 1 year ago

Could I just get you to run that code with collect() at the end as well? I just want to pin down whether it's the opening or writing which is causing the issue.

EmilySChoy commented 1 year ago

Hi! I ran

gps %>% filter( dep_id %in% dd) %>% collect()

and got

# A tibble: 16,112 × 14
   time                  lon   lat altitude_m satellites  hdop inrange site   subsite
   <dttm>              <dbl> <dbl>      <int>      <int> <dbl> <lgl>   <chr>  <chr>  
 1 2019-05-18 18:18:57 -146.  59.4        -12          2   5.3 NA      Middl… Tower  
 2 2019-05-18 18:21:31 -146.  59.4          0          4   2.2 NA      Middl… Tower  
 3 2019-05-18 18:24:03 -146.  59.4         66          4   2.3 NA      Middl… Tower  
 4 2019-05-18 18:27:02 -146.  59.4         48          4   2.4 NA      Middl… Tower  
 5 2019-05-18 18:30:01 -146.  59.4        -72          4   2.4 NA      Middl… Tower  
 6 2019-05-18 18:33:02 -146.  59.4        -76          3  12.1 NA      Middl… Tower  
 7 2019-05-18 18:36:03 -146.  59.4        -14          4   3   NA      Middl… Tower  
 8 2019-05-18 18:39:03 -146.  59.4        114          4   3   NA      Middl… Tower  
 9 2019-05-18 18:42:00 -146.  59.4         64          4   2.4 NA      Middl… Tower  
10 2019-05-18 18:45:00 -146.  59.4        150          3   2.4 NA      Middl… Tower  
# … with 16,102 more rows, and 5 more variables: species <chr>, year <int>,
#   metal_band <int>, dep_id <chr>, deployed <int>
# ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable name

thisisnic commented 1 year ago

OK, great, that narrows it down to looking like it is the write_dataset() step where the error lies.

Would you mind giving me a bit more info by running arrow::arrow_info() and then sessionInfo() and showing me the output of both?

As something else to try, now that the data is in your R session: if you only write a subset of rows or columns to disk, do you still have the same problem? If we can narrow it down to the problematic rows or columns, that would help. Alternatively, start by writing just a single row or column, so we can rule out a more general issue. Let me know how you get on!
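
A minimal sketch of that narrowing-down step, under stated assumptions: `gps_df` is made-up stand-in data (the real dataset is private), and `write_columns_one_by_one()` is a hypothetical helper, not an arrow function. It announces each column before writing it, so if the session aborts, the last name printed points at the column being written at the time.

```r
# Stand-in for the collected GPS data; all values here are invented.
gps_df <- data.frame(
  time = as.POSIXct(c("2019-05-18 18:18:57", "2019-05-18 18:21:31"), tz = "UTC"),
  lon = c(-146.3, -146.3),
  lat = c(59.4, 59.4),
  site = c("site_a", "site_a"),
  dep_id = c("dep_1", "dep_1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: write each column to its own temporary directory,
# printing the column name first. `write_fun` is any function taking
# (data, path, ...), such as arrow::write_dataset.
write_columns_one_by_one <- function(df, write_fun, ...) {
  out_dirs <- character(0)
  for (col in names(df)) {
    message("Writing column: ", col)
    out_dir <- tempfile(pattern = paste0("arrow_", col, "_"))
    write_fun(df[, col, drop = FALSE], out_dir, ...)
    out_dirs[col] <- out_dir
  }
  out_dirs
}

if (requireNamespace("arrow", quietly = TRUE)) {
  # Step 1: a single row, to rule out a general problem with writing.
  arrow::write_dataset(gps_df[1, , drop = FALSE], tempfile(), format = "csv")
  # Step 2: one column at a time.
  written <- write_columns_one_by_one(gps_df, arrow::write_dataset, format = "csv")
}
```

The same loop can be adapted to slices of rows (for example `df[1:1000, ]`) if every column writes cleanly on its own.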

EmilySChoy commented 1 year ago

Thank you @thisisnic! Here is the output of both commands:

arrow::arrow_info()

Arrow package version: 11.0.0.3

Capabilities:

dataset    TRUE
substrait FALSE
parquet    TRUE
json       TRUE
s3         TRUE
gcs        TRUE
utf8proc   TRUE
re2        TRUE
snappy     TRUE
gzip       TRUE
brotli     TRUE
zstd       TRUE
lz4        TRUE
lz4_frame  TRUE
lzo       FALSE
bz2        TRUE
jemalloc  FALSE
mimalloc   TRUE

Arrow options():

arrow.use_threads FALSE

Memory:

Allocator mimalloc
Current    1.69 Mb
Max        1.79 Mb

Runtime:

SIMD Level          avx2
Detected SIMD Level avx2

Build:

C++ Library Version  11.0.0
C++ Compiler         GNU
C++ Compiler Version 10.3.0
Git ID               58286965ec6974f700ff9fe3f7dcbe56095878d7

sessionInfo()

R version 4.2.3 (2023-03-15 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_Canada.utf8  LC_CTYPE=English_Canada.utf8
[3] LC_MONETARY=English_Canada.utf8 LC_NUMERIC=C
[5] LC_TIME=English_Canada.utf8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] dplyr_1.0.10      config_0.3.1      arrow_11.0.0.3    RPostgres_1.4.5
[5] RPostgreSQL_0.7-5 DBI_1.1.3

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9       dbplyr_2.2.1     pillar_1.8.1     compiler_4.2.3
 [5] seabiRds_0.1.0   tools_4.2.3      bit_4.0.4        lubridate_1.8.0
 [9] lifecycle_1.0.2  tibble_3.1.8     lattice_0.20-45  pkgconfig_2.0.3
[13] rlang_1.0.5      cli_3.4.0        rstudioapi_0.14  yaml_2.3.5
[17] crawl_2.3.0      mvtnorm_1.1-3    terra_1.6-17     raster_3.6-3
[21] generics_0.1.3   vctrs_0.4.1      hms_1.1.2        bit64_4.0.5
[25] grid_4.2.3       tidyselect_1.1.2 glue_1.6.2       R6_2.5.1
[29] fansi_1.0.3      sp_1.5-0         tzdb_0.3.0       purrr_0.3.4
[33] blob_1.2.3       magrittr_2.0.3   codetools_0.2-19 ellipsis_0.3.2
[37] assertthat_0.2.1 utf8_1.2.2

I will try writing a subset of rows to disk as requested and will get back to you ASAP.

tdhock commented 1 year ago

Hi, I'm getting the same issue when I run the first few lines of example("write_dataset",package="arrow"), shown below.

library(arrow)
sessionInfo()
one_level_tree <- tempfile()
write_dataset(mtcars, one_level_tree, partitioning = "cyl")

I am using Ubuntu 22.04 (jammy), with two versions of R/GCC, both of which segfault when running the R code above.

GCC 13.1 + R 4.3.0 both built from source below.

(base) tdhock@tdhock-MacBook:~$ bin/R --vanilla < R/arrow-crash.R

R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8   
 [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Phoenix
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] arrow_11.0.0.3

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0 bit_4.0.5        compiler_4.3.0   magrittr_2.0.3  
 [5] assertthat_0.2.1 R6_2.5.1         cli_3.6.1        glue_1.6.2      
 [9] bit64_4.0.5      vctrs_0.6.2      lifecycle_1.0.3  rlang_1.1.1     
[13] purrr_1.0.1     
> one_level_tree <- tempfile()
> write_dataset(mtcars, one_level_tree, partitioning = "cyl")

 *** caught illegal operation ***
address 0x7f976687f2c7, cause 'illegal operand'

Traceback:
 1: ExecPlan_Write(self, node, prepare_key_value_metadata(node$final_metadata()),     ...)
 2: plan$Write(final_node, options, path_and_fs$fs, path_and_fs$path,     partitioning, basename_template, existing_data_behavior,     max_partitions, max_open_files, max_rows_per_file, min_rows_per_group,     max_rows_per_group)
 3: write_dataset(mtcars, one_level_tree, partitioning = "cyl")
An irrecoverable exception occurred. R is aborting now ...
Illegal instruction (core dumped)

GCC 11.3 + R 4.1.2 both provided by Ubuntu binary packages below,

(base) tdhock@tdhock-MacBook:~$ R --vanilla < R/arrow-crash.R

R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8   
 [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] arrow_11.0.0.3

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.2 bit_4.0.4        compiler_4.1.2   magrittr_2.0.2  
 [5] assertthat_0.2.1 R6_2.5.1         cli_3.2.0        glue_1.6.1      
 [9] bit64_4.0.5      vctrs_0.3.8      rlang_1.0.1      purrr_0.3.4     
> one_level_tree <- tempfile()
> write_dataset(mtcars, one_level_tree, partitioning = "cyl")

 *** caught illegal operation ***
address 0x7fb67b3ea027, cause 'illegal operand'

Traceback:
 1: ExecPlan_Write(self, node, prepare_key_value_metadata(node$final_metadata()),     ...)
 2: plan$Write(final_node, options, path_and_fs$fs, path_and_fs$path,     partitioning, basename_template, existing_data_behavior,     max_partitions, max_open_files, max_rows_per_file, min_rows_per_group,     max_rows_per_group)
 3: write_dataset(mtcars, one_level_tree, partitioning = "cyl")
An irrecoverable exception occurred. R is aborting now ...
Illegal instruction (core dumped)

Running either through valgrind works fine (no segfault), see below.

(base) tdhock@tdhock-MacBook:~$ R -d valgrind --vanilla < R/arrow-crash.R
==97829== Memcheck, a memory error detector
==97829== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==97829== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
==97829== Command: /usr/lib/R/bin/exec/R --vanilla
==97829== 

R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8   
 [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] arrow_11.0.0.3

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.2 bit_4.0.4        compiler_4.1.2   magrittr_2.0.2  
 [5] assertthat_0.2.1 R6_2.5.1         cli_3.2.0        glue_1.6.1      
 [9] bit64_4.0.5      vctrs_0.3.8      rlang_1.0.1      purrr_0.3.4     
> one_level_tree <- tempfile()
> write_dataset(mtcars, one_level_tree, partitioning = "cyl")
> 
==97829== 
==97829== HEAP SUMMARY:
==97829==     in use at exit: 112,522,453 bytes in 23,761 blocks
==97829==   total heap usage: 135,614 allocs, 111,853 frees, 259,936,158 bytes allocated
==97829== 
==97829== LEAK SUMMARY:
==97829==    definitely lost: 0 bytes in 0 blocks
==97829==    indirectly lost: 0 bytes in 0 blocks
==97829==      possibly lost: 12,752 bytes in 8 blocks
==97829==    still reachable: 112,509,701 bytes in 23,753 blocks
==97829==                       of which reachable via heuristic:
==97829==                         newarray           : 4,264 bytes in 1 blocks
==97829==         suppressed: 0 bytes in 0 blocks
==97829== Rerun with --leak-check=full to see details of leaked memory
==97829== 
==97829== For lists of detected and suppressed errors, rerun with: -s
==97829== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

All of the above is for the current CRAN version arrow_11.0.0.3 but this also happens with arrow version 10, see below,

(base) tdhock@tdhock-MacBook:~$ R CMD INSTALL ~/Downloads/arrow_10.0.0.tar.gz 
Loading required package: grDevices
* installing to library ‘/home/tdhock/R/x86_64-pc-linux-gnu-library/4.1’
* installing *source* package ‘arrow’ ...
** package ‘arrow’ successfully unpacked and MD5 sums checked
** using staged installation
Loading required package: grDevices
*** Found libcurl and openssl >= 3.0.0
PKG_CFLAGS=-I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS
PKG_LIBS=-L/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/lib -larrow_dataset -lparquet -L/arrow/r/libarrow/dist/lib -larrow -larrow_bundled_dependencies -larrow -larrow_bundled_dependencies -larrow_dataset -lparquet -lssl -lcrypto -lcurl -lssl -lcrypto -lcurl
** libs
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c RTasks.cpp -o RTasks.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c altrep.cpp -o altrep.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c array.cpp -o array.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c array_to_vector.cpp -o array_to_vector.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c arraydata.cpp -o arraydata.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c arrowExports.cpp -o arrowExports.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c bridge.cpp -o bridge.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c buffer.cpp -o buffer.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c chunkedarray.cpp -o chunkedarray.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c compression.cpp -o compression.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c compute-exec.cpp -o compute-exec.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c compute.cpp -o compute.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c config.cpp -o config.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c csv.cpp -o csv.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c dataset.cpp -o dataset.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c datatype.cpp -o datatype.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c expression.cpp -o expression.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c extension-impl.cpp -o extension-impl.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c feather.cpp -o feather.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c field.cpp -o field.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c filesystem.cpp -o filesystem.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c imports.cpp -o imports.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c io.cpp -o io.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c json.cpp -o json.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c memorypool.cpp -o memorypool.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c message.cpp -o message.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c parquet.cpp -o parquet.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c r_to_arrow.cpp -o r_to_arrow.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c recordbatch.cpp -o recordbatch.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c recordbatchreader.cpp -o recordbatchreader.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c recordbatchwriter.cpp -o recordbatchwriter.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c safe-call-into-r-impl.cpp -o safe-call-into-r-impl.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c scalar.cpp -o scalar.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c schema.cpp -o schema.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c symbols.cpp -o symbols.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c table.cpp -o table.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c threadpool.cpp -o threadpool.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/usr/lib/R/site-library/cpp11/include'    -fpic  -g -O2 -ffile-prefix-map=/build/r-base-4A2Reg/r-base-4.1.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c type_infer.cpp -o type_infer.o
g++ -std=gnu++17 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -flto=auto -Wl,-z,relro -o arrow.so RTasks.o altrep.o array.o array_to_vector.o arraydata.o arrowExports.o bridge.o buffer.o chunkedarray.o compression.o compute-exec.o compute.o config.o csv.o dataset.o datatype.o expression.o extension-impl.o feather.o field.o filesystem.o imports.o io.o json.o memorypool.o message.o parquet.o r_to_arrow.o recordbatch.o recordbatchreader.o recordbatchwriter.o safe-call-into-r-impl.o scalar.o schema.o symbols.o table.o threadpool.o type_infer.o -L/tmp/Rtmp3363P8/R.INSTALL184d46c388317/arrow/libarrow/arrow-10.0.0/lib -larrow_dataset -lparquet -L/arrow/r/libarrow/dist/lib -larrow -larrow_bundled_dependencies -larrow -larrow_bundled_dependencies -larrow_dataset -lparquet -lssl -lcrypto -lcurl -lssl -lcrypto -lcurl -L/usr/lib/R/lib -lR
installing to /home/tdhock/R/x86_64-pc-linux-gnu-library/4.1/00LOCK-arrow/00new/arrow/libs
** R
** inst
** byte-compile and prepare package for lazy loading
Loading required package: grDevices
** help
*** installing help indices
** building package indices
Loading required package: grDevices
** installing vignettes
** testing if installed package can be loaded from temporary location
Loading required package: grDevices
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
Loading required package: grDevices
** testing if installed package keeps a record of temporary installation path
* DONE (arrow)
(base) tdhock@tdhock-MacBook:~$ R --vanilla < R/arrow-crash.R

R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8   
 [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] arrow_10.0.0

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.2 bit_4.0.4        compiler_4.1.2   magrittr_2.0.2  
 [5] assertthat_0.2.1 R6_2.5.1         cli_3.2.0        glue_1.6.1      
 [9] bit64_4.0.5      vctrs_0.3.8      rlang_1.0.1      purrr_0.3.4     
> one_level_tree <- tempfile()
> write_dataset(mtcars, one_level_tree, partitioning = "cyl")

 *** caught illegal operation ***
address 0x7f44b08be647, cause 'illegal operand'

Traceback:
 1: ExecPlan_Write(self, node, prepare_key_value_metadata(node$final_metadata()),     ...)
 2: plan$Write(final_node, options, path_and_fs$fs, path_and_fs$path,     partitioning, basename_template, existing_data_behavior,     max_partitions, max_open_files, max_rows_per_file, min_rows_per_group,     max_rows_per_group)
 3: write_dataset(mtcars, one_level_tree, partitioning = "cyl")
An irrecoverable exception occurred. R is aborting now ...
Illegal instruction (core dumped)

I get the same crash if I install the pre-compiled arrow binary package from the RStudio package manager, using the code below:

options(
  HTTPUserAgent =
    sprintf(
      "R/%s R (%s)",
      getRversion(),
      paste(getRversion(), R.version["platform"], R.version["arch"], R.version["os"])
    )
)
install.packages("arrow", repos = "https://packagemanager.rstudio.com/all/__linux__/jammy/latest")

Finally, arrow_11.0.0 installed via conda does work on this machine; see below:

(arrow) tdhock@tdhock-MacBook:~$ R --vanilla < R/arrow-crash.R

R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-conda-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(arrow)

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

> sessionInfo()
R version 4.2.3 (2023-03-15)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS/LAPACK: /home/tdhock/miniconda3/envs/arrow/lib/libopenblasp-r0.3.21.so

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8   
 [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] arrow_11.0.0

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0 bit_4.0.5        compiler_4.2.3   magrittr_2.0.3  
 [5] assertthat_0.2.1 R6_2.5.1         cli_3.6.1        glue_1.6.2      
 [9] bit64_4.0.5      vctrs_0.6.2      lifecycle_1.0.3  rlang_1.1.1     
[13] purrr_1.0.1     
> one_level_tree <- tempfile()
> write_dataset(mtcars, one_level_tree, partitioning = "cyl")
> 

The ldd output for both builds is below, in case that helps:

(base) tdhock@tdhock-MacBook:~$ ldd /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/arrow.so 
    linux-vdso.so.1 (0x00007ffdb99f0000)
    libarrow_substrait.so.1100 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../libarrow_substrait.so.1100 (0x00007f8ce9667000)
    libarrow_dataset.so.1100 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../libarrow_dataset.so.1100 (0x00007f8ce94f7000)
    libparquet.so.1100 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../libparquet.so.1100 (0x00007f8ce91d0000)
    libarrow.so.1100 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../libarrow.so.1100 (0x00007f8ce7a02000)
    libR.so => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/libR.so (0x00007f8ce752e000)
    libstdc++.so.6 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../libstdc++.so.6 (0x00007f8ce7378000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8ce727c000)
    libgcc_s.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../libgcc_s.so.1 (0x00007f8ce7263000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8ce703b000)
    libprotobuf.so.32 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libprotobuf.so.32 (0x00007f8ce6d87000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f8ce9a7c000)
    libthrift.so.0.18.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libthrift.so.0.18.1 (0x00007f8ce6cde000)
    libcrypto.so.3 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libcrypto.so.3 (0x00007f8ce67d5000)
    libbrotlienc.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libbrotlienc.so.1 (0x00007f8ce6737000)
    libbrotlidec.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libbrotlidec.so.1 (0x00007f8ce6729000)
    liborc.so => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././liborc.so (0x00007f8ce65e5000)
    libglog.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libglog.so.1 (0x00007f8ce65a8000)
    libutf8proc.so.2 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libutf8proc.so.2 (0x00007f8ce6550000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f8ce654b000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f8ce6546000)
    libbz2.so.1.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libbz2.so.1.0 (0x00007f8ce6532000)
    liblz4.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././liblz4.so.1 (0x00007f8ce6507000)
    libsnappy.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libsnappy.so.1 (0x00007f8ce64f8000)
    libz.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libz.so.1 (0x00007f8ce64de000)
    libzstd.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libzstd.so.1 (0x00007f8ce641b000)
    libgoogle_cloud_cpp_storage.so.2 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libgoogle_cloud_cpp_storage.so.2 (0x00007f8ce6177000)
    libaws-cpp-sdk-identity-management.so => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libaws-cpp-sdk-identity-management.so (0x00007f8ce614b000)
    libaws-cpp-sdk-s3.so => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libaws-cpp-sdk-s3.so (0x00007f8ce5e73000)
    libaws-cpp-sdk-core.so => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libaws-cpp-sdk-core.so (0x00007f8ce5d1c000)
    libre2.so.10 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libre2.so.10 (0x00007f8ce5cb7000)
    libgoogle_cloud_cpp_common.so.2 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libgoogle_cloud_cpp_common.so.2 (0x00007f8ce5c4f000)
    libabsl_time.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libabsl_time.so.2301.0.0 (0x00007f8ce5c38000)
    libabsl_time_zone.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libabsl_time_zone.so.2301.0.0 (0x00007f8ce5c17000)
    libaws-crt-cpp.so => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././libaws-crt-cpp.so (0x00007f8ce5b89000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f8ce5b84000)
    libblas.so.3 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../../libblas.so.3 (0x00007f8ce38dc000)
    libreadline.so.8 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../../libreadline.so.8 (0x00007f8ce3883000)
    libpcre2-8.so.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../../libpcre2-8.so.0 (0x00007f8ce37e1000)
    liblzma.so.5 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../../liblzma.so.5 (0x00007f8ce37b8000)
    libiconv.so.2 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../../libiconv.so.2 (0x00007f8ce36d1000)
    libicuuc.so.72 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../../libicuuc.so.72 (0x00007f8ce34ca000)
    libicui18n.so.72 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../../libicui18n.so.72 (0x00007f8ce319a000)
    libgomp.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../../libgomp.so.1 (0x00007f8ce315f000)
    libssl.so.3 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libssl.so.3 (0x00007f8ce30bd000)
    libbrotlicommon.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libbrotlicommon.so.1 (0x00007f8ce309a000)
    libgflags.so.2.2 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libgflags.so.2.2 (0x00007f8ce3075000)
    libgoogle_cloud_cpp_rest_internal.so.2 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libgoogle_cloud_cpp_rest_internal.so.2 (0x00007f8ce2fa5000)
    libcrc32c.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libcrc32c.so.1 (0x00007f8ce2f9f000)
    libcurl.so.4 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libcurl.so.4 (0x00007f8ce2ef3000)
    libabsl_crc32c.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libabsl_crc32c.so.2301.0.0 (0x00007f8ce2eed000)
    libabsl_str_format_internal.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libabsl_str_format_internal.so.2301.0.0 (0x00007f8ce2ed2000)
    libabsl_strings.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libabsl_strings.so.2301.0.0 (0x00007f8ce2eb0000)
    libabsl_strings_internal.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libabsl_strings_internal.so.2301.0.0 (0x00007f8ce2eaa000)
    libaws-cpp-sdk-cognito-identity.so => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-cpp-sdk-cognito-identity.so (0x00007f8ce2e08000)
    libaws-cpp-sdk-sts.so => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-cpp-sdk-sts.so (0x00007f8ce2dbb000)
    libaws-c-event-stream.so.1.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-c-event-stream.so.1.0.0 (0x00007f8ce2da2000)
    libaws-checksums.so.1.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-checksums.so.1.0.0 (0x00007f8ce2d92000)
    libaws-c-common.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-c-common.so.1 (0x00007f8ce2d55000)
    libabsl_int128.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libabsl_int128.so.2301.0.0 (0x00007f8ce2d4e000)
    libabsl_base.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libabsl_base.so.2301.0.0 (0x00007f8ce2d48000)
    libabsl_raw_logging_internal.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libabsl_raw_logging_internal.so.2301.0.0 (0x00007f8ce2d43000)
    libaws-c-mqtt.so.1.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-c-mqtt.so.1.0.0 (0x00007f8ce2cff000)
    libaws-c-s3.so.0unstable => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-c-s3.so.0unstable (0x00007f8ce2cd9000)
    libaws-c-auth.so.1.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-c-auth.so.1.0.0 (0x00007f8ce2ca9000)
    libaws-c-http.so.1.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-c-http.so.1.0.0 (0x00007f8ce2c45000)
    libaws-c-io.so.1.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-c-io.so.1.0.0 (0x00007f8ce2bfe000)
    libaws-c-cal.so.1.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-c-cal.so.1.0.0 (0x00007f8ce2bea000)
    libaws-c-sdkutils.so.1.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././libaws-c-sdkutils.so.1.0.0 (0x00007f8ce2bd1000)
    libgfortran.so.5 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../.././libgfortran.so.5 (0x00007f8ce2a26000)
    libtinfo.so.6 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../.././libtinfo.so.6 (0x00007f8ce29e6000)
    libicudata.so.72 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../.././libicudata.so.72 (0x00007f8ce0c15000)
    libnghttp2.so.14 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././././libnghttp2.so.14 (0x00007f8ce0be6000)
    libssh2.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././././libssh2.so.1 (0x00007f8ce0ba2000)
    libgssapi_krb5.so.2 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././././libgssapi_krb5.so.2 (0x00007f8ce0b50000)
    libabsl_crc_internal.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././././libabsl_crc_internal.so.2301.0.0 (0x00007f8ce0b49000)
    libabsl_spinlock_wait.so.2301.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././././libabsl_spinlock_wait.so.2301.0.0 (0x00007f8ce0b42000)
    libaws-c-compression.so.1.0.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././././libaws-c-compression.so.1.0.0 (0x00007f8ce0b3d000)
    libs2n.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../.././././libs2n.so.1 (0x00007f8ce09fc000)
    libquadmath.so.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../lib/../../././libquadmath.so.0 (0x00007f8ce09c2000)
    libkrb5.so.3 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././././libkrb5.so.3 (0x00007f8ce08e9000)
    libk5crypto.so.3 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././././libk5crypto.so.3 (0x00007f8ce08d0000)
    libcom_err.so.3 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././././libcom_err.so.3 (0x00007f8ce08ca000)
    libkrb5support.so.0 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././././libkrb5support.so.0 (0x00007f8ce08bb000)
    libkeyutils.so.1 => /home/tdhock/miniconda3/envs/arrow/lib/R/library/arrow/libs/../../../../././././libkeyutils.so.1 (0x00007f8ce08b4000)
    libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f8ce089e000)

(base) tdhock@tdhock-MacBook:~$ ldd /home/tdhock/R/x86_64-pc-linux-gnu-library/4.1/arrow/libs/arrow.so 
    linux-vdso.so.1 (0x00007ffe711fb000)
    libcurl.so.4 => /lib/x86_64-linux-gnu/libcurl.so.4 (0x00007fa9ce0c7000)
    libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007fa9cdc84000)
    libR.so => /lib/libR.so (0x00007fa9cd7cb000)
    libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fa9cd5a1000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fa9cd4ba000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fa9cd49a000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa9cd270000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fa9d0ad1000)
    libnghttp2.so.14 => /lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007fa9cd246000)
    libidn2.so.0 => /lib/x86_64-linux-gnu/libidn2.so.0 (0x00007fa9cd225000)
    librtmp.so.1 => /lib/x86_64-linux-gnu/librtmp.so.1 (0x00007fa9cd206000)
    libssh.so.4 => /lib/x86_64-linux-gnu/libssh.so.4 (0x00007fa9cd199000)
    libpsl.so.5 => /lib/x86_64-linux-gnu/libpsl.so.5 (0x00007fa9cd185000)
    libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007fa9cd0df000)
    libgssapi_krb5.so.2 => /lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007fa9cd08b000)
    libldap-2.5.so.0 => /lib/x86_64-linux-gnu/libldap-2.5.so.0 (0x00007fa9cd02c000)
    liblber-2.5.so.0 => /lib/x86_64-linux-gnu/liblber-2.5.so.0 (0x00007fa9cd01b000)
    libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x00007fa9ccf4c000)
    libbrotlidec.so.1 => /lib/x86_64-linux-gnu/libbrotlidec.so.1 (0x00007fa9ccf3e000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fa9ccf20000)
    libblas.so.3 => /lib/x86_64-linux-gnu/libblas.so.3 (0x00007fa9cce7a000)
    libreadline.so.8 => /lib/x86_64-linux-gnu/libreadline.so.8 (0x00007fa9cce26000)
    libpcre2-8.so.0 => /lib/x86_64-linux-gnu/libpcre2-8.so.0 (0x00007fa9ccd8f000)
    liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007fa9ccd64000)
    libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007fa9ccd51000)
    libtirpc.so.3 => /lib/x86_64-linux-gnu/libtirpc.so.3 (0x00007fa9ccd21000)
    libicuuc.so.70 => /lib/x86_64-linux-gnu/libicuuc.so.70 (0x00007fa9ccb26000)
    libicui18n.so.70 => /lib/x86_64-linux-gnu/libicui18n.so.70 (0x00007fa9cc7f7000)
    libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x00007fa9cc7ad000)
    libunistring.so.2 => /lib/x86_64-linux-gnu/libunistring.so.2 (0x00007fa9cc603000)
    libgnutls.so.30 => /lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007fa9cc416000)
    libhogweed.so.6 => /lib/x86_64-linux-gnu/libhogweed.so.6 (0x00007fa9cc3ce000)
    libnettle.so.8 => /lib/x86_64-linux-gnu/libnettle.so.8 (0x00007fa9cc388000)
    libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007fa9cc306000)
    libkrb5.so.3 => /lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007fa9cc23b000)
    libk5crypto.so.3 => /lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007fa9cc20a000)
    libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007fa9cc204000)
    libkrb5support.so.0 => /lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007fa9cc1f6000)
    libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007fa9cc1db000)
    libbrotlicommon.so.1 => /lib/x86_64-linux-gnu/libbrotlicommon.so.1 (0x00007fa9cc1b8000)
    libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007fa9cc186000)
    libicudata.so.70 => /lib/x86_64-linux-gnu/libicudata.so.70 (0x00007fa9ca566000)
    libp11-kit.so.0 => /lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007fa9ca42b000)
    libtasn1.so.6 => /lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007fa9ca413000)
    libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007fa9ca40c000)
    libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007fa9ca3f8000)
    libffi.so.8 => /lib/x86_64-linux-gnu/libffi.so.8 (0x00007fa9ca3e9000)

thisisnic commented 1 year ago

Thanks for the detailed output there, @tdhock. Would you mind trying again with a nightly build to see whether the problem persists? There have been a few changes since the last release that may or may not have fixed the issue you're experiencing. This could be similar to the issue seen in #34211.

Instructions for installing nightly builds are here: https://arrow.apache.org/docs/r/articles/install_nightly.html

tdhock commented 1 year ago

Hi @thisisnic, thanks for your quick response. I ran example("write_parquet",package="arrow") as in #34211, using the current CRAN release arrow_11.0.0.3, and it works fine on my system (no segfault). I also installed nightly build 11.0.0.100000321, and that does not fix the issue: write_dataset still gives a segfault.

westonpace commented 1 year ago

address 0x7f44b08be647, cause 'illegal operand' typically means that it's trying to use a vectorized instruction that is not available on the machine. We've encountered issues of this sort on Macs before.
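
One quick way to see which vector extensions the CPU advertises on Linux is to read the `flags` line of `/proc/cpuinfo`. The sketch below is only an illustration: the function name `cpu_simd_flags` is made up, and the list of flags it looks for is an assumption about what might be relevant here, not a list of what Arrow actually requires.

```shell
# Print which of a few SIMD-related flags appear in a cpuinfo-style file.
# Defaults to /proc/cpuinfo, so this is Linux-only.
# The flag list is illustrative, not Arrow's actual requirements.
cpu_simd_flags() {
  cpuinfo_file="${1:-/proc/cpuinfo}"
  # Take the first "flags" line, split it into one token per line,
  # and keep only tokens that exactly match one of the listed names.
  grep -m1 '^flags' "$cpuinfo_file" \
    | tr ' ' '\n' \
    | grep -x -E 'sse4_2|avx|avx2|bmi1|bmi2|avx512f' \
    | sort -u
}
```

Calling `cpu_simd_flags` with no arguments on the affected machine lists the matching flags, one per line. An extension that is missing from that list but used by the binary would be consistent with the illegal-instruction crash, although it would not prove it.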

I know it is a pain, but if there is any way you could produce a core dump and give it to us, that would help us narrow down which instruction it's trying to call. Alternatively, if you could run in gdb and intercept the crash, you could figure out which operation is being called that way too, but that might be trickier than generating a core dump. I often do this by putting my R code in a script and then typing:

$ R -d gdb
...
(gdb) run
...
> source("/tmp/script.R")
...
Thread 1 "R" received signal...
69  ../sysdeps/unix/sysv/linux/select.c: No such file or directory.
(gdb) disassemble
Dump of assembler code for function __GI___select:
   0x00007ffff771b690 <+0>: endbr64 
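
For the core-dump route, a rough sketch of a non-interactive gdb run is shown here. The helper name `analyze_core` and the `GDB` override variable are made up for illustration; `-batch`, `-ex`, `bt`, and `x/i $pc` are standard gdb options and commands. Where the core file ends up depends on the system's core dump configuration, so the path passed in is an assumption.

```shell
# Hypothetical helper: print the backtrace and the faulting instruction
# from a core file without starting an interactive gdb session.
# Before reproducing the crash, core dumps usually need to be enabled in
# the same shell, for example with: ulimit -c unlimited
analyze_core() {
  if [ "$#" -ne 2 ]; then
    echo "usage: analyze_core <path-to-R-binary> <core-file>" >&2
    return 2
  fi
  r_binary="$1"
  core_file="$2"
  # GDB can be overridden to point at a different gdb executable.
  # bt prints the stack of the crashing thread, and x/i $pc prints the
  # instruction at the program counter, i.e. the one that faulted.
  "${GDB:-gdb}" -batch -ex 'bt' -ex 'x/i $pc' "$r_binary" "$core_file"
}
```

A typical call would be `analyze_core "$(R RHOME)/bin/exec/R" core`, assuming the R executable lives under `bin/exec` in the R home directory and the core file was written to the current directory.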

Do you get the same error writing a different format (not CSV)?

tdhock commented 1 year ago

Hi, does this help?

(base) tdhock@maude-MacBookPro:~/projects/max-generalized-auc(master)$ R -d gdb
GNU gdb (Ubuntu 10.2-0ubuntu1~18.04~2) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/tdhock/lib/R/bin/exec/R...
(gdb) run
Starting program: /home/tdhock/lib/R/bin/exec/R 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Loading required package: grDevices
[Detaching after fork from child process 6431]
[Detaching after fork from child process 6433]
> example("write_dataset",package="arrow")
[New Thread 0x7fffe6aa3700 (LWP 6435)]
[New Thread 0x7fffdc7ff700 (LWP 6436)]

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

wrt_dt> ## Don't show: 
wrt_dt> if (arrow_with_dataset() & arrow_with_parquet() & requireNamespace("dplyr", quietly = TRUE)) (if (getRversion() >= "3.4") withAutoprint else force)({ # examplesIf
wrt_dt+ ## End(Don't show)
wrt_dt+ # You can write datasets partitioned by the values in a column (here: "cyl").
wrt_dt+ # This creates a structure of the form cyl=X/part-Z.parquet.
wrt_dt+ one_level_tree <- tempfile()
wrt_dt+ write_dataset(mtcars, one_level_tree, partitioning = "cyl")
wrt_dt+ list.files(one_level_tree, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # You can also partition by the values in multiple columns
wrt_dt+ # (here: "cyl" and "gear").
wrt_dt+ # This creates a structure of the form cyl=X/gear=Y/part-Z.parquet.
wrt_dt+ two_levels_tree <- tempfile()
wrt_dt+ write_dataset(mtcars, two_levels_tree, partitioning = c("cyl", "gear"))
wrt_dt+ list.files(two_levels_tree, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # In the two previous examples we would have:
wrt_dt+ # X = {4,6,8}, the number of cylinders.
wrt_dt+ # Y = {3,4,5}, the number of forward gears.
wrt_dt+ # Z = {0,1,2}, the number of saved parts, starting from 0.
wrt_dt+ 
wrt_dt+ # You can obtain the same result as as the previous examples using arrow with
wrt_dt+ # a dplyr pipeline. This will be the same as two_levels_tree above, but the
wrt_dt+ # output directory will be different.
wrt_dt+ library(dplyr)
wrt_dt+ two_levels_tree_2 <- tempfile()
wrt_dt+ mtcars %>%
wrt_dt+   group_by(cyl, gear) %>%
wrt_dt+   write_dataset(two_levels_tree_2)
wrt_dt+ list.files(two_levels_tree_2, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # And you can also turn off the Hive-style directory naming where the column
wrt_dt+ # name is included with the values by using `hive_style = FALSE`.
wrt_dt+ 
wrt_dt+ # Write a structure X/Y/part-Z.parquet.
wrt_dt+ two_levels_tree_no_hive <- tempfile()
wrt_dt+ mtcars %>%
wrt_dt+   group_by(cyl, gear) %>%
wrt_dt+   write_dataset(two_levels_tree_no_hive, hive_style = FALSE)
wrt_dt+ list.files(two_levels_tree_no_hive, recursive = TRUE)
wrt_dt+ ## Don't show: 
wrt_dt+ }) # examplesIf
> one_level_tree <- tempfile()
> write_dataset(mtcars, one_level_tree, partitioning = "cyl")
[New Thread 0x7fffdb789700 (LWP 6437)]
[New Thread 0x7fffdaf88700 (LWP 6438)]
[New Thread 0x7fffda787700 (LWP 6439)]
[New Thread 0x7fffd9f86700 (LWP 6440)]

Thread 5 "R" received signal SIGILL, Illegal instruction.
[Switching to Thread 0x7fffdaf88700 (LWP 6438)]
0x00007fffe3a45367 in arrow::compute::RowTableMetadata::FromColumnMetadataVector(std::vector<arrow::compute::KeyColumnMetadata, std::allocator<arrow::compute::KeyColumnMetadata> > const&, int, int) ()
   from /home/tdhock/lib/R/library/arrow/libs/arrow.so
(gdb) disassemble
Dump of assembler code for function _ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii:
   0x00007fffe3a45120 <+0>: push   %r15
   0x00007fffe3a45122 <+2>: mov    %rsi,%r15
   0x00007fffe3a45125 <+5>: push   %r14
   0x00007fffe3a45127 <+7>: push   %r13
   0x00007fffe3a45129 <+9>: push   %r12
   0x00007fffe3a4512b <+11>:    push   %rbp
   0x00007fffe3a4512c <+12>:    push   %rbx
   0x00007fffe3a4512d <+13>:    mov    %rdi,%rbx
   0x00007fffe3a45130 <+16>:    sub    $0x18,%rsp
   0x00007fffe3a45134 <+20>:    mov    0x8(%rsi),%rax
   0x00007fffe3a45138 <+24>:    mov    0x20(%rbx),%r8
   0x00007fffe3a4513c <+28>:    mov    %ecx,0x4(%rsp)
   0x00007fffe3a45140 <+32>:    mov    (%rsi),%rcx
   0x00007fffe3a45143 <+35>:    mov    0x18(%rdi),%rdi
   0x00007fffe3a45147 <+39>:    mov    %edx,(%rsp)
   0x00007fffe3a4514a <+42>:    mov    %r8,%rdx
   0x00007fffe3a4514d <+45>:    sub    %rcx,%rax
   0x00007fffe3a45150 <+48>:    mov    %rax,%rsi
   0x00007fffe3a45153 <+51>:    sub    %rdi,%rdx
   0x00007fffe3a45156 <+54>:    sar    $0x3,%rsi
   0x00007fffe3a4515a <+58>:    sar    $0x3,%rdx
   0x00007fffe3a4515e <+62>:    cmp    %rdx,%rsi
   0x00007fffe3a45161 <+65>:    ja     0x7fffe3a45508 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1000>
   0x00007fffe3a45167 <+71>:    mov    %rsi,%r14
   0x00007fffe3a4516a <+74>:    jae    0x7fffe3a45189 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+105>
   0x00007fffe3a4516c <+76>:    add    %rdi,%rax
   0x00007fffe3a4516f <+79>:    cmp    %rax,%r8
   0x00007fffe3a45172 <+82>:    je     0x7fffe3a45189 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+105>
   0x00007fffe3a45174 <+84>:    mov    %rax,0x20(%rbx)
   0x00007fffe3a45178 <+88>:    mov    (%r15),%rcx
   0x00007fffe3a4517b <+91>:    mov    0x8(%r15),%rsi
   0x00007fffe3a4517f <+95>:    sub    %rcx,%rsi
   0x00007fffe3a45182 <+98>:    sar    $0x3,%rsi
   0x00007fffe3a45186 <+102>:   mov    %rsi,%r14
   0x00007fffe3a45189 <+105>:   mov    0x38(%rbx),%rbp
   0x00007fffe3a4518d <+109>:   mov    0x30(%rbx),%r8
   0x00007fffe3a45191 <+113>:   mov    %rbp,%r10
   0x00007fffe3a45194 <+116>:   sub    %r8,%r10
   0x00007fffe3a45197 <+119>:   sar    $0x2,%r10
   0x00007fffe3a4519b <+123>:   test   %r14,%r14
   0x00007fffe3a4519e <+126>:   je     0x7fffe3a45490 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+880>
   0x00007fffe3a451a4 <+132>:   mov    0x18(%rbx),%rdi
   0x00007fffe3a451a8 <+136>:   xor    %eax,%eax
   0x00007fffe3a451aa <+138>:   nopw   0x0(%rax,%rax,1)
   0x00007fffe3a451b0 <+144>:   mov    (%rcx,%rax,8),%rdx
   0x00007fffe3a451b4 <+148>:   mov    %rdx,(%rdi,%rax,8)
   0x00007fffe3a451b8 <+152>:   add    $0x1,%rax
   0x00007fffe3a451bc <+156>:   cmp    %r14,%rax
   0x00007fffe3a451bf <+159>:   jne    0x7fffe3a451b0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+144>
   0x00007fffe3a451c1 <+161>:   mov    %esi,%r14d
   0x00007fffe3a451c4 <+164>:   mov    %esi,%r13d
   0x00007fffe3a451c7 <+167>:   cmp    %r10,%r14
   0x00007fffe3a451ca <+170>:   ja     0x7fffe3a45520 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1024>
   0x00007fffe3a451d0 <+176>:   jb     0x7fffe3a454a0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+896>
   0x00007fffe3a451d6 <+182>:   test   %r13d,%r13d
   0x00007fffe3a451d9 <+185>:   je     0x7fffe3a4523e <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+286>
   0x00007fffe3a451db <+187>:   lea    -0x1(%r13),%eax
   0x00007fffe3a451df <+191>:   cmp    $0x2,%eax
   0x00007fffe3a451e2 <+194>:   jbe    0x7fffe3a4553c <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1052>
   0x00007fffe3a451e8 <+200>:   mov    %r13d,%edx
   0x00007fffe3a451eb <+203>:   movdqa 0xff39cd(%rip),%xmm0        # 0x7fffe4a38bc0
   0x00007fffe3a451f3 <+211>:   movdqa 0xff39d5(%rip),%xmm1        # 0x7fffe4a38bd0
   0x00007fffe3a451fb <+219>:   mov    %r8,%rax
   0x00007fffe3a451fe <+222>:   shr    $0x2,%edx
   0x00007fffe3a45201 <+225>:   sub    $0x1,%edx
   0x00007fffe3a45204 <+228>:   shl    $0x4,%rdx
   0x00007fffe3a45208 <+232>:   lea    0x10(%r8,%rdx,1),%rdx
   0x00007fffe3a4520d <+237>:   nopl   (%rax)
   0x00007fffe3a45210 <+240>:   movups %xmm0,(%rax)
   0x00007fffe3a45213 <+243>:   add    $0x10,%rax
   0x00007fffe3a45217 <+247>:   paddd  %xmm1,%xmm0
   0x00007fffe3a4521b <+251>:   cmp    %rax,%rdx
   0x00007fffe3a4521e <+254>:   jne    0x7fffe3a45210 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+240>
   0x00007fffe3a45220 <+256>:   mov    %r13d,%eax
   0x00007fffe3a45223 <+259>:   and    $0xfffffffc,%eax
   0x00007fffe3a45226 <+262>:   cmp    %eax,%r13d
   0x00007fffe3a45229 <+265>:   je     0x7fffe3a4523e <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+286>
   0x00007fffe3a4522b <+267>:   nopl   0x0(%rax,%rax,1)
   0x00007fffe3a45230 <+272>:   mov    %eax,%edx
   0x00007fffe3a45232 <+274>:   mov    %eax,(%r8,%rdx,4)
   0x00007fffe3a45236 <+278>:   add    $0x1,%eax
   0x00007fffe3a45239 <+281>:   cmp    %eax,%r13d
   0x00007fffe3a4523c <+284>:   ja     0x7fffe3a45230 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+272>
   0x00007fffe3a4523e <+286>:   cmp    %r8,%rbp
   0x00007fffe3a45241 <+289>:   je     0x7fffe3a452b4 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+404>
   0x00007fffe3a45243 <+291>:   mov    %rbp,%r12
   0x00007fffe3a45246 <+294>:   mov    $0x3f,%edx
   0x00007fffe3a4524b <+299>:   mov    %r8,%rdi
   0x00007fffe3a4524e <+302>:   mov    %r15,%rcx
   0x00007fffe3a45251 <+305>:   sub    %r8,%r12
   0x00007fffe3a45254 <+308>:   mov    %rbp,%rsi
   0x00007fffe3a45257 <+311>:   mov    %r8,0x8(%rsp)
   0x00007fffe3a4525c <+316>:   mov    %r12,%rax
   0x00007fffe3a4525f <+319>:   sar    $0x2,%rax
   0x00007fffe3a45263 <+323>:   bsr    %rax,%rax
   0x00007fffe3a45267 <+327>:   xor    $0x3f,%rax
   0x00007fffe3a4526b <+331>:   cltq   
   0x00007fffe3a4526d <+333>:   sub    %rax,%rdx
   0x00007fffe3a45270 <+336>:   add    %rdx,%rdx
   0x00007fffe3a45273 <+339>:   call   0x7fffe3a43cd0 <_ZSt16__introsort_loopIN9__gnu_cxx17__normal_iteratorIPjSt6vectorIjSaIjEEEElNS0_5__ops15_Iter_comp_iterIZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKS3_INSA_17KeyColumnMetadataESaISC_EEiiEUljjE_EEEvT_SJ_T0_T1_>
   0x00007fffe3a45278 <+344>:   cmp    $0x40,%r12
   0x00007fffe3a4527c <+348>:   mov    0x8(%rsp),%r8
   0x00007fffe3a45281 <+353>:   jle    0x7fffe3a454c0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+928>
   0x00007fffe3a45287 <+359>:   lea    0x40(%r8),%r12
   0x00007fffe3a4528b <+363>:   mov    %r15,%rdx
   0x00007fffe3a4528e <+366>:   mov    %r8,%rdi
   0x00007fffe3a45291 <+369>:   mov    %r12,%rsi
   0x00007fffe3a45294 <+372>:   call   0x7fffe3a43850 <_ZSt16__insertion_sortIN9__gnu_cxx17__normal_iteratorIPjSt6vectorIjSaIjEEEENS0_5__ops15_Iter_comp_iterIZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKS3_INSA_17KeyColumnMetadataESaISC_EEiiEUljjE_EEEvT_SJ_T0_>
   0x00007fffe3a45299 <+377>:   cmp    %rbp,%r12
   0x00007fffe3a4529c <+380>:   je     0x7fffe3a452b4 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+404>
   0x00007fffe3a4529e <+382>:   xchg   %ax,%ax
   0x00007fffe3a452a0 <+384>:   mov    %r12,%rdi
   0x00007fffe3a452a3 <+387>:   mov    %r15,%rsi
   0x00007fffe3a452a6 <+390>:   add    $0x4,%r12
   0x00007fffe3a452aa <+394>:   call   0x7fffe3a43760 <_ZSt25__unguarded_linear_insertIN9__gnu_cxx17__normal_iteratorIPjSt6vectorIjSaIjEEEENS0_5__ops14_Val_comp_iterIZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKS3_INSA_17KeyColumnMetadataESaISC_EEiiEUljjE_EEEvT_T0_>
   0x00007fffe3a452af <+399>:   cmp    %r12,%rbp
   0x00007fffe3a452b2 <+402>:   jne    0x7fffe3a452a0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+384>
   0x00007fffe3a452b4 <+404>:   mov    0x50(%rbx),%rdx
   0x00007fffe3a452b8 <+408>:   mov    0x48(%rbx),%rcx
   0x00007fffe3a452bc <+412>:   mov    %rdx,%rax
   0x00007fffe3a452bf <+415>:   sub    %rcx,%rax
   0x00007fffe3a452c2 <+418>:   sar    $0x2,%rax
   0x00007fffe3a452c6 <+422>:   cmp    %rax,%r14
   0x00007fffe3a452c9 <+425>:   ja     0x7fffe3a454f0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+976>
   0x00007fffe3a452cf <+431>:   jb     0x7fffe3a45470 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+848>
   0x00007fffe3a452d5 <+437>:   test   %r13d,%r13d
   0x00007fffe3a452d8 <+440>:   je     0x7fffe3a45302 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+482>
   0x00007fffe3a452da <+442>:   mov    0x30(%rbx),%rdi
   0x00007fffe3a452de <+446>:   mov    0x48(%rbx),%rsi
   0x00007fffe3a452e2 <+450>:   lea    -0x1(%r13),%ecx
   0x00007fffe3a452e6 <+454>:   xor    %eax,%eax
   0x00007fffe3a452e8 <+456>:   jmp    0x7fffe3a452f3 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+467>
   0x00007fffe3a452ea <+458>:   nopw   0x0(%rax,%rax,1)
   0x00007fffe3a452f0 <+464>:   mov    %rdx,%rax
   0x00007fffe3a452f3 <+467>:   mov    (%rdi,%rax,4),%edx
   0x00007fffe3a452f6 <+470>:   mov    %eax,(%rsi,%rdx,4)
   0x00007fffe3a452f9 <+473>:   lea    0x1(%rax),%rdx
   0x00007fffe3a452fd <+477>:   cmp    %rax,%rcx
   0x00007fffe3a45300 <+480>:   jne    0x7fffe3a452f0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+464>
   0x00007fffe3a45302 <+482>:   mov    (%rsp),%eax
   0x00007fffe3a45305 <+485>:   mov    0x68(%rbx),%rdx
   0x00007fffe3a45309 <+489>:   movl   $0x0,0x8(%rbx)
   0x00007fffe3a45310 <+496>:   mov    0x60(%rbx),%rcx
   0x00007fffe3a45314 <+500>:   mov    %eax,0x10(%rbx)
   0x00007fffe3a45317 <+503>:   mov    0x4(%rsp),%eax
   0x00007fffe3a4531b <+507>:   mov    %eax,0x14(%rbx)
   0x00007fffe3a4531e <+510>:   mov    %rdx,%rax
   0x00007fffe3a45321 <+513>:   sub    %rcx,%rax
   0x00007fffe3a45324 <+516>:   sar    $0x2,%rax
   0x00007fffe3a45328 <+520>:   cmp    %rax,%r14
   0x00007fffe3a4532b <+523>:   ja     0x7fffe3a454d8 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+952>
   0x00007fffe3a45331 <+529>:   jb     0x7fffe3a45430 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+784>
   0x00007fffe3a45337 <+535>:   test   %r13d,%r13d
   0x00007fffe3a4533a <+538>:   je     0x7fffe3a4544a <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+810>
   0x00007fffe3a45340 <+544>:   lea    -0x1(%r13),%eax
   0x00007fffe3a45344 <+548>:   mov    0x30(%rbx),%r11
   0x00007fffe3a45348 <+552>:   mov    (%r15),%rbp
   0x00007fffe3a4534b <+555>:   xor    %edx,%edx
   0x00007fffe3a4534d <+557>:   lea    0x4(,%rax,4),%r10
   0x00007fffe3a45355 <+565>:   mov    0x60(%rbx),%r9
   0x00007fffe3a45359 <+569>:   xor    %eax,%eax
   0x00007fffe3a4535b <+571>:   xor    %r8d,%r8d
   0x00007fffe3a4535e <+574>:   jmp    0x7fffe3a4539b <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+635>
   0x00007fffe3a45360 <+576>:   mov    0x4(%rcx),%esi
   0x00007fffe3a45363 <+579>:   test   %esi,%esi
   0x00007fffe3a45365 <+581>:   je     0x7fffe3a45382 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+610>
=> 0x00007fffe3a45367 <+583>:   popcnt %rsi,%rsi
   0x00007fffe3a4536c <+588>:   cmp    $0x1,%esi
   0x00007fffe3a4536f <+591>:   je     0x7fffe3a45382 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+610>
   0x00007fffe3a45371 <+593>:   mov    0x14(%rbx),%esi
   0x00007fffe3a45374 <+596>:   mov    %eax,%r12d
   0x00007fffe3a45377 <+599>:   neg    %r12d
   0x00007fffe3a4537a <+602>:   sub    $0x1,%esi
   0x00007fffe3a4537d <+605>:   and    %r12d,%esi
   0x00007fffe3a45380 <+608>:   add    %esi,%eax
   0x00007fffe3a45382 <+610>:   mov    %eax,(%rdi)
   0x00007fffe3a45384 <+612>:   mov    0x4(%rcx),%ecx
   0x00007fffe3a45387 <+615>:   lea    (%rax,%rcx,1),%esi
   0x00007fffe3a4538a <+618>:   add    $0x1,%eax
   0x00007fffe3a4538d <+621>:   test   %ecx,%ecx
   0x00007fffe3a4538f <+623>:   cmovne %esi,%eax
   0x00007fffe3a45392 <+626>:   add    $0x4,%rdx
   0x00007fffe3a45396 <+630>:   cmp    %rdx,%r10
   0x00007fffe3a45399 <+633>:   je     0x7fffe3a453c7 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+679>
   0x00007fffe3a4539b <+635>:   mov    (%r11,%rdx,1),%ecx
   0x00007fffe3a4539f <+639>:   lea    (%r9,%rdx,1),%rdi
   0x00007fffe3a453a3 <+643>:   lea    0x0(%rbp,%rcx,8),%rcx
   0x00007fffe3a453a8 <+648>:   cmpb   $0x0,(%rcx)
   0x00007fffe3a453ab <+651>:   jne    0x7fffe3a45360 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+576>
   0x00007fffe3a453ad <+653>:   mov    %eax,(%rdi)
   0x00007fffe3a453af <+655>:   test   %r8d,%r8d
   0x00007fffe3a453b2 <+658>:   jne    0x7fffe3a453b7 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+663>
   0x00007fffe3a453b4 <+660>:   mov    %eax,0x8(%rbx)
   0x00007fffe3a453b7 <+663>:   add    $0x4,%rdx
   0x00007fffe3a453bb <+667>:   add    $0x1,%r8d
   0x00007fffe3a453bf <+671>:   add    $0x4,%eax
   0x00007fffe3a453c2 <+674>:   cmp    %rdx,%r10
   0x00007fffe3a453c5 <+677>:   jne    0x7fffe3a4539b <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+635>
   0x00007fffe3a453c7 <+679>:   test   %r8d,%r8d
   0x00007fffe3a453ca <+682>:   mov    %eax,%ecx
   0x00007fffe3a453cc <+684>:   sete   (%rbx)
   0x00007fffe3a453cf <+687>:   neg    %ecx
   0x00007fffe3a453d1 <+689>:   test   %r8d,%r8d
   0x00007fffe3a453d4 <+692>:   je     0x7fffe3a45420 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+768>
   0x00007fffe3a453d6 <+694>:   mov    0x14(%rbx),%edi
   0x00007fffe3a453d9 <+697>:   lea    -0x1(%rdi),%edx
   0x00007fffe3a453dc <+700>:   and    %ecx,%edx
   0x00007fffe3a453de <+702>:   add    %edx,%eax
   0x00007fffe3a453e0 <+704>:   mov    %eax,0x4(%rbx)
   0x00007fffe3a453e3 <+707>:   movl   $0x1,0xc(%rbx)
   0x00007fffe3a453ea <+714>:   cmp    $0x8,%r13d
   0x00007fffe3a453ee <+718>:   jbe    0x7fffe3a45407 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+743>
   0x00007fffe3a453f0 <+720>:   mov    $0x1,%eax
   0x00007fffe3a453f5 <+725>:   nopl   (%rax)
   0x00007fffe3a453f8 <+728>:   mov    %eax,%edx
   0x00007fffe3a453fa <+730>:   add    %eax,%eax
   0x00007fffe3a453fc <+732>:   shl    $0x4,%edx
   0x00007fffe3a453ff <+735>:   cmp    %r13d,%edx
   0x00007fffe3a45402 <+738>:   jb     0x7fffe3a453f8 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+728>
   0x00007fffe3a45404 <+740>:   mov    %eax,0xc(%rbx)
   0x00007fffe3a45407 <+743>:   add    $0x18,%rsp
   0x00007fffe3a4540b <+747>:   pop    %rbx
   0x00007fffe3a4540c <+748>:   pop    %rbp
   0x00007fffe3a4540d <+749>:   pop    %r12
   0x00007fffe3a4540f <+751>:   pop    %r13
   0x00007fffe3a45411 <+753>:   pop    %r14
   0x00007fffe3a45413 <+755>:   pop    %r15
   0x00007fffe3a45415 <+757>:   ret    
   0x00007fffe3a45416 <+758>:   nopw   %cs:0x0(%rax,%rax,1)
   0x00007fffe3a45420 <+768>:   mov    0x10(%rbx),%edi
   0x00007fffe3a45423 <+771>:   lea    -0x1(%rdi),%edx
   0x00007fffe3a45426 <+774>:   and    %ecx,%edx
   0x00007fffe3a45428 <+776>:   add    %edx,%eax
   0x00007fffe3a4542a <+778>:   jmp    0x7fffe3a453e0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+704>
   0x00007fffe3a4542c <+780>:   nopl   0x0(%rax)
   0x00007fffe3a45430 <+784>:   lea    (%rcx,%r14,4),%rax
   0x00007fffe3a45434 <+788>:   cmp    %rax,%rdx
   0x00007fffe3a45437 <+791>:   je     0x7fffe3a45337 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+535>
   0x00007fffe3a4543d <+797>:   mov    %rax,0x68(%rbx)
   0x00007fffe3a45441 <+801>:   test   %r13d,%r13d
   0x00007fffe3a45444 <+804>:   jne    0x7fffe3a45340 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+544>
   0x00007fffe3a4544a <+810>:   movb   $0x1,(%rbx)
   0x00007fffe3a4544d <+813>:   movl   $0x0,0x4(%rbx)
   0x00007fffe3a45454 <+820>:   movl   $0x1,0xc(%rbx)
   0x00007fffe3a4545b <+827>:   add    $0x18,%rsp
   0x00007fffe3a4545f <+831>:   pop    %rbx
   0x00007fffe3a45460 <+832>:   pop    %rbp
   0x00007fffe3a45461 <+833>:   pop    %r12
   0x00007fffe3a45463 <+835>:   pop    %r13
   0x00007fffe3a45465 <+837>:   pop    %r14
   0x00007fffe3a45467 <+839>:   pop    %r15
   0x00007fffe3a45469 <+841>:   ret    
   0x00007fffe3a4546a <+842>:   nopw   0x0(%rax,%rax,1)
   0x00007fffe3a45470 <+848>:   lea    (%rcx,%r14,4),%rax
   0x00007fffe3a45474 <+852>:   cmp    %rax,%rdx
   0x00007fffe3a45477 <+855>:   je     0x7fffe3a452d5 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+437>
   0x00007fffe3a4547d <+861>:   mov    %rax,0x50(%rbx)
   0x00007fffe3a45481 <+865>:   jmp    0x7fffe3a452d5 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+437>
   0x00007fffe3a45486 <+870>:   nopw   %cs:0x0(%rax,%rax,1)
   0x00007fffe3a45490 <+880>:   xor    %r13d,%r13d
   0x00007fffe3a45493 <+883>:   test   %r10,%r10
   0x00007fffe3a45496 <+886>:   je     0x7fffe3a4523e <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+286>
   0x00007fffe3a4549c <+892>:   nopl   0x0(%rax)
   0x00007fffe3a454a0 <+896>:   lea    (%r8,%r14,4),%rax
   0x00007fffe3a454a4 <+900>:   cmp    %rbp,%rax
   0x00007fffe3a454a7 <+903>:   je     0x7fffe3a451d6 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+182>
   0x00007fffe3a454ad <+909>:   mov    %rax,0x38(%rbx)
   0x00007fffe3a454b1 <+913>:   mov    %rax,%rbp
   0x00007fffe3a454b4 <+916>:   jmp    0x7fffe3a451d6 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+182>
   0x00007fffe3a454b9 <+921>:   nopl   0x0(%rax)
   0x00007fffe3a454c0 <+928>:   mov    %r15,%rdx
   0x00007fffe3a454c3 <+931>:   mov    %rbp,%rsi
   0x00007fffe3a454c6 <+934>:   mov    %r8,%rdi
   0x00007fffe3a454c9 <+937>:   call   0x7fffe3a43850 <_ZSt16__insertion_sortIN9__gnu_cxx17__normal_iteratorIPjSt6vectorIjSaIjEEEENS0_5__ops15_Iter_comp_iterIZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKS3_INSA_17KeyColumnMetadataESaISC_EEiiEUljjE_EEEvT_SJ_T0_>
   0x00007fffe3a454ce <+942>:   jmp    0x7fffe3a452b4 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+404>
   0x00007fffe3a454d3 <+947>:   nopl   0x0(%rax,%rax,1)
   0x00007fffe3a454d8 <+952>:   mov    %r14,%rsi
   0x00007fffe3a454db <+955>:   lea    0x60(%rbx),%rdi
   0x00007fffe3a454df <+959>:   sub    %rax,%rsi
   0x00007fffe3a454e2 <+962>:   call   0x7fffe30d2060 <_ZNSt6vectorIjSaIjEE17_M_default_appendEm@plt>
   0x00007fffe3a454e7 <+967>:   jmp    0x7fffe3a45337 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+535>
   0x00007fffe3a454ec <+972>:   nopl   0x0(%rax)
   0x00007fffe3a454f0 <+976>:   mov    %r14,%rsi
   0x00007fffe3a454f3 <+979>:   lea    0x48(%rbx),%rdi
   0x00007fffe3a454f7 <+983>:   sub    %rax,%rsi
   0x00007fffe3a454fa <+986>:   call   0x7fffe30d2060 <_ZNSt6vectorIjSaIjEE17_M_default_appendEm@plt>
   0x00007fffe3a454ff <+991>:   jmp    0x7fffe3a452d5 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+437>
   0x00007fffe3a45504 <+996>:   nopl   0x0(%rax)
   0x00007fffe3a45508 <+1000>:  sub    %rdx,%rsi
   0x00007fffe3a4550b <+1003>:  lea    0x18(%rbx),%rdi
   0x00007fffe3a4550f <+1007>:  call   0x7fffe30d1570 <_ZNSt6vectorIN5arrow7compute17KeyColumnMetadataESaIS2_EE17_M_default_appendEm@plt>
   0x00007fffe3a45514 <+1012>:  jmp    0x7fffe3a45178 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+88>
   0x00007fffe3a45519 <+1017>:  nopl   0x0(%rax)
   0x00007fffe3a45520 <+1024>:  mov    %r14,%rsi
   0x00007fffe3a45523 <+1027>:  lea    0x30(%rbx),%rdi
   0x00007fffe3a45527 <+1031>:  sub    %r10,%rsi
   0x00007fffe3a4552a <+1034>:  call   0x7fffe30d2060 <_ZNSt6vectorIjSaIjEE17_M_default_appendEm@plt>
   0x00007fffe3a4552f <+1039>:  mov    0x30(%rbx),%r8
   0x00007fffe3a45533 <+1043>:  mov    0x38(%rbx),%rbp
   0x00007fffe3a45537 <+1047>:  jmp    0x7fffe3a451d6 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+182>
   0x00007fffe3a4553c <+1052>:  xor    %eax,%eax
   0x00007fffe3a4553e <+1054>:  jmp    0x7fffe3a45230 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+272>
End of assembler dump.
(gdb) q
A debugging session is active.

    Inferior 1 [process 6427] will be killed.

Quit anyway? (y or n) y
(base) tdhock@maude-MacBookPro:~/projects/max-generalized-auc(master*)$ 
tdhock commented 1 year ago
(base) tdhock@maude-MacBookPro:~/projects/max-generalized-auc(master*)$ cat /proc/cpuinfo 
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Core(TM)2 Duo CPU     P8600  @ 2.40GHz
stepping    : 10
microcode   : 0xa0b
cpu MHz     : 2057.877
cache size  : 3072 KB
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority dtherm
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips    : 4778.54
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Core(TM)2 Duo CPU     P8600  @ 2.40GHz
stepping    : 10
microcode   : 0xa0b
cpu MHz     : 2118.835
cache size  : 3072 KB
physical id : 0
siblings    : 2
core id     : 1
cpu cores   : 2
apicid      : 1
initial apicid  : 1
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm pti tpr_shadow vnmi flexpriority dtherm
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips    : 4778.54
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

(base) tdhock@maude-MacBookPro:~/projects/max-generalized-auc(master*)$ 
tdhock commented 1 year ago

The write_parquet example works:

(base) tdhock@maude-MacBookPro:~/projects/max-generalized-auc(master*)$ R --vanilla -e 'example("write_parquet",package="arrow")'

R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> example("write_parquet",package="arrow")

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

wrt_pr> ## Don't show: 
wrt_pr> if (arrow_with_parquet()) (if (getRversion() >= "3.4") withAutoprint else force)({ # examplesIf
wrt_pr+ ## End(Don't show)
wrt_pr+ tf1 <- tempfile(fileext = ".parquet")
wrt_pr+ write_parquet(data.frame(x = 1:5), tf1)
wrt_pr+ 
wrt_pr+ # using compression
wrt_pr+ if (codec_is_available("gzip")) {
wrt_pr+   tf2 <- tempfile(fileext = ".gz.parquet")
wrt_pr+   write_parquet(data.frame(x = 1:5), tf2, compression = "gzip", compression_level = 5)
wrt_pr+ }
wrt_pr+ ## Don't show: 
wrt_pr+ }) # examplesIf
> tf1 <- tempfile(fileext = ".parquet")
> write_parquet(data.frame(x = 1:5), tf1)
> if (codec_is_available("gzip")) {
+     tf2 <- tempfile(fileext = ".gz.parquet")
+     write_parquet(data.frame(x = 1:5), tf2, compression = "gzip", compression_level = 5)
+ }

wrt_pr> ## End(Don't show)
wrt_pr> 
wrt_pr> 
wrt_pr> 
> 
> 
(base) tdhock@maude-MacBookPro:~/projects/max-generalized-auc(master*)$ 
westonpace commented 1 year ago

This helps a lot. The failing instruction is popcnt. Unfortunately, we've run into trouble with this instruction in the past. I think Arrow currently requires CPU support for popcnt, which can cause problems on older machines. There is some more information here: https://github.com/apache/arrow/issues/21840

I'm a little surprised that this would occur in R/Linux. I thought that R/Linux always built Arrow from source (and the compiler should have realized and avoided the popcnt instruction). Do you know how r-arrow is getting installed? Is it getting a prebuilt binary from somewhere or building a new library?
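
For anyone who wants to check their own machine, here is a minimal sketch, assuming Linux on x86 where /proc/cpuinfo contains a `flags` line. The helper name `has_cpu_flag` is hypothetical (it is not part of Arrow or R); it only tests whether a flag is listed for the first CPU.

```shell
# Hypothetical helper: succeed if a CPU flag appears in a cpuinfo-style file.
# Usage: has_cpu_flag FLAG [FILE]   (FILE defaults to /proc/cpuinfo)
has_cpu_flag() {
  hcf_flag="$1"
  hcf_file="${2:-/proc/cpuinfo}"
  # Take the first "flags" line, keep the part after the colon,
  # put one flag per line, and look for an exact whole-line match.
  grep -m1 '^flags' "$hcf_file" 2>/dev/null \
    | cut -d: -f2 \
    | tr ' \t' '\n\n' \
    | grep -qx "$hcf_flag"
}

if has_cpu_flag popcnt; then
  echo "popcnt: supported"
else
  echo "popcnt: not supported"
fi
```

The flags in the cpuinfo output posted above include sse4_1 but not popcnt, which is consistent with the SIGILL on the popcnt instruction. The exact match matters here: a substring search for `sse4` would also hit `sse4_1`.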

tdhock commented 1 year ago

Hi again, I built arrow from source using g++ 10.1:

> arrow::install_arrow(nightly = TRUE)
trying URL 'https://nightlies.apache.org/arrow/r/src/contrib/arrow_12.0.0.100000037.tar.gz'
Content type 'application/x-gzip' length 3940046 bytes (3.8 MB)
==================================================
downloaded 3.8 MB

Loading required package: grDevices
* installing *source* package ‘arrow’ ...
** using staged installation
Loading required package: grDevices
*** Found libcurl and OpenSSL >= 1.1
PKG_CFLAGS=-D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS
PKG_LIBS=-L/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/lib -L/usr/lib/lib/x86_64-linux-gnu -larrow_acero -larrow_dataset -lparquet -larrow -pthread -larrow_bundled_dependencies -lcurl -lssl -lcrypto  
** libs
using C++ compiler: ‘g++ (GCC) 10.1.0’
using C++17
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c RTasks.cpp -o RTasks.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c altrep.cpp -o altrep.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c array.cpp -o array.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c array_to_vector.cpp -o array_to_vector.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c arraydata.cpp -o arraydata.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c arrowExports.cpp -o arrowExports.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c bridge.cpp -o bridge.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c buffer.cpp -o buffer.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c chunkedarray.cpp -o chunkedarray.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c compression.cpp -o compression.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c compute-exec.cpp -o compute-exec.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c compute.cpp -o compute.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c config.cpp -o config.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c csv.cpp -o csv.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c dataset.cpp -o dataset.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c datatype.cpp -o datatype.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c expression.cpp -o expression.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c extension-impl.cpp -o extension-impl.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c feather.cpp -o feather.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c field.cpp -o field.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c filesystem.cpp -o filesystem.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c io.cpp -o io.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c json.cpp -o json.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c memorypool.cpp -o memorypool.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c message.cpp -o message.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c parquet.cpp -o parquet.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c r_to_arrow.cpp -o r_to_arrow.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c recordbatch.cpp -o recordbatch.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c recordbatchreader.cpp -o recordbatchreader.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c recordbatchwriter.cpp -o recordbatchwriter.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c safe-call-into-r-impl.cpp -o safe-call-into-r-impl.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c scalar.cpp -o scalar.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c schema.cpp -o schema.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c symbols.cpp -o symbols.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c table.cpp -o table.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c threadpool.cpp -o threadpool.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -D_GLIBCXX_USE_CXX11_ABI=0 -DARROW_STATIC -I/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -I/usr/local/include    -fpic  -g -O2  -c type_infer.cpp -o type_infer.o
g++ -std=gnu++17 -shared -L/home/tdhock/lib/R/lib -L/usr/local/lib -o arrow.so RTasks.o altrep.o array.o array_to_vector.o arraydata.o arrowExports.o bridge.o buffer.o chunkedarray.o compression.o compute-exec.o compute.o config.o csv.o dataset.o datatype.o expression.o extension-impl.o feather.o field.o filesystem.o io.o json.o memorypool.o message.o parquet.o r_to_arrow.o recordbatch.o recordbatchreader.o recordbatchwriter.o safe-call-into-r-impl.o scalar.o schema.o symbols.o table.o threadpool.o type_infer.o -L/tmp/RtmpCDfZFY/R.INSTALL2f685c2c2332/arrow/libarrow/arrow-12.0.0.100000037/lib -L/usr/lib/lib/x86_64-linux-gnu -larrow_acero -larrow_dataset -lparquet -larrow -pthread -larrow_bundled_dependencies -lcurl -lssl -lcrypto -L/home/tdhock/lib/R/lib -lR
installing to /home/tdhock/lib/R/library/00LOCK-arrow/00new/arrow/libs
** R
** inst
** byte-compile and prepare package for lazy loading
Loading required package: grDevices
** help
*** installing help indices
** building package indices
Loading required package: grDevices
** testing if installed package can be loaded from temporary location
Loading required package: grDevices
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
Loading required package: grDevices
** testing if installed package keeps a record of temporary installation path
* DONE (arrow)
tdhock commented 1 year ago

Since this is a known issue, and it seems from the other thread that people do not want to support this older hardware, I would expect a more informative error message, such as "Fatal error: arrow::write_dataset() does not support older CPU architectures that do not have the POPCNT instruction". That would save users like me time troubleshooting. Is that possible, please? https://github.com/apache/arrow/issues/21840#issuecomment-1377771846 Also, the other thread mentions several times a patch that would fix this issue. Why has it not been merged?
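
Until such a message exists, a user-side check can at least confirm whether the CPU reports the instruction. The following is a minimal sketch, not something the arrow package provides: the function name cpu_has_flag is hypothetical, and it assumes a Linux system where /proc/cpuinfo lists CPU features on a "flags" line.

```shell
# cpu_has_flag FLAG [FILE]
# Exit status 0 if FLAG appears on the first "flags" line of FILE, 1 otherwise.
# FILE defaults to /proc/cpuinfo (Linux-specific; an assumption of this sketch).
cpu_has_flag() {
    # -m1: first "flags" line only; -w: whole-word match, so asking for
    # "sse4" is not satisfied by "sse4_1".
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -qw -- "$1"
}

# Example pre-flight check before loading arrow on an older machine.
if cpu_has_flag popcnt; then
    echo "POPCNT is available"
else
    echo "POPCNT not reported for this CPU; prebuilt binaries that use it may crash"
fi
```

The optional second argument lets the same function be pointed at a saved copy of another machine's cpuinfo, which can help when comparing several older machines.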

tdhock commented 1 year ago

"the compiler should have realized and avoided the popcnt instruction" -> if this is the case, can you please create a minimal reproducible example program and send it to GCC as an issue they should fix? This is the list of open GCC bugs with keyword popcnt but I do not see any mention of this particular issue, do you? https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=__open__&content=popcnt&no_redirect=1&order=Importance&query_format=specific

tdhock commented 1 year ago

I tried again on another similar machine. I thought I might be able to fix the segfault by telling GCC to compile for my Core 2 processor, so I put CPPFLAGS=-march=core2 (docs: https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html) in my ~/.R/Makevars file. However, I get the same segfault on the popcnt instruction, which seems to suggest that this may be a bug in GCC. What do you think? (I told it to use core2, which does not support popcnt, but it generated the popcnt instruction anyway.)
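
One way to narrow this down is to count popcnt instructions in the installed shared object. The install log reports "Successfully retrieved C++ binaries" before the R bindings are compiled, so it is possible that -march=core2 only reached the files compiled locally and not the downloaded libarrow code that is linked in. The sketch below assumes binutils objdump is installed; the function name count_insn is hypothetical, and the arrow.so path is inferred from the log rather than confirmed.

```shell
# count_insn MNEMONIC
# Reads disassembly text on stdin and prints the number of lines whose
# instruction is MNEMONIC, optionally with an AT&T size suffix (w, l, q).
count_insn() {
    # grep -c prints the count; it exits with status 1 when the count is 0.
    grep -cE "[[:space:]]$1[wlq]?([[:space:]]|\$)"
}

# Example (path assumed from the install log; adjust to your library):
# objdump -d /home/tdhock/lib/R/library/arrow/libs/arrow.so | count_insn popcnt
```

A non-zero count would only show that the instruction is present somewhere in the linked library. It would not by itself say whether it came from the locally compiled objects or from the prebuilt static libraries.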

(base) tdhock@tdhock-MacBook:/tmp/Rtmp8icqQo/downloaded_packages$ ARROW_R_DEV=true R CMD INSTALL arrow_12.0.0.100000037.tar.gz 
Loading required package: grDevices
* installing to library ‘/home/tdhock/lib/R/library’
* installing *source* package ‘arrow’ ...
** using staged installation
Loading required package: grDevices
*** Found libcurl and OpenSSL >= 3.0.0
trying URL 'https://nightlies.apache.org/arrow/r/libarrow/bin/linux-openssl-3.0/arrow-12.0.0.100000037.zip'
Content type 'application/zip' length 39699427 bytes (37.9 MB)
==================================================
downloaded 37.9 MB

*** Successfully retrieved C++ binaries (linux-openssl-3.0)
PKG_CFLAGS=-DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS
PKG_LIBS=-L/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/lib -L/usr/lib/lib/x86_64-linux-gnu -larrow_acero -larrow_dataset -lparquet -larrow -larrow_bundled_dependencies -lcurl -lssl -lcrypto  
** libs
using C++ compiler: ‘g++ (GCC) 13.1.0’
using C++17
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c RTasks.cpp -o RTasks.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c altrep.cpp -o altrep.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c array.cpp -o array.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c array_to_vector.cpp -o array_to_vector.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c arraydata.cpp -o arraydata.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c arrowExports.cpp -o arrowExports.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c bridge.cpp -o bridge.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c buffer.cpp -o buffer.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c chunkedarray.cpp -o chunkedarray.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c compression.cpp -o compression.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c compute-exec.cpp -o compute-exec.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c compute.cpp -o compute.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c config.cpp -o config.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c csv.cpp -o csv.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c dataset.cpp -o dataset.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c datatype.cpp -o datatype.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c expression.cpp -o expression.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c extension-impl.cpp -o extension-impl.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c feather.cpp -o feather.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c field.cpp -o field.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c filesystem.cpp -o filesystem.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c io.cpp -o io.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c json.cpp -o json.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c memorypool.cpp -o memorypool.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c message.cpp -o message.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c parquet.cpp -o parquet.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c r_to_arrow.cpp -o r_to_arrow.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c recordbatch.cpp -o recordbatch.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c recordbatchreader.cpp -o recordbatchreader.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c recordbatchwriter.cpp -o recordbatchwriter.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c safe-call-into-r-impl.cpp -o safe-call-into-r-impl.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c scalar.cpp -o scalar.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c schema.cpp -o schema.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c symbols.cpp -o symbols.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c table.cpp -o table.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c threadpool.cpp -o threadpool.o
g++ -std=gnu++17 -I"/home/tdhock/lib/R/include" -DNDEBUG -DARROW_STATIC -I/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/include -I/usr/lib/include/x86_64-linux-gnu -I/usr/lib/include  -DARROW_R_WITH_PARQUET -DARROW_R_WITH_DATASET -DARROW_R_WITH_ACERO -DARROW_R_WITH_JSON -DARROW_R_WITH_S3 -DARROW_R_WITH_GCS -I'/home/tdhock/lib/R/library/cpp11/include' -march=core2    -fpic  -g -O2  -c type_infer.cpp -o type_infer.o
g++ -std=gnu++17 -shared -L/home/tdhock/lib/R/lib -L/usr/local/lib -o arrow.so RTasks.o altrep.o array.o array_to_vector.o arraydata.o arrowExports.o bridge.o buffer.o chunkedarray.o compression.o compute-exec.o compute.o config.o csv.o dataset.o datatype.o expression.o extension-impl.o feather.o field.o filesystem.o io.o json.o memorypool.o message.o parquet.o r_to_arrow.o recordbatch.o recordbatchreader.o recordbatchwriter.o safe-call-into-r-impl.o scalar.o schema.o symbols.o table.o threadpool.o type_infer.o -L/tmp/RtmpKjpWUw/R.INSTALL7256197605d7/arrow/libarrow/arrow-12.0.0.100000037/lib -L/usr/lib/lib/x86_64-linux-gnu -larrow_acero -larrow_dataset -lparquet -larrow -larrow_bundled_dependencies -lcurl -lssl -lcrypto -L/home/tdhock/lib/R/lib -lR
installing to /home/tdhock/lib/R/library/00LOCK-arrow/00new/arrow/libs
** R
** inst
** byte-compile and prepare package for lazy loading
Loading required package: grDevices
** help
*** installing help indices
** building package indices
Loading required package: grDevices
** testing if installed package can be loaded from temporary location
Loading required package: grDevices
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
Loading required package: grDevices
** testing if installed package keeps a record of temporary installation path
* DONE (arrow)
(base) tdhock@tdhock-MacBook:/tmp/Rtmp8icqQo/downloaded_packages$ R --vanilla -e 'example("write_dataset",package="arrow")'

R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> example("write_dataset",package="arrow")

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

wrt_dt> ## Don't show: 
wrt_dt> if (arrow_with_dataset() & arrow_with_parquet() & requireNamespace("dplyr", quietly = TRUE)) (if (getRversion() >= "3.4") withAutoprint else force)({ # examplesIf
wrt_dt+ ## End(Don't show)
wrt_dt+ # You can write datasets partitioned by the values in a column (here: "cyl").
wrt_dt+ # This creates a structure of the form cyl=X/part-Z.parquet.
wrt_dt+ one_level_tree <- tempfile()
wrt_dt+ write_dataset(mtcars, one_level_tree, partitioning = "cyl")
wrt_dt+ list.files(one_level_tree, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # You can also partition by the values in multiple columns
wrt_dt+ # (here: "cyl" and "gear").
wrt_dt+ # This creates a structure of the form cyl=X/gear=Y/part-Z.parquet.
wrt_dt+ two_levels_tree <- tempfile()
wrt_dt+ write_dataset(mtcars, two_levels_tree, partitioning = c("cyl", "gear"))
wrt_dt+ list.files(two_levels_tree, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # In the two previous examples we would have:
wrt_dt+ # X = {4,6,8}, the number of cylinders.
wrt_dt+ # Y = {3,4,5}, the number of forward gears.
wrt_dt+ # Z = {0,1,2}, the number of saved parts, starting from 0.
wrt_dt+ 
wrt_dt+ # You can obtain the same result as the previous examples using arrow with
wrt_dt+ # a dplyr pipeline. This will be the same as two_levels_tree above, but the
wrt_dt+ # output directory will be different.
wrt_dt+ library(dplyr)
wrt_dt+ two_levels_tree_2 <- tempfile()
wrt_dt+ mtcars %>%
wrt_dt+   group_by(cyl, gear) %>%
wrt_dt+   write_dataset(two_levels_tree_2)
wrt_dt+ list.files(two_levels_tree_2, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # And you can also turn off the Hive-style directory naming where the column
wrt_dt+ # name is included with the values by using `hive_style = FALSE`.
wrt_dt+ 
wrt_dt+ # Write a structure X/Y/part-Z.parquet.
wrt_dt+ two_levels_tree_no_hive <- tempfile()
wrt_dt+ mtcars %>%
wrt_dt+   group_by(cyl, gear) %>%
wrt_dt+   write_dataset(two_levels_tree_no_hive, hive_style = FALSE)
wrt_dt+ list.files(two_levels_tree_no_hive, recursive = TRUE)
wrt_dt+ ## Don't show: 
wrt_dt+ }) # examplesIf
> one_level_tree <- tempfile()
> write_dataset(mtcars, one_level_tree, partitioning = "cyl")

 *** caught illegal operation ***
address 0x7f8cb6c1faa7, cause 'illegal operand'

Traceback:
 1: ExecPlan_Write(self, node, prepare_key_value_metadata(node$final_metadata()),     ...)
 2: plan$Write(final_node, options, path_and_fs$fs, path_and_fs$path,     partitioning, basename_template, existing_data_behavior,     max_partitions, max_open_files, max_rows_per_file, min_rows_per_group,     max_rows_per_group)
 3: write_dataset(mtcars, one_level_tree, partitioning = "cyl")
 4: eval(ei, envir)
 5: eval(ei, envir)
 6: withVisible(eval(ei, envir))
 7: source(exprs = exprs, local = local, print.eval = print., echo = echo,     max.deparse.length = max.deparse.length, width.cutoff = width.cutoff,     deparseCtrl = deparseCtrl, ...)
 8: (if (getRversion() >= "3.4") withAutoprint else force)({    one_level_tree <- tempfile()    write_dataset(mtcars, one_level_tree, partitioning = "cyl")    list.files(one_level_tree, recursive = TRUE)    two_levels_tree <- tempfile()    write_dataset(mtcars, two_levels_tree, partitioning = c("cyl",         "gear"))    list.files(two_levels_tree, recursive = TRUE)    library(dplyr)    two_levels_tree_2 <- tempfile()    mtcars %>% group_by(cyl, gear) %>% write_dataset(two_levels_tree_2)    list.files(two_levels_tree_2, recursive = TRUE)    two_levels_tree_no_hive <- tempfile()    mtcars %>% group_by(cyl, gear) %>% write_dataset(two_levels_tree_no_hive,         hive_style = FALSE)    list.files(two_levels_tree_no_hive, recursive = TRUE)})
 9: eval(ei, envir)
10: eval(ei, envir)
11: withVisible(eval(ei, envir))
12: source(tf, local, echo = echo, prompt.echo = paste0(prompt.prefix,     getOption("prompt")), continue.echo = paste0(prompt.prefix,     getOption("continue")), verbose = verbose, max.deparse.length = Inf,     encoding = "UTF-8", skip.echo = skips, keep.source = TRUE)
13: example("write_dataset", package = "arrow")
An irrecoverable exception occurred. R is aborting now ...
Illegal instruction (core dumped)
(base) tdhock@tdhock-MacBook:/tmp/Rtmp8icqQo/downloaded_packages$ R -d gdb
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/tdhock/lib/R/bin/exec/R...
(gdb) run
Starting program: /home/tdhock/lib/R/bin/exec/R 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/x86_64-linux-gnu/libthread_db.so.1".

R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Loading required package: grDevices
[Detaching after vfork from child process 30368]
[Detaching after vfork from child process 30370]
> example("write_dataset",package="arrow")
[New Thread 0x7fffee041640 (LWP 30373)]
[New Thread 0x7fffe8bff640 (LWP 30374)]

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

wrt_dt> ## Don't show: 
wrt_dt> if (arrow_with_dataset() & arrow_with_parquet() & requireNamespace("dplyr", quietly = TRUE)) (if (getRversion() >= "3.4") withAutoprint else force)({ # examplesIf
wrt_dt+ ## End(Don't show)
wrt_dt+ # You can write datasets partitioned by the values in a column (here: "cyl").
wrt_dt+ # This creates a structure of the form cyl=X/part-Z.parquet.
wrt_dt+ one_level_tree <- tempfile()
wrt_dt+ write_dataset(mtcars, one_level_tree, partitioning = "cyl")
wrt_dt+ list.files(one_level_tree, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # You can also partition by the values in multiple columns
wrt_dt+ # (here: "cyl" and "gear").
wrt_dt+ # This creates a structure of the form cyl=X/gear=Y/part-Z.parquet.
wrt_dt+ two_levels_tree <- tempfile()
wrt_dt+ write_dataset(mtcars, two_levels_tree, partitioning = c("cyl", "gear"))
wrt_dt+ list.files(two_levels_tree, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # In the two previous examples we would have:
wrt_dt+ # X = {4,6,8}, the number of cylinders.
wrt_dt+ # Y = {3,4,5}, the number of forward gears.
wrt_dt+ # Z = {0,1,2}, the number of saved parts, starting from 0.
wrt_dt+ 
wrt_dt+ # You can obtain the same result as the previous examples using arrow with
wrt_dt+ # a dplyr pipeline. This will be the same as two_levels_tree above, but the
wrt_dt+ # output directory will be different.
wrt_dt+ library(dplyr)
wrt_dt+ two_levels_tree_2 <- tempfile()
wrt_dt+ mtcars %>%
wrt_dt+   group_by(cyl, gear) %>%
wrt_dt+   write_dataset(two_levels_tree_2)
wrt_dt+ list.files(two_levels_tree_2, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # And you can also turn off the Hive-style directory naming where the column
wrt_dt+ # name is included with the values by using `hive_style = FALSE`.
wrt_dt+ 
wrt_dt+ # Write a structure X/Y/part-Z.parquet.
wrt_dt+ two_levels_tree_no_hive <- tempfile()
wrt_dt+ mtcars %>%
wrt_dt+   group_by(cyl, gear) %>%
wrt_dt+   write_dataset(two_levels_tree_no_hive, hive_style = FALSE)
wrt_dt+ list.files(two_levels_tree_no_hive, recursive = TRUE)
wrt_dt+ ## Don't show: 
wrt_dt+ }) # examplesIf
> one_level_tree <- tempfile()
> write_dataset(mtcars, one_level_tree, partitioning = "cyl")
[New Thread 0x7fffe3fff640 (LWP 30375)]
[New Thread 0x7fffe366f640 (LWP 30376)]
[New Thread 0x7fffe2cdf640 (LWP 30377)]
[New Thread 0x7fffe234f640 (LWP 30378)]

Thread 4 "R" received signal SIGILL, Illegal instruction.
[Switching to Thread 0x7fffe3fff640 (LWP 30375)]
0x00007fffeb5aeaa7 in arrow::compute::RowTableMetadata::FromColumnMetadataVector(std::vector<arrow::compute::KeyColumnMetadata, std::allocator<arrow::compute::KeyColumnMetadata> > const&, int, int) () from /home/tdhock/lib/R/library/arrow/libs/arrow.so
(gdb) disassemble
Dump of assembler code for function _ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii:
   0x00007fffeb5ae780 <+0>: endbr64 
   0x00007fffeb5ae784 <+4>: push   %r15
   0x00007fffeb5ae786 <+6>: movd   %edx,%xmm3
   0x00007fffeb5ae78a <+10>:    push   %r14
   0x00007fffeb5ae78c <+12>:    pinsrd $0x1,%ecx,%xmm3
   0x00007fffeb5ae792 <+18>:    push   %r13
   0x00007fffeb5ae794 <+20>:    push   %r12
   0x00007fffeb5ae796 <+22>:    push   %rbp
   0x00007fffeb5ae797 <+23>:    push   %rbx
   0x00007fffeb5ae798 <+24>:    mov    %rdi,%rbx
   0x00007fffeb5ae79b <+27>:    sub    $0x28,%rsp
   0x00007fffeb5ae79f <+31>:    mov    0x8(%rsi),%rax
   0x00007fffeb5ae7a3 <+35>:    mov    (%rsi),%rcx
   0x00007fffeb5ae7a6 <+38>:    mov    0x20(%rbx),%r8
   0x00007fffeb5ae7aa <+42>:    mov    0x18(%rbx),%rdx
   0x00007fffeb5ae7ae <+46>:    mov    %rsi,0x8(%rsp)
   0x00007fffeb5ae7b3 <+51>:    mov    %rax,0x18(%rsp)
   0x00007fffeb5ae7b8 <+56>:    sub    %rcx,%rax
   0x00007fffeb5ae7bb <+59>:    mov    %r8,%rsi
   0x00007fffeb5ae7be <+62>:    mov    %rax,%rdi
   0x00007fffeb5ae7c1 <+65>:    movq   %xmm3,0x10(%rsp)
   0x00007fffeb5ae7c7 <+71>:    sub    %rdx,%rsi
   0x00007fffeb5ae7ca <+74>:    sar    $0x3,%rdi
   0x00007fffeb5ae7ce <+78>:    cmp    %rsi,%rax
   0x00007fffeb5ae7d1 <+81>:    ja     0x7fffeb5aec90 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1296>
   0x00007fffeb5ae7d7 <+87>:    mov    %rdi,%r12
   0x00007fffeb5ae7da <+90>:    jae    0x7fffeb5ae803 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+131>
   0x00007fffeb5ae7dc <+92>:    add    %rax,%rdx
   0x00007fffeb5ae7df <+95>:    cmp    %rdx,%r8
   0x00007fffeb5ae7e2 <+98>:    je     0x7fffeb5ae803 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+131>
   0x00007fffeb5ae7e4 <+100>:   mov    %rdx,0x20(%rbx)
   0x00007fffeb5ae7e8 <+104>:   mov    0x8(%rsp),%rax
   0x00007fffeb5ae7ed <+109>:   mov    0x8(%rax),%rax
   0x00007fffeb5ae7f1 <+113>:   mov    %rax,%rdi
   0x00007fffeb5ae7f4 <+116>:   mov    %rax,0x18(%rsp)
   0x00007fffeb5ae7f9 <+121>:   sub    %rcx,%rdi
   0x00007fffeb5ae7fc <+124>:   sar    $0x3,%rdi
   0x00007fffeb5ae800 <+128>:   mov    %rdi,%r12
   0x00007fffeb5ae803 <+131>:   mov    0x38(%rbx),%rax
   0x00007fffeb5ae807 <+135>:   mov    0x30(%rbx),%r13
   0x00007fffeb5ae80b <+139>:   mov    %rax,%r8
   0x00007fffeb5ae80e <+142>:   mov    %rax,%r15
   0x00007fffeb5ae811 <+145>:   sub    %r13,%r8
   0x00007fffeb5ae814 <+148>:   sar    $0x2,%r8
   0x00007fffeb5ae818 <+152>:   test   %r12,%r12
   0x00007fffeb5ae81b <+155>:   je     0x7fffeb5aebb6 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1078>
   0x00007fffeb5ae821 <+161>:   mov    0x18(%rbx),%rsi
   0x00007fffeb5ae825 <+165>:   xor    %eax,%eax
   0x00007fffeb5ae827 <+167>:   nopw   0x0(%rax,%rax,1)
   0x00007fffeb5ae830 <+176>:   mov    (%rcx,%rax,8),%rdx
   0x00007fffeb5ae834 <+180>:   mov    %rdx,(%rsi,%rax,8)
   0x00007fffeb5ae838 <+184>:   add    $0x1,%rax
   0x00007fffeb5ae83c <+188>:   cmp    %r12,%rax
   0x00007fffeb5ae83f <+191>:   jne    0x7fffeb5ae830 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+176>
   0x00007fffeb5ae841 <+193>:   mov    %edi,%r12d
   0x00007fffeb5ae844 <+196>:   mov    %edi,%ebp
   0x00007fffeb5ae846 <+198>:   cmp    %r8,%r12
   0x00007fffeb5ae849 <+201>:   ja     0x7fffeb5aecb8 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1336>
   0x00007fffeb5ae84f <+207>:   jb     0x7fffeb5aebc8 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1096>
   0x00007fffeb5ae855 <+213>:   test   %ebp,%ebp
   0x00007fffeb5ae857 <+215>:   je     0x7fffeb5ae8be <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+318>
   0x00007fffeb5ae859 <+217>:   lea    -0x1(%rbp),%eax
   0x00007fffeb5ae85c <+220>:   cmp    $0x2,%eax
   0x00007fffeb5ae85f <+223>:   jbe    0x7fffeb5aecd4 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1364>
   0x00007fffeb5ae865 <+229>:   mov    %ebp,%edx
   0x00007fffeb5ae867 <+231>:   movdqa 0x11b97d1(%rip),%xmm0        # 0x7fffec768040
   0x00007fffeb5ae86f <+239>:   movdqa 0x11b97d9(%rip),%xmm2        # 0x7fffec768050
   0x00007fffeb5ae877 <+247>:   mov    %r13,%rax
   0x00007fffeb5ae87a <+250>:   shr    $0x2,%edx
   0x00007fffeb5ae87d <+253>:   sub    $0x1,%edx
   0x00007fffeb5ae880 <+256>:   shl    $0x4,%rdx
   0x00007fffeb5ae884 <+260>:   lea    0x10(%r13,%rdx,1),%rdx
   0x00007fffeb5ae889 <+265>:   nopl   0x0(%rax)
   0x00007fffeb5ae890 <+272>:   movdqa %xmm0,%xmm1
   0x00007fffeb5ae894 <+276>:   add    $0x10,%rax
   0x00007fffeb5ae898 <+280>:   paddd  %xmm2,%xmm0
   0x00007fffeb5ae89c <+284>:   movups %xmm1,-0x10(%rax)
   0x00007fffeb5ae8a0 <+288>:   cmp    %rdx,%rax
   0x00007fffeb5ae8a3 <+291>:   jne    0x7fffeb5ae890 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+272>
   0x00007fffeb5ae8a5 <+293>:   mov    %ebp,%eax
   0x00007fffeb5ae8a7 <+295>:   and    $0xfffffffc,%eax
   0x00007fffeb5ae8aa <+298>:   test   $0x3,%bpl
   0x00007fffeb5ae8ae <+302>:   je     0x7fffeb5ae8be <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+318>
   0x00007fffeb5ae8b0 <+304>:   mov    %eax,%edx
   0x00007fffeb5ae8b2 <+306>:   mov    %eax,0x0(%r13,%rdx,4)
   0x00007fffeb5ae8b7 <+311>:   add    $0x1,%eax
   0x00007fffeb5ae8ba <+314>:   cmp    %eax,%ebp
   0x00007fffeb5ae8bc <+316>:   ja     0x7fffeb5ae8b0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+304>
   0x00007fffeb5ae8be <+318>:   cmp    %r13,%r15
   0x00007fffeb5ae8c1 <+321>:   je     0x7fffeb5aea00 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+640>
   0x00007fffeb5ae8c7 <+327>:   mov    %r15,%r14
   0x00007fffeb5ae8ca <+330>:   mov    $0x3f,%edx
   0x00007fffeb5ae8cf <+335>:   mov    0x8(%rsp),%rcx
   0x00007fffeb5ae8d4 <+340>:   mov    %r15,%rsi
   0x00007fffeb5ae8d7 <+343>:   sub    %r13,%r14
   0x00007fffeb5ae8da <+346>:   mov    %r13,%rdi
   0x00007fffeb5ae8dd <+349>:   mov    %r14,%rax
   0x00007fffeb5ae8e0 <+352>:   sar    $0x2,%rax
   0x00007fffeb5ae8e4 <+356>:   bsr    %rax,%rax
   0x00007fffeb5ae8e8 <+360>:   xor    $0x3f,%rax
   0x00007fffeb5ae8ec <+364>:   sub    %eax,%edx
   0x00007fffeb5ae8ee <+366>:   movslq %edx,%rdx
   0x00007fffeb5ae8f1 <+369>:   add    %rdx,%rdx
   0x00007fffeb5ae8f4 <+372>:   call   0x7fffeb5ac490 <_ZSt16__introsort_loopIN9__gnu_cxx17__normal_iteratorIPjSt6vectorIjSaIjEEEElNS0_5__ops15_Iter_comp_iterIZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKS3_INSA_17KeyColumnMetadataESaISC_EEiiEUljjE_EEEvT_SJ_T0_T1_>
   0x00007fffeb5ae8f9 <+377>:   cmp    $0x40,%r14
   0x00007fffeb5ae8fd <+381>:   jle    0x7fffeb5ae9f0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+624>
   0x00007fffeb5ae903 <+387>:   lea    0x40(%r13),%r14
   0x00007fffeb5ae907 <+391>:   mov    0x8(%rsp),%rdx
   0x00007fffeb5ae90c <+396>:   mov    %r13,%rdi
   0x00007fffeb5ae90f <+399>:   mov    %r14,%rsi
   0x00007fffeb5ae912 <+402>:   call   0x7fffeb5ac200 <_ZSt16__insertion_sortIN9__gnu_cxx17__normal_iteratorIPjSt6vectorIjSaIjEEEENS0_5__ops15_Iter_comp_iterIZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKS3_INSA_17KeyColumnMetadataESaISC_EEiiEUljjE_EEEvT_SJ_T0_>
   0x00007fffeb5ae917 <+407>:   cmp    %r15,%r14
   0x00007fffeb5ae91a <+410>:   je     0x7fffeb5aea00 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+640>
   0x00007fffeb5ae920 <+416>:   mov    0x8(%rsp),%rax
   0x00007fffeb5ae925 <+421>:   mov    %ebp,0x18(%rsp)
   0x00007fffeb5ae929 <+425>:   mov    (%rax),%r13
   0x00007fffeb5ae92c <+428>:   nopl   0x0(%rax)
   0x00007fffeb5ae930 <+432>:   mov    (%r14),%eax
   0x00007fffeb5ae933 <+435>:   lea    0x0(%r13,%rax,8),%r11
   0x00007fffeb5ae938 <+440>:   mov    %rax,%rbp
   0x00007fffeb5ae93b <+443>:   mov    %r14,%rax
   0x00007fffeb5ae93e <+446>:   movzbl (%r11),%r8d
   0x00007fffeb5ae942 <+450>:   mov    -0x4(%rax),%ecx
   0x00007fffeb5ae945 <+453>:   test   %r8b,%r8b
   0x00007fffeb5ae948 <+456>:   je     0x7fffeb5ae9c5 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+581>
   0x00007fffeb5ae94a <+458>:   nopw   0x0(%rax,%rax,1)
   0x00007fffeb5ae950 <+464>:   mov    %ecx,%esi
   0x00007fffeb5ae952 <+466>:   mov    0x4(%r11),%edx
   0x00007fffeb5ae956 <+470>:   lea    0x0(%r13,%rsi,8),%r9
   0x00007fffeb5ae95b <+475>:   movzbl (%r9),%esi
   0x00007fffeb5ae95f <+479>:   popcnt %rdx,%rdx
   0x00007fffeb5ae964 <+484>:   cmp    $0x1,%edx
   0x00007fffeb5ae967 <+487>:   setle  %dl
   0x00007fffeb5ae96a <+490>:   test   %sil,%sil
   0x00007fffeb5ae96d <+493>:   je     0x7fffeb5aec10 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1168>
   0x00007fffeb5ae973 <+499>:   mov    0x4(%r9),%esi
   0x00007fffeb5ae977 <+503>:   mov    0x4(%r11),%r10d
   0x00007fffeb5ae97b <+507>:   mov    0x4(%r9),%r9d
   0x00007fffeb5ae97f <+511>:   popcnt %rsi,%rsi
   0x00007fffeb5ae984 <+516>:   cmp    $0x1,%esi
   0x00007fffeb5ae987 <+519>:   setle  %dil
   0x00007fffeb5ae98b <+523>:   mov    $0x1,%esi
   0x00007fffeb5ae990 <+528>:   cmp    %dil,%dl
   0x00007fffeb5ae993 <+531>:   jne    0x7fffeb5ae9af <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+559>
   0x00007fffeb5ae995 <+533>:   test   %dl,%dl
   0x00007fffeb5ae997 <+535>:   je     0x7fffeb5aebe8 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1128>
   0x00007fffeb5ae99d <+541>:   cmp    %r10d,%r9d
   0x00007fffeb5ae9a0 <+544>:   jne    0x7fffeb5ae9ac <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+556>
   0x00007fffeb5ae9a2 <+546>:   mov    %r8d,%edx
   0x00007fffeb5ae9a5 <+549>:   cmp    %r8b,%sil
   0x00007fffeb5ae9a8 <+552>:   jne    0x7fffeb5ae9af <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+559>
   0x00007fffeb5ae9aa <+554>:   cmp    %ecx,%ebp
   0x00007fffeb5ae9ac <+556>:   setb   %dl
   0x00007fffeb5ae9af <+559>:   test   %dl,%dl
   0x00007fffeb5ae9b1 <+561>:   je     0x7fffeb5aebf0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1136>
   0x00007fffeb5ae9b7 <+567>:   mov    %ecx,(%rax)
   0x00007fffeb5ae9b9 <+569>:   sub    $0x4,%rax
   0x00007fffeb5ae9bd <+573>:   mov    -0x4(%rax),%ecx
   0x00007fffeb5ae9c0 <+576>:   test   %r8b,%r8b
   0x00007fffeb5ae9c3 <+579>:   jne    0x7fffeb5ae950 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+464>
   0x00007fffeb5ae9c5 <+581>:   mov    %ecx,%edx
   0x00007fffeb5ae9c7 <+583>:   lea    0x0(%r13,%rdx,8),%rdx
   0x00007fffeb5ae9cc <+588>:   movzbl (%rdx),%esi
   0x00007fffeb5ae9cf <+591>:   test   %sil,%sil
   0x00007fffeb5ae9d2 <+594>:   jne    0x7fffeb5aec28 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1192>
   0x00007fffeb5ae9d8 <+600>:   mov    $0x1,%edx
   0x00007fffeb5ae9dd <+605>:   mov    $0x4,%r10d
   0x00007fffeb5ae9e3 <+611>:   mov    $0x1,%edi
   0x00007fffeb5ae9e8 <+616>:   mov    $0x4,%r9d
   0x00007fffeb5ae9ee <+622>:   jmp    0x7fffeb5ae990 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+528>
   0x00007fffeb5ae9f0 <+624>:   mov    0x8(%rsp),%rdx
   0x00007fffeb5ae9f5 <+629>:   mov    %r15,%rsi
   0x00007fffeb5ae9f8 <+632>:   mov    %r13,%rdi
   0x00007fffeb5ae9fb <+635>:   call   0x7fffeb5ac200 <_ZSt16__insertion_sortIN9__gnu_cxx17__normal_iteratorIPjSt6vectorIjSaIjEEEENS0_5__ops15_Iter_comp_iterIZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKS3_INSA_17KeyColumnMetadataESaISC_EEiiEUljjE_EEEvT_SJ_T0_>
   0x00007fffeb5aea00 <+640>:   mov    0x50(%rbx),%rdx
   0x00007fffeb5aea04 <+644>:   mov    0x48(%rbx),%rcx
   0x00007fffeb5aea08 <+648>:   mov    %rdx,%rax
   0x00007fffeb5aea0b <+651>:   sub    %rcx,%rax
   0x00007fffeb5aea0e <+654>:   sar    $0x2,%rax
   0x00007fffeb5aea12 <+658>:   cmp    %r12,%rax
   0x00007fffeb5aea15 <+661>:   jb     0x7fffeb5aec78 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1272>
   0x00007fffeb5aea1b <+667>:   ja     0x7fffeb5aeba0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1056>
   0x00007fffeb5aea21 <+673>:   test   %ebp,%ebp
   0x00007fffeb5aea23 <+675>:   je     0x7fffeb5aea47 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+711>
   0x00007fffeb5aea25 <+677>:   mov    0x30(%rbx),%rdi
   0x00007fffeb5aea29 <+681>:   mov    0x48(%rbx),%rsi
   0x00007fffeb5aea2d <+685>:   mov    %ebp,%ecx
   0x00007fffeb5aea2f <+687>:   xor    %eax,%eax
   0x00007fffeb5aea31 <+689>:   nopl   0x0(%rax)
   0x00007fffeb5aea38 <+696>:   mov    (%rdi,%rax,4),%edx
   0x00007fffeb5aea3b <+699>:   mov    %eax,(%rsi,%rdx,4)
   0x00007fffeb5aea3e <+702>:   add    $0x1,%rax
   0x00007fffeb5aea42 <+706>:   cmp    %rcx,%rax
   0x00007fffeb5aea45 <+709>:   jne    0x7fffeb5aea38 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+696>
   0x00007fffeb5aea47 <+711>:   mov    0x10(%rsp),%rax
   0x00007fffeb5aea4c <+716>:   mov    0x68(%rbx),%rdx
   0x00007fffeb5aea50 <+720>:   movl   $0x0,0x8(%rbx)
   0x00007fffeb5aea57 <+727>:   mov    0x60(%rbx),%rcx
   0x00007fffeb5aea5b <+731>:   mov    %rax,0x10(%rbx)
   0x00007fffeb5aea5f <+735>:   mov    %rdx,%rax
   0x00007fffeb5aea62 <+738>:   sub    %rcx,%rax
   0x00007fffeb5aea65 <+741>:   sar    $0x2,%rax
   0x00007fffeb5aea69 <+745>:   cmp    %r12,%rax
   0x00007fffeb5aea6c <+748>:   jb     0x7fffeb5aec60 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1248>
   0x00007fffeb5aea72 <+754>:   ja     0x7fffeb5aeb60 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+992>
   0x00007fffeb5aea78 <+760>:   test   %ebp,%ebp
   0x00007fffeb5aea7a <+762>:   je     0x7fffeb5aeb79 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1017>
   0x00007fffeb5aea80 <+768>:   mov    0x8(%rsp),%rax
   0x00007fffeb5aea85 <+773>:   mov    %ebp,%r9d
   0x00007fffeb5aea88 <+776>:   mov    0x30(%rbx),%r11
   0x00007fffeb5aea8c <+780>:   xor    %edx,%edx
   0x00007fffeb5aea8e <+782>:   mov    0x60(%rbx),%r10
   0x00007fffeb5aea92 <+786>:   shl    $0x2,%r9
   0x00007fffeb5aea96 <+790>:   xor    %r8d,%r8d
   0x00007fffeb5aea99 <+793>:   mov    (%rax),%r12
   0x00007fffeb5aea9c <+796>:   xor    %eax,%eax
   0x00007fffeb5aea9e <+798>:   jmp    0x7fffeb5aeadb <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+859>
   0x00007fffeb5aeaa0 <+800>:   mov    0x4(%rcx),%esi
   0x00007fffeb5aeaa3 <+803>:   test   %esi,%esi
   0x00007fffeb5aeaa5 <+805>:   je     0x7fffeb5aeac2 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+834>
=> 0x00007fffeb5aeaa7 <+807>:   popcnt %rsi,%rsi
   0x00007fffeb5aeaac <+812>:   cmp    $0x1,%esi
   0x00007fffeb5aeaaf <+815>:   je     0x7fffeb5aeac2 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+834>
   0x00007fffeb5aeab1 <+817>:   mov    0x14(%rbx),%esi
   0x00007fffeb5aeab4 <+820>:   mov    %eax,%r13d
   0x00007fffeb5aeab7 <+823>:   neg    %r13d
   0x00007fffeb5aeaba <+826>:   sub    $0x1,%esi
   0x00007fffeb5aeabd <+829>:   and    %r13d,%esi
   0x00007fffeb5aeac0 <+832>:   add    %esi,%eax
   0x00007fffeb5aeac2 <+834>:   mov    %eax,(%rdi)
   0x00007fffeb5aeac4 <+836>:   mov    0x4(%rcx),%ecx
   0x00007fffeb5aeac7 <+839>:   lea    (%rax,%rcx,1),%esi
   0x00007fffeb5aeaca <+842>:   add    $0x1,%eax
   0x00007fffeb5aeacd <+845>:   test   %ecx,%ecx
   0x00007fffeb5aeacf <+847>:   cmovne %esi,%eax
   0x00007fffeb5aead2 <+850>:   add    $0x4,%rdx
   0x00007fffeb5aead6 <+854>:   cmp    %r9,%rdx
   0x00007fffeb5aead9 <+857>:   je     0x7fffeb5aeb06 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+902>
   0x00007fffeb5aeadb <+859>:   mov    (%r11,%rdx,1),%ecx
   0x00007fffeb5aeadf <+863>:   lea    (%r10,%rdx,1),%rdi
   0x00007fffeb5aeae3 <+867>:   lea    (%r12,%rcx,8),%rcx
   0x00007fffeb5aeae7 <+871>:   cmpb   $0x0,(%rcx)
   0x00007fffeb5aeaea <+874>:   jne    0x7fffeb5aeaa0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+800>
   0x00007fffeb5aeaec <+876>:   mov    %eax,(%rdi)
   0x00007fffeb5aeaee <+878>:   test   %r8d,%r8d
   0x00007fffeb5aeaf1 <+881>:   jne    0x7fffeb5aeaf6 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+886>
   0x00007fffeb5aeaf3 <+883>:   mov    %eax,0x8(%rbx)
   0x00007fffeb5aeaf6 <+886>:   add    $0x4,%rdx
   0x00007fffeb5aeafa <+890>:   add    $0x1,%r8d
   0x00007fffeb5aeafe <+894>:   add    $0x4,%eax
   0x00007fffeb5aeb01 <+897>:   cmp    %r9,%rdx
   0x00007fffeb5aeb04 <+900>:   jne    0x7fffeb5aeadb <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+859>
   0x00007fffeb5aeb06 <+902>:   test   %r8d,%r8d
   0x00007fffeb5aeb09 <+905>:   mov    %eax,%ecx
   0x00007fffeb5aeb0b <+907>:   sete   (%rbx)
   0x00007fffeb5aeb0e <+910>:   neg    %ecx
   0x00007fffeb5aeb10 <+912>:   test   %r8d,%r8d
   0x00007fffeb5aeb13 <+915>:   je     0x7fffeb5aec50 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+1232>
   0x00007fffeb5aeb19 <+921>:   mov    0x14(%rbx),%edi
   0x00007fffeb5aeb1c <+924>:   lea    -0x1(%rdi),%edx
   0x00007fffeb5aeb1f <+927>:   and    %ecx,%edx
   0x00007fffeb5aeb21 <+929>:   add    %edx,%eax
   0x00007fffeb5aeb23 <+931>:   mov    %eax,0x4(%rbx)
   0x00007fffeb5aeb26 <+934>:   movl   $0x1,0xc(%rbx)
   0x00007fffeb5aeb2d <+941>:   cmp    $0x8,%ebp
   0x00007fffeb5aeb30 <+944>:   jbe    0x7fffeb5aeb4e <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+974>
   0x00007fffeb5aeb32 <+946>:   mov    $0x1,%edx
   0x00007fffeb5aeb37 <+951>:   nopw   0x0(%rax,%rax,1)
   0x00007fffeb5aeb40 <+960>:   mov    %edx,%eax
   0x00007fffeb5aeb42 <+962>:   add    %edx,%edx
   0x00007fffeb5aeb44 <+964>:   shl    $0x4,%eax
   0x00007fffeb5aeb47 <+967>:   cmp    %ebp,%eax
   0x00007fffeb5aeb49 <+969>:   jb     0x7fffeb5aeb40 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+960>
   0x00007fffeb5aeb4b <+971>:   mov    %edx,0xc(%rbx)
   0x00007fffeb5aeb4e <+974>:   add    $0x28,%rsp
   0x00007fffeb5aeb52 <+978>:   pop    %rbx
   0x00007fffeb5aeb53 <+979>:   pop    %rbp
   0x00007fffeb5aeb54 <+980>:   pop    %r12
   0x00007fffeb5aeb56 <+982>:   pop    %r13
   0x00007fffeb5aeb58 <+984>:   pop    %r14
   0x00007fffeb5aeb5a <+986>:   pop    %r15
   0x00007fffeb5aeb5c <+988>:   ret    
   0x00007fffeb5aeb5d <+989>:   nopl   (%rax)
   0x00007fffeb5aeb60 <+992>:   lea    (%rcx,%r12,4),%rax
   0x00007fffeb5aeb64 <+996>:   cmp    %rax,%rdx
   0x00007fffeb5aeb67 <+999>:   je     0x7fffeb5aea78 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+760>
   0x00007fffeb5aeb6d <+1005>:  mov    %rax,0x68(%rbx)
   0x00007fffeb5aeb71 <+1009>:  test   %ebp,%ebp
   0x00007fffeb5aeb73 <+1011>:  jne    0x7fffeb5aea80 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+768>
   0x00007fffeb5aeb79 <+1017>:  movb   $0x1,(%rbx)
   0x00007fffeb5aeb7c <+1020>:  movl   $0x0,0x4(%rbx)
   0x00007fffeb5aeb83 <+1027>:  movl   $0x1,0xc(%rbx)
   0x00007fffeb5aeb8a <+1034>:  add    $0x28,%rsp
   0x00007fffeb5aeb8e <+1038>:  pop    %rbx
   0x00007fffeb5aeb8f <+1039>:  pop    %rbp
   0x00007fffeb5aeb90 <+1040>:  pop    %r12
   0x00007fffeb5aeb92 <+1042>:  pop    %r13
   0x00007fffeb5aeb94 <+1044>:  pop    %r14
   0x00007fffeb5aeb96 <+1046>:  pop    %r15
   0x00007fffeb5aeb98 <+1048>:  ret    
   0x00007fffeb5aeb99 <+1049>:  nopl   0x0(%rax)
   0x00007fffeb5aeba0 <+1056>:  lea    (%rcx,%r12,4),%rax
   0x00007fffeb5aeba4 <+1060>:  cmp    %rax,%rdx
   0x00007fffeb5aeba7 <+1063>:  je     0x7fffeb5aea21 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+673>
   0x00007fffeb5aebad <+1069>:  mov    %rax,0x50(%rbx)
   0x00007fffeb5aebb1 <+1073>:  jmp    0x7fffeb5aea21 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+673>
   0x00007fffeb5aebb6 <+1078>:  xor    %ebp,%ebp
   0x00007fffeb5aebb8 <+1080>:  test   %r8,%r8
   0x00007fffeb5aebbb <+1083>:  je     0x7fffeb5ae8be <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+318>
   0x00007fffeb5aebc1 <+1089>:  nopl   0x0(%rax)
   0x00007fffeb5aebc8 <+1096>:  lea    0x0(%r13,%r12,4),%rax
   0x00007fffeb5aebcd <+1101>:  cmp    %r15,%rax
   0x00007fffeb5aebd0 <+1104>:  je     0x7fffeb5ae855 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+213>
   0x00007fffeb5aebd6 <+1110>:  mov    %rax,0x38(%rbx)
   0x00007fffeb5aebda <+1114>:  mov    %rax,%r15
   0x00007fffeb5aebdd <+1117>:  jmp    0x7fffeb5ae855 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+213>
   0x00007fffeb5aebe2 <+1122>:  nopw   0x0(%rax,%rax,1)
   0x00007fffeb5aebe8 <+1128>:  cmp    %ecx,%ebp
   0x00007fffeb5aebea <+1130>:  jb     0x7fffeb5ae9b7 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+567>
   0x00007fffeb5aebf0 <+1136>:  add    $0x4,%r14
   0x00007fffeb5aebf4 <+1140>:  mov    %ebp,(%rax)
   0x00007fffeb5aebf6 <+1142>:  cmp    %r15,%r14
   0x00007fffeb5aebf9 <+1145>:  jne    0x7fffeb5ae930 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+432>
   0x00007fffeb5aebff <+1151>:  mov    0x18(%rsp),%ebp
   0x00007fffeb5aec03 <+1155>:  jmp    0x7fffeb5aea00 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+640>
   0x00007fffeb5aec08 <+1160>:  nopl   0x0(%rax,%rax,1)
   0x00007fffeb5aec10 <+1168>:  mov    0x4(%r11),%r10d
   0x00007fffeb5aec14 <+1172>:  mov    %r8d,%edi
   0x00007fffeb5aec17 <+1175>:  mov    $0x4,%r9d
   0x00007fffeb5aec1d <+1181>:  jmp    0x7fffeb5ae990 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+528>
   0x00007fffeb5aec22 <+1186>:  nopw   0x0(%rax,%rax,1)
   0x00007fffeb5aec28 <+1192>:  mov    0x4(%rdx),%edx
   0x00007fffeb5aec2b <+1195>:  mov    $0x4,%r10d
   0x00007fffeb5aec31 <+1201>:  mov    %rdx,%r9
   0x00007fffeb5aec34 <+1204>:  popcnt %rdx,%rdx
   0x00007fffeb5aec39 <+1209>:  cmp    $0x1,%edx
   0x00007fffeb5aec3c <+1212>:  mov    %esi,%edx
   0x00007fffeb5aec3e <+1214>:  setle  %dil
   0x00007fffeb5aec42 <+1218>:  jmp    0x7fffeb5ae98b <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+523>
   0x00007fffeb5aec47 <+1223>:  nopw   0x0(%rax,%rax,1)
   0x00007fffeb5aec50 <+1232>:  mov    0x10(%rbx),%edi
   0x00007fffeb5aec53 <+1235>:  lea    -0x1(%rdi),%edx
   0x00007fffeb5aec56 <+1238>:  and    %ecx,%edx
   0x00007fffeb5aec58 <+1240>:  add    %edx,%eax
   0x00007fffeb5aec5a <+1242>:  jmp    0x7fffeb5aeb23 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+931>
   0x00007fffeb5aec5f <+1247>:  nop
   0x00007fffeb5aec60 <+1248>:  mov    %r12,%rsi
   0x00007fffeb5aec63 <+1251>:  lea    0x60(%rbx),%rdi
   0x00007fffeb5aec67 <+1255>:  sub    %rax,%rsi
   0x00007fffeb5aec6a <+1258>:  call   0x7fffeab039e0 <_ZNSt6vectorIjSaIjEE17_M_default_appendEm@plt>
   0x00007fffeb5aec6f <+1263>:  jmp    0x7fffeb5aea78 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+760>
   0x00007fffeb5aec74 <+1268>:  nopl   0x0(%rax)
   0x00007fffeb5aec78 <+1272>:  mov    %r12,%rsi
   0x00007fffeb5aec7b <+1275>:  lea    0x48(%rbx),%rdi
   0x00007fffeb5aec7f <+1279>:  sub    %rax,%rsi
   0x00007fffeb5aec82 <+1282>:  call   0x7fffeab039e0 <_ZNSt6vectorIjSaIjEE17_M_default_appendEm@plt>
   0x00007fffeb5aec87 <+1287>:  jmp    0x7fffeb5aea21 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+673>
   0x00007fffeb5aec8c <+1292>:  nopl   0x0(%rax)
   0x00007fffeb5aec90 <+1296>:  sar    $0x3,%rsi
   0x00007fffeb5aec94 <+1300>:  sub    %rsi,%rdi
   0x00007fffeb5aec97 <+1303>:  mov    %rdi,%r8
   0x00007fffeb5aec9a <+1306>:  lea    0x18(%rbx),%rdi
   0x00007fffeb5aec9e <+1310>:  mov    %r8,%rsi
   0x00007fffeb5aeca1 <+1313>:  call   0x7fffeab03080 <_ZNSt6vectorIN5arrow7compute17KeyColumnMetadataESaIS2_EE17_M_default_appendEm@plt>
   0x00007fffeb5aeca6 <+1318>:  mov    0x8(%rsp),%rax
   0x00007fffeb5aecab <+1323>:  mov    (%rax),%rcx
   0x00007fffeb5aecae <+1326>:  jmp    0x7fffeb5ae7ed <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+109>
   0x00007fffeb5aecb3 <+1331>:  nopl   0x0(%rax,%rax,1)
   0x00007fffeb5aecb8 <+1336>:  mov    %r12,%rsi
   0x00007fffeb5aecbb <+1339>:  lea    0x30(%rbx),%rdi
   0x00007fffeb5aecbf <+1343>:  sub    %r8,%rsi
   0x00007fffeb5aecc2 <+1346>:  call   0x7fffeab039e0 <_ZNSt6vectorIjSaIjEE17_M_default_appendEm@plt>
   0x00007fffeb5aecc7 <+1351>:  mov    0x30(%rbx),%r13
   0x00007fffeb5aeccb <+1355>:  mov    0x38(%rbx),%r15
   0x00007fffeb5aeccf <+1359>:  jmp    0x7fffeb5ae855 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+213>
   0x00007fffeb5aecd4 <+1364>:  xor    %eax,%eax
   0x00007fffeb5aecd6 <+1366>:  jmp    0x7fffeb5ae8b0 <_ZN5arrow7compute16RowTableMetadata24FromColumnMetadataVectorERKSt6vectorINS0_17KeyColumnMetadataESaIS3_EEii+304>
End of assembler dump.
(gdb) q
A debugging session is active.

    Inferior 1 [process 30365] will be killed.

Quit anyway? (y or n) y
(base) tdhock@tdhock-MacBook:/tmp/Rtmp8icqQo/downloaded_packages$ cat /proc/cpuinfo 
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Core(TM)2 Duo CPU     P7350  @ 2.00GHz
stepping    : 6
microcode   : 0x60f
cpu MHz     : 1591.880
cache size  : 3072 KB
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 10
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm pti tpr_shadow vnmi flexpriority vpid dtherm
vmx flags   : vnmi flexpriority tsc_offset vtpr vapic
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips    : 3979.70
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 23
model name  : Intel(R) Core(TM)2 Duo CPU     P7350  @ 2.00GHz
stepping    : 6
microcode   : 0x60f
cpu MHz     : 1591.879
cache size  : 3072 KB
physical id : 0
siblings    : 2
core id     : 1
cpu cores   : 2
apicid      : 1
initial apicid  : 1
fpu     : yes
fpu_exception   : yes
cpuid level : 10
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm pti tpr_shadow vnmi flexpriority vpid dtherm
vmx flags   : vnmi flexpriority tsc_offset vtpr vapic
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips    : 3979.70
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

(base) tdhock@tdhock-MacBook:/tmp/Rtmp8icqQo/downloaded_packages$ 
thisisnic commented 1 year ago

I'm a little surprised that this would occur in R/Linux. I thought that R/Linux always built Arrow from source (and the compiler should have realized and avoided the popcnt instruction). Do you know how r-arrow is getting installed? Is it getting a prebuilt binary from somewhere or building a new library?

With R on Linux, we build from source, unless an environment variable like NOT_CRAN is set, in which case we download a binary. https://arrow.apache.org/docs/r/articles/install.html

tdhock commented 1 year ago

I see. So this is probably not a bug in GCC, because I was only compiling the R bindings, not libarrow itself, right? In that case I think the solution would be to build the Arrow C++ library from source with the -march=core2 flag. Can you please give advice about how to do that? I looked for instructions on how to add that flag to the g++ command lines used for building the libarrow C++ library, but I was not able to find any. Am I looking in the wrong place? https://arrow.apache.org/docs/developers/cpp/building.html and https://arrow.apache.org/docs/r/articles/install.html#install-release-version-less-easy

westonpace commented 1 year ago

Here is a minimal reproducible example. You can experiment with the various flags. I agree that the -march flag may play a role here. Unfortunately, I don't have a machine like this to experiment with.

int main() {
  return __builtin_popcount(17) == 2 ? 0 : 1;
}

If this works on your machine then there is probably more investigation for us to do on the Arrow side. If it fails then it is a gcc bug in this case.

westonpace commented 1 year ago

Actually, maybe that example isn't sufficient. It doesn't seem to generate a popcnt instruction, even on my system with -march=native when I know my CPU supports popcnt. Perhaps it's compiling it out. I'll try some more experiments.

tdhock commented 1 year ago

Well, I tried building libarrow from source, then building the R package from source (linking against my newly built libarrow), but I get the same segfault. I think there is an issue with the C++ build, which gives me the following output:

(arrow) tdhock@maude-MacBookPro:~/arrow-git/cpp/build(main)$ CC=$HOME/bin/gcc CXX=$HOME/bin/g++ cmake .. --preset ninja-debug-basic -DCMAKE_INSTALL_PREFIX=$HOME -DARROW_CXXFLAGS=-march=core2 -DARROW_PARQUET=ON 
Preset CMake variables:

  ARROW_BUILD_INTEGRATION="ON"
  ARROW_BUILD_STATIC="OFF"
  ARROW_BUILD_TESTS="ON"
  ARROW_COMPUTE="ON"
  ARROW_CSV="ON"
  ARROW_DATASET="ON"
  ARROW_EXTRA_ERROR_CONTEXT="ON"
  ARROW_FILESYSTEM="ON"
  ARROW_JSON="ON"
  ARROW_WITH_RE2="OFF"
  ARROW_WITH_UTF8PROC="OFF"
  CMAKE_BUILD_TYPE="Debug"

-- Building using CMake version: 3.22.1
-- Arrow version: 13.0.0 (full: '13.0.0-SNAPSHOT')
-- Arrow SO version: 1300 (full: 1300.0.0)
...
-- CMAKE_C_FLAGS:   -Wall -Wno-conversion -Wno-sign-conversion -Wunused-result -fno-semantic-interposition -msse4.2 -march=core2
-- CMAKE_CXX_FLAGS:  -Wno-noexcept-type  -fdiagnostics-color=always  -Wall -Wno-conversion -Wno-sign-conversion -Wunused-result -fno-semantic-interposition -msse4.2 -march=core2
-- CMAKE_C_FLAGS_DEBUG: -g -Werror -O0 -ggdb
-- CMAKE_CXX_FLAGS_DEBUG: -g -Werror -O0 -ggdb
-- ---------------------------------------------------------------------
-- Arrow version:                                 13.0.0-SNAPSHOT
-- 
-- Build configuration summary:
--   Generator: Ninja
--   Build type: DEBUG
--   Source directory: /home/tdhock/arrow-git/cpp
--   Install prefix: /home/tdhock
-- 
-- Compile and link options:
-- 
--   ARROW_CXXFLAGS=-march=core2 [default=""]
--       Compiler flags to append when compiling Arrow
...

Note above that I used -DARROW_CXXFLAGS=-march=core2 on the command line to tell it to compile for my Core 2 CPU, but there is an additional flag, -msse4.2, in CMAKE_CXX_FLAGS that appears by default and seems to be incorrect. The GCC man page excerpt below shows that SSE4.2 is not supported on my Core 2 CPU; it was introduced on the next generation of CPUs (nehalem):

-march=cpu-type
...
core2
   Intel Core 2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3,
   SSSE3, CX16, SAHF and FXSR instruction set support.

nehalem
   Intel Nehalem CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3,
   SSSE3, SSE4.1, SSE4.2, POPCNT, CX16, SAHF and FXSR instruction
   set support.

There is documentation about the -msse4.2 flag being the default (https://github.com/apache/arrow/blob/1624d5aaf4f524487079066dc730176d82b986f5/docs/source/cpp/env_vars.rst) and about how to disable that flag by setting the CMake variable ARROW_SIMD_LEVEL=NONE, so I will try that. However, it seems like this should be easy to detect in your CMake config at build time, for example by running lscpu | grep sse4_2. I would have expected a warning or error such as "your CPU does not support sse4.2, but the compiler is using the -msse4.2 flag, so the compiled libarrow binaries will not work on your CPU. If you want to run libarrow on your CPU, you need to disable this flag by setting the cmake variable ARROW_SIMD_LEVEL=NONE" or similar. This is pretty much what is explained in the docs, but they are not so easy to find; it would be much more user-friendly to output a warning or error like this at build time.

tdhock commented 1 year ago

write_dataset works (no segfault) if I build libarrow from source with -DARROW_SIMD_LEVEL=NONE:

(arrow) tdhock@maude-MacBookPro:~/arrow-git/cpp/build(main*)$ CC=$HOME/bin/gcc CXX=$HOME/bin/g++ cmake .. --preset ninja-debug-basic -DCMAKE_INSTALL_PREFIX=$HOME -DARROW_CXXFLAGS=-march=core2 -DARROW_PARQUET=ON -DARROW_SIMD_LEVEL=NONE
Preset CMake variables:

  ARROW_BUILD_INTEGRATION="ON"
  ARROW_BUILD_STATIC="OFF"
  ARROW_BUILD_TESTS="ON"
  ARROW_COMPUTE="ON"
  ARROW_CSV="ON"
  ARROW_DATASET="ON"
  ARROW_EXTRA_ERROR_CONTEXT="ON"
  ARROW_FILESYSTEM="ON"
  ARROW_JSON="ON"
  ARROW_WITH_RE2="OFF"
  ARROW_WITH_UTF8PROC="OFF"
  CMAKE_BUILD_TYPE="Debug"

-- Building using CMake version: 3.22.1
-- Arrow version: 13.0.0 (full: '13.0.0-SNAPSHOT')
-- Arrow SO version: 1300 (full: 1300.0.0)
...
-- CMAKE_C_FLAGS:   -Wall -Wno-conversion -Wno-sign-conversion -Wunused-result -fno-semantic-interposition -march=core2
-- CMAKE_CXX_FLAGS:  -Wno-noexcept-type  -fdiagnostics-color=always  -Wall -Wno-conversion -Wno-sign-conversion -Wunused-result -fno-semantic-interposition -march=core2
...
-- Compile and link options:
-- 
--   ARROW_CXXFLAGS=-march=core2 [default=""]
--       Compiler flags to append when compiling Arrow
...
--   ARROW_SIMD_LEVEL=NONE [default=NONE|SSE4_2|AVX2|AVX512|NEON|SVE|SVE128|SVE256|SVE512|DEFAULT]
--       Compile-time SIMD optimization level
...
-- Build files have been written to: /home/tdhock/arrow-git/cpp/build
(arrow) tdhock@maude-MacBookPro:~/arrow-git/cpp/build(main*)$ cmake --build . --target clean 
[0/1] Re-running CMake...
-- Building using CMake version: 3.22.1
-- Arrow version: 13.0.0 (full: '13.0.0-SNAPSHOT')
-- Arrow SO version: 1300 (full: 1300.0.0)
...
-- Build files have been written to: /home/tdhock/arrow-git/cpp/build
[1/1] Cleaning all built files...
Cleaning... 653 files.
(arrow) tdhock@maude-MacBookPro:~/arrow-git/cpp/build(main*)$ cmake --build . 
[1/642] Creating directories for 'jemalloc_ep'
[2/642] Creating directories for 'googletest_ep'
[3/642] Performing download step (download, verify and extract) for 'googletest_ep'
[4/642] No update step for 'googletest_ep'
[5/642] No patch step for 'googletest_ep'
[6/642] Performing download step (download, verify and extract) for 'jemalloc_ep'
[7/642] No update step for 'jemalloc_ep'
[8/642] Performing patch step for 'jemalloc_ep'
[9/642] Performing configure step for 'googletest_ep'
[10/642] Performing build step for 'googletest_ep'
[11/642] Performing install step for 'googletest_ep'
[12/642] Completed 'googletest_ep'
[13/642] Performing configure step for 'jemalloc_ep'
[14/642] Performing build step for 'jemalloc_ep'
[15/642] Performing install step for 'jemalloc_ep'
[16/642] Completed 'jemalloc_ep'
[17/642] Building CXX object src/arrow/CMakeFiles/arrow_objlib.dir/array/array_binary.cc.o
...
[641/642] Building CXX object src/parquet/CMakeFiles/parquet-arrow-test.dir/arrow/arrow_reader_writer_test.cc.o
[642/642] Linking CXX executable debug/parquet-arrow-test
(arrow) tdhock@maude-MacBookPro:~/arrow-git/cpp/build(main*)$ cmake --install . 
-- Install configuration: "DEBUG"
-- Up-to-date: /home/tdhock/lib/cmake/Arrow/FindThriftAlt.cmake
-- Installing: /home/tdhock/include/arrow/util/config.h
...
-- Installing: /home/tdhock/lib/libparquet.so.1300.0.0
-- Up-to-date: /home/tdhock/lib/libparquet.so.1300
...
-- Up-to-date: /home/tdhock/include/parquet/encryption/two_level_cache_with_expiration.h
(arrow) tdhock@maude-MacBookPro:~/arrow-git/cpp/build(main*)$ ARROW_PARQUET=true ARROW_R_WITH_PARQUET=true ARROW_DEPENDENCY_SOURCE=SYSTEM ARROW_R_DEV=true LIBARROW_BINARY=false PKG_CONFIG_PATH=$HOME/lib/pkgconfig:$CONDA_PREFIX/lib/pkgconfig R CMD INSTALL ../../r
Loading required package: grDevices
* installing to library ‘/home/tdhock/lib/R/library’
* installing *source* package ‘arrow’ ...
...
** testing if installed package can be loaded from final location
Loading required package: grDevices
** testing if installed package keeps a record of temporary installation path
* DONE (arrow)
(arrow) tdhock@maude-MacBookPro:~/arrow-git/cpp/build(main*)$ R --vanilla -e 'example("write_dataset",package="arrow")'

R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> example("write_dataset",package="arrow")
Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

wrt_dt> ## Don't show: 
wrt_dt> if (arrow_with_dataset() & arrow_with_parquet() & requireNamespace("dplyr", quietly = TRUE)) (if (getRversion() >= "3.4") withAutoprint else force)({ # examplesIf
wrt_dt+ ## End(Don't show)
wrt_dt+ # You can write datasets partitioned by the values in a column (here: "cyl").
wrt_dt+ # This creates a structure of the form cyl=X/part-Z.parquet.
wrt_dt+ one_level_tree <- tempfile()
wrt_dt+ write_dataset(mtcars, one_level_tree, partitioning = "cyl")
wrt_dt+ list.files(one_level_tree, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # You can also partition by the values in multiple columns
wrt_dt+ # (here: "cyl" and "gear").
wrt_dt+ # This creates a structure of the form cyl=X/gear=Y/part-Z.parquet.
wrt_dt+ two_levels_tree <- tempfile()
wrt_dt+ write_dataset(mtcars, two_levels_tree, partitioning = c("cyl", "gear"))
wrt_dt+ list.files(two_levels_tree, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # In the two previous examples we would have:
wrt_dt+ # X = {4,6,8}, the number of cylinders.
wrt_dt+ # Y = {3,4,5}, the number of forward gears.
wrt_dt+ # Z = {0,1,2}, the number of saved parts, starting from 0.
wrt_dt+ 
wrt_dt+ # You can obtain the same result as as the previous examples using arrow with
wrt_dt+ # a dplyr pipeline. This will be the same as two_levels_tree above, but the
wrt_dt+ # output directory will be different.
wrt_dt+ library(dplyr)
wrt_dt+ two_levels_tree_2 <- tempfile()
wrt_dt+ mtcars %>%
wrt_dt+   group_by(cyl, gear) %>%
wrt_dt+   write_dataset(two_levels_tree_2)
wrt_dt+ list.files(two_levels_tree_2, recursive = TRUE)
wrt_dt+ 
wrt_dt+ # And you can also turn off the Hive-style directory naming where the column
wrt_dt+ # name is included with the values by using `hive_style = FALSE`.
wrt_dt+ 
wrt_dt+ # Write a structure X/Y/part-Z.parquet.
wrt_dt+ two_levels_tree_no_hive <- tempfile()
wrt_dt+ mtcars %>%
wrt_dt+   group_by(cyl, gear) %>%
wrt_dt+   write_dataset(two_levels_tree_no_hive, hive_style = FALSE)
wrt_dt+ list.files(two_levels_tree_no_hive, recursive = TRUE)
wrt_dt+ ## Don't show: 
wrt_dt+ }) # examplesIf
> one_level_tree <- tempfile()
> write_dataset(mtcars, one_level_tree, partitioning = "cyl")
> list.files(one_level_tree, recursive = TRUE)
[1] "cyl=4/part-0.parquet" "cyl=6/part-0.parquet" "cyl=8/part-0.parquet"
> two_levels_tree <- tempfile()
> write_dataset(mtcars, two_levels_tree, partitioning = c("cyl", "gear"))
> list.files(two_levels_tree, recursive = TRUE)
[1] "cyl=4/gear=3/part-0.parquet" "cyl=4/gear=4/part-0.parquet"
[3] "cyl=4/gear=5/part-0.parquet" "cyl=6/gear=3/part-0.parquet"
[5] "cyl=6/gear=4/part-0.parquet" "cyl=6/gear=5/part-0.parquet"
[7] "cyl=8/gear=3/part-0.parquet" "cyl=8/gear=5/part-0.parquet"
> library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

> two_levels_tree_2 <- tempfile()
> mtcars %>% group_by(cyl, gear) %>% write_dataset(two_levels_tree_2)
> list.files(two_levels_tree_2, recursive = TRUE)
[1] "cyl=4/gear=3/part-0.parquet" "cyl=4/gear=4/part-0.parquet"
[3] "cyl=4/gear=5/part-0.parquet" "cyl=6/gear=3/part-0.parquet"
[5] "cyl=6/gear=4/part-0.parquet" "cyl=6/gear=5/part-0.parquet"
[7] "cyl=8/gear=3/part-0.parquet" "cyl=8/gear=5/part-0.parquet"
> two_levels_tree_no_hive <- tempfile()
> mtcars %>% group_by(cyl, gear) %>% write_dataset(two_levels_tree_no_hive, 
+     hive_style = FALSE)
> list.files(two_levels_tree_no_hive, recursive = TRUE)
[1] "4/3/part-0.parquet" "4/4/part-0.parquet" "4/5/part-0.parquet"
[4] "6/3/part-0.parquet" "6/4/part-0.parquet" "6/5/part-0.parquet"
[7] "8/3/part-0.parquet" "8/5/part-0.parquet"

wrt_dt> ## End(Don't show)
wrt_dt> 
wrt_dt> 
wrt_dt> 
> 
> 
(arrow) tdhock@maude-MacBookPro:~/arrow-git/cpp/build(main*)$ 
westonpace commented 1 year ago

> write_dataset works (no segfault) if I build libarrow from source,

That is exciting. Do you have any suggestions on how we might be able to improve our packaging?

I suppose someone could use CMake's try_run to automatically set ARROW_SIMD_LEVEL to NONE if the build machine's CPU cannot run a small program that compiles to a real popcnt instruction.

tdhock commented 1 year ago

Yes, good idea; that is probably more robust than grepping lscpu.