duckdb / duckdb-r

The duckdb R package
https://r.duckdb.org/
Other
126 stars 24 forks source link

Cannot install extensions postgres_scanner or spatial in MS Windows #23

Closed elysabethpc closed 7 months ago

elysabethpc commented 1 year ago

Hello,

I cannot install some extension in R 4.3.1 in windows. I use duckdb duckdb_0.8.1-3 DBI_1.1.3 . I also tested v0.9.0 but in that version more extensions failed to install.

In v0.8.1 the extensions icu, json, httpfs and fts work ok, but postgres_scanner and spatial fail to install. It looks like them don't exists in the repository for platform windows_amd64_rtools. I tried to install manually from repository for platform windows_amd64 (without _rtools sufix) but they made R Session to crash in RStudio when I load them.

I test same thing in linux amd64 and arm64 and work ok with all extension in version 0.8.1 and 0.9.0

> dbExecute(con,"install postgres_scanner;")
Error: rapi_execute: Failed to run query
Error: HTTP Error: Failed to download extension "postgres_scanner" at URL "http://extensions.duckdb.org/v0.8.1/windows_amd64_rtools/postgres_scanner.duckdb_extension.gz"
Extension "postgres_scanner" is an existing extension.

Are you using a development build? In this case, extensions might not (yet) be uploaded.
> dbExecute(con,"load postgres_scanner;")
Error: rapi_execute: Failed to run query
Error: HTTP Error: Failed to download extension "postgres_scanner" at URL "http://extensions.duckdb.org/v0.8.1/windows_amd64_rtools/postgres_scanner.duckdb_extension.gz"
Extension "postgres_scanner" is an existing extension.

Are you using a development build? In this case, extensions might not (yet) be uploaded.
> dbGetQuery(con,"from duckdb_extensions();") |> as_tibble() |> select(extension_name,loaded,installed)
# A tibble: 14 × 3
   extension_name   loaded installed
   <chr>            <lgl>  <lgl>    
 1 autocomplete     FALSE  FALSE    
 2 fts              TRUE   TRUE     
 3 httpfs           FALSE  TRUE     
 4 icu              TRUE   TRUE     
 5 inet             FALSE  FALSE    
 6 jemalloc         FALSE  FALSE    
 7 json             TRUE   TRUE     
 8 motherduck       FALSE  FALSE    
 9 parquet          TRUE   TRUE     
10 postgres_scanner FALSE  FALSE    
11 spatial          FALSE  FALSE    
12 sqlite_scanner   FALSE  FALSE    
13 tpcds            FALSE  FALSE    
14 tpch             FALSE  FALSE    
> sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Spain.utf8  LC_CTYPE=Spanish_Spain.utf8   
[3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C                  
[5] LC_TIME=Spanish_Spain.utf8    

time zone: Europe/Madrid
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.9.2 forcats_1.0.0   stringr_1.5.0   dplyr_1.1.3     purrr_1.0.2    
 [6] readr_2.1.4     tidyr_1.3.0     tibble_3.2.1    ggplot2_3.4.3   tidyverse_2.0.0
[11] duckdb_0.8.1-3  DBI_1.1.3      

loaded via a namespace (and not attached):
 [1] tcltk_4.3.1          writexl_1.4.2        rlang_1.1.1          magrittr_2.0.3      
 [5] shinydashboard_0.7.2 snakecase_0.11.1     compiler_4.3.1       vctrs_0.6.3         
 [9] pkgconfig_2.0.3      shape_1.4.6          fastmap_1.1.1        backports_1.4.1     
[13] ellipsis_0.3.2       utf8_1.2.3           shinyjqui_0.4.1      promises_1.2.1      
[17] rmarkdown_2.25       tzdb_0.4.0           nloptr_2.0.3         missMDA_1.18        
[21] bit_4.0.5            xfun_0.40            glmnet_4.1-8         jomo_2.7-6          
[25] jsonlite_1.8.7       flashClust_1.01-2    later_1.3.1          pan_1.9             
[29] broom_1.0.5          parallel_4.3.1       cluster_2.1.4        R6_2.5.1            
[33] FactoInvestigate_1.8 stringi_1.7.12       boot_1.3-28.1        rpart_4.1.19        
[37] estimability_1.4.1   Rcpp_1.0.11          assertthat_0.2.1     iterators_1.0.14    
[41] knitr_1.44           base64enc_0.1-3      httpuv_1.6.11        Matrix_1.6-1.1      
[45] splines_4.3.1        nnet_7.3-19          timechange_0.2.0     tidyselect_1.2.0    
[49] rstudioapi_0.15.0    yaml_2.3.7           miniUI_0.1.1.1       doParallel_1.0.17   
[53] codetools_0.2-19     lattice_0.21-8       withr_2.5.0          shiny_1.7.5         
[57] evaluate_0.21        survival_3.5-7       pillar_1.9.0         mice_3.16.0         
[61] corrplot_0.92        DT_0.29              foreach_1.5.2        generics_0.1.3      
[65] hms_1.1.3            munsell_0.5.0        scales_1.2.1         minqa_1.2.6         
[69] xtable_1.8-4         leaps_3.1            glue_1.6.2           janitor_2.2.0       
[73] emmeans_1.8.8        scatterplot3d_0.3-44 tools_4.3.1          lme4_1.1-34         
[77] colourpicker_1.3.0   mvtnorm_1.2-3        grid_4.3.1           colorspace_2.1-0    
[81] nlme_3.1-163         repr_1.1.6           cli_3.6.1            fansi_1.0.4         
[85] arrow_13.0.0.1       gtable_0.3.4         digest_0.6.33        ggrepel_0.9.3       
[89] FactoMineR_2.8       htmlwidgets_1.6.2    skimr_2.1.5          htmltools_0.5.6     
[93] Factoshiny_2.4       lifecycle_1.0.3      multcompView_0.1-9   mitml_0.4-5         
[97] mime_0.12            bit64_4.0.5          MASS_7.3-60         
> list.files("xxx\\AppData\\Roaming\\R\\data\\R\\duckdb\\.duckdb\\extensions\\v0.8.1\\windows_amd64_rtools\\")
[1] "fts.duckdb_extension"    "httpfs.duckdb_extension" "icu.duckdb_extension"   
[4] "json.duckdb_extension" 

Thank you

Tmonster commented 1 year ago

Hi elysabethpc,

There are currently some known issues with installing extensions on windows. We're working on it! I think https://github.com/duckdb/duckdb/issues/8243 is probably related as well

krlmlr commented 11 months ago

Is this still an issue?

arthurgailes commented 11 months ago

Is this still an issue?

Yes.

dpprdan commented 7 months ago

The spatial extension is installable with the latest dev version (r-universe), postgres not yet.

library(duckdb)
#> Loading required package: DBI

ddb_con <- dbConnect(duckdb())

dbExecute(ddb_con, "INSTALL spatial; LOAD spatial;")
#> [1] 0
dbExecute(ddb_con, "INSTALL postgres; LOAD postgres;")
#> Error: rapi_prepare: Failed to execute statement INSTALL postgres; LOAD postgres;
#> Error: HTTP Error: Failed to download extension "postgres_scanner" at URL "http://extensions.duckdb.org/v0.10.0/windows_amd64_rtools/postgres_scanner.duckdb_extension.gz"
#> Extension "postgres_scanner" is an existing extension.

dbDisconnect(ddb_con)
Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.3.3 (2024-02-29 ucrt) #> os Windows 11 x64 (build 22631) #> system x86_64, mingw32 #> ui RTerm #> language en #> collate German_Germany.utf8 #> ctype German_Germany.utf8 #> tz Europe/Berlin #> date 2024-03-08 #> pandoc 3.1.12.2 @ C:/PROGRA~1/Pandoc/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> cli 3.6.2 2023-12-11 [1] CRAN (R 4.3.2) #> DBI * 1.2.2 2024-02-16 [1] CRAN (R 4.3.2) #> digest 0.6.34 2024-01-11 [1] CRAN (R 4.3.2) #> duckdb * 0.9.99.9000 2024-03-07 [1] https://duckdb.r-universe.dev (R 4.3.3) #> evaluate 0.23 2023-11-01 [1] CRAN (R 4.3.2) #> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0) #> fs 1.6.3 2023-07-20 [1] CRAN (R 4.3.1) #> glue 1.7.0 2024-01-09 [1] CRAN (R 4.3.2) #> htmltools 0.5.7 2023-11-03 [1] CRAN (R 4.3.2) #> knitr 1.45 2023-10-30 [1] CRAN (R 4.3.1) #> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.3.2) #> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0) #> purrr 1.0.2 2023-08-10 [1] CRAN (R 4.3.1) #> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.3.0) #> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.3.0) #> R.oo 1.26.0 2024-01-24 [1] CRAN (R 4.3.2) #> R.utils 2.12.3 2023-11-18 [1] CRAN (R 4.3.2) #> reprex 2.1.0 2024-01-11 [1] CRAN (R 4.3.2) #> rlang 1.1.3 2024-01-10 [1] CRAN (R 4.3.2) #> rmarkdown 2.26 2024-03-05 [1] CRAN (R 4.3.3) #> rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.1) #> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0) #> styler 1.10.2 2023-08-29 [1] CRAN (R 4.3.1) #> vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.3.2) #> withr 3.0.0 2024-01-16 [1] CRAN (R 4.3.2) #> xfun 0.42 2024-02-08 [1] CRAN (R 4.3.2) #> yaml 2.3.8 2023-12-11 [1] CRAN (R 4.3.2) #> #> [1] C:/Users/Daniel.AK-HAMBURG/AppData/Local/R/win-library/4.3 #> [2] C:/Program Files/R/R-4.3.3/library #> #> ────────────────────────────────────────────────────────────────────────────── ```

Apparently the extensions have to be built with the msys2/mingw flavor that R uses.

elysabethpc commented 7 months ago

I got postgres_scanner build and run in Windows with rtools 4.3 doing this:

set PATH=C:\rtools43\x86_64-w64-mingw32.static.posix\bin;C:\rtools43\usr\bin;%PATH%
git clone https://github.com/duckdb/postgres_scanner.git
cd postgres_scanner\
git submodule init
git pull --recurse-submodules
cd duckdb
git checkout tags/v0.10.0
cd ..
git checkout 2eb532b17cdef1b
git apply ..\postgres_scanner_0.10.0_rtools_4.3.patch

# this will fail
make

patch -p1 < ..\win32_port.patch 

# this should be ok
make

strip build/release/extension/postgres_scanner/postgres_scanner.duckdb_extension
mkdir C:\Users\user\AppData\Roaming\R\data\R\duckdb\extensions\v0.10.0\windows_amd64_rtools
cp build\release\extension\postgres_scanner\postgres_scanner.duckdb_extension C:\Users\user\AppData\Roaming\R\data\R\duckdb\extensions\v0.10.0\windows_amd64_rtools

in postgres

# in WSL postgres server
sudo -u postgres psql
postgres=# create database basedatos owner dev;
\q
psql -U dev -d basedatos
basedatos=> create table tabla(col1 int, col2 text, col3 timestamp with time zone);
basedatos=> insert into tabla values(5,'prueba', '2024-03-17 09:50:50' AT TIME ZONE 'Europe/Madrid');
\q

Test

"c:\Program Files\R\R-4.3.3\bin\Rscript.exe" ..\postgres_scanner_test.R

in R:

library(DBI)
library(duckdb)
library(tidyverse)

con <- dbConnect(duckdb::duckdb(config=list('allow_unsigned_extensions'='true')))

dbExecute(con,"load postgres_scanner;")
dbGetQuery(con,"from duckdb_extensions();") |> tibble() |>
  filter(loaded==TRUE)

dbGetQuery(con,"select * from postgres_scan('dbname=basedatos user=dev password=password host=127.0.0.1', 'public', 'tabla');") |> glimpse()
dbExecute(con,"create schema abc;")
dbExecute(con,"CALL postgres_attach('dbname=basedatos user=dev password=password host=127.0.0.1', source_schema='public', sink_schema='abc');")

tbl(con,sql("SELECT * from information_schema.tables"))
tbl(con,"abc.tabla")

dbExecute(con,"install icu;")
dbExecute(con,"install json;")
dbExecute(con,"install fts;")
dbExecute(con,"install httpfs;")
dbExecute(con,"install spatial;")
dbExecute(con,"load icu;")
dbExecute(con,"load json;")
dbExecute(con,"load fts;")
dbExecute(con,"load httpfs;")
dbExecute(con,"load spatial;")

dbGetQuery(con,"from duckdb_extensions();") |> tibble() |>
  filter(loaded==TRUE)

# tz lost
tbl(con,"abc.tabla") |> pull(col3)

tbl(con,"abc.tabla") |>
  collect() |> 
  mutate(col3=with_tz(col3,"Europe/Madrid")) |> 
  pull(col3)

postgres_scanner_0.10.0_rtools_4.3.patch postgres_scanner_test.txt win32_port.patch

krlmlr commented 7 months ago

Thanks! I filed an issue upstream.

clickleasedbarney commented 3 months ago

elysabethpc:

by chance do you have an update to the 3/17/24 comment above? i really need to get postgres_scanner working with duckdb in R but am using all the latest versions of duckdb and R, and cannot see how to get the patches to work...if you can guide or have examples of how to update thru the versions it would be much appreciated. Thanks.