duckdb / duckdb_spatial

MIT License

Spatial extension not available for Windows? #158

Closed cboettig closed 7 months ago

cboettig commented 1 year ago

Apologies if I'm reporting this in the wrong place or it is already documented, but it appears the extension is not available for Windows (at least when attempting to install from R)?

INSTALL "spatial" is failing in my tests on Windows platforms while passing on all other platforms. Specifically, it tries to download from http://extensions.duckdb.org/v0.9.1/windows_amd64_rtools/spatial.duckdb_extension.gz which is not reachable. See logs: https://github.com/cboettig/duckdbfs/actions/runs/6647184214/job/18062119997#step:6:181

Maxxen commented 1 year ago

Hi! Thanks for reporting this issue!

The extension is available for Windows here: http://extensions.duckdb.org/v0.9.1/windows_amd64/spatial.duckdb_extension.gz

However, we don't currently build "out-of-tree" extensions (the extensions that don't reside in the main duckdb repository) for the R client, which iirc requires some special compilation magic.
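For readers comparing the two download locations, here is a minimal Python sketch of the URL layout as it appears in the URLs quoted in this thread. The function name `extension_url` and its parameters are hypothetical, and the `{version}/{platform}/{name}` pattern is inferred from those URLs only, not taken from DuckDB's source.

```python
def extension_url(version: str, platform: str, name: str,
                  base: str = "http://extensions.duckdb.org") -> str:
    """Build a download URL using the layout inferred from this thread."""
    return f"{base}/{version}/{platform}/{name}.duckdb_extension.gz"


# The two Windows flavours discussed in this issue, for DuckDB v0.9.1.
for platform in ("windows_amd64", "windows_amd64_rtools"):
    print(extension_url("v0.9.1", platform, "spatial"))
```

The only difference between the URL that works and the URL the R client requests is the platform segment.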

cboettig commented 1 year ago

Thanks @Maxxen! OK, that makes sense I think, but I don't entirely follow. Currently, the R version of duckdb tries to download from http://extensions.duckdb.org/v0.9.1/windows_amd64_rtools/spatial.duckdb_extension.gz, which doesn't exist for the reasons you say. That feels like a bug. Are you suggesting that

(a) duckdb code should be altered so that it installs from http://extensions.duckdb.org/v0.9.1/windows_amd64/spatial.duckdb_extension.gz instead of the _rtools flavor, or would that not work with R users either (due to the missing compiler situation)?

(b) duckdb should throw an error message that says "spatial extension not available for Windows R users" or something, rather than the current error?

(c) nothing should change in duckdb, but one day we might get http://extensions.duckdb.org/v0.9.1/windows_amd64/spatial.duckdb_extension.gz ?

I recall that formerly the httpfs extension wasn't working for Windows R users either (also due to same challenges with compilers available in rtools iirc), and both packages on CRAN that use duckdb and some popular blog posts still describe duckdb as not supporting httpfs / URL access for R users on Windows. However I also noticed that this is no longer true, and Windows users can now use httpfs, but still cannot use the spatial extension.

Just cc'ing the amazing @jeroen who helps maintain rtools and might have some insights in how to get this extension to build for Windows when he has a chance?

Maxxen commented 1 year ago

The httpfs extension is distributed as part of the core DuckDB build pipeline (it's an "in-tree" extension, it resides inside the main DuckDB repository), and afaik we don't build any out-of-tree extensions for R. That's not because it's impossible; it's just that we're still figuring out how out-of-tree extensions should be built, synced and distributed.

I assume that we differentiate between windows and _rtools for a reason but I can't remember why; maybe it's just to comply with CRAN guidelines? I guess I would try with the normal windows build and see if it works? But since I don't work with R myself, my answer is probably a combination of (b) and (c), although I don't know if the error can be more descriptive than it already is.

Maybe @carlopi or @samansmink can chime in.

cboettig commented 1 year ago

Cool, thanks! AFAIK, the R client merely sends the SQL command string INSTALL 'spatial'; out over the duckdb connection, so I don't have much visibility into how duckdb even 'knows' that it's running from inside R, or where the logic resides that leads it to choose the http://extensions.duckdb.org/v0.9.1/windows_amd64_rtools/spatial.duckdb_extension.gz path in the first place. I can probably dig into the instructions for manual installs of extensions instead, but I don't have a Windows machine handy to test with. The INSTALL 'spatial'; command works just fine from R for installing the extension on any other platform I've tested (mac, linux, arm architectures); I think it's just Windows that has troubles. I wonder if the _rtools thing dates back to the reason that httpfs wasn't on Windows either, and maybe is no longer necessary?

mattsams89 commented 11 months ago

As a follow-up on the manual installation front, downloading the windows version of the extension, decompressing it, and installing it does work up through installation. When attempting to load the extension, though, both R and RStudio immediately crash once LOAD spatial is called. This happens whether the DB is in-memory or persistent. E.g.,

``` r
# https://extensions.duckdb.org/v0.9.2/windows_amd64/spatial.duckdb_extension.gz

reticulate::repl_python()

import gzip
import shutil

# file path is truncated
with gzip.open('.../spatial.duckdb_extension.gz','rb') as f_in:
    with open('.../spatial.duckdb_extension', 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)

exit

library(duckdb)

con <- dbConnect(duckdb())

dbExecute(con, "INSTALL '.../spatial.duckdb_extension'")
dbExecute(con, "LOAD spatial")

# Also fails with dbExecute(con, "LOAD '.../spatial.duckdb_extension'")
```


Manually building the extension is a little beyond my knowledge base, so hopefully this is useful info!

sessionInfo() just in case:

```
R version 4.3.0 (2023-04-21 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] duckdb_0.9.1-1 DBI_1.1.3

loaded via a namespace (and not attached):
 [1] ggdist_3.3.0           utf8_1.2.3             generics_0.1.3         stringi_1.7.12         lattice_0.21-8         lme4_1.1-33            hms_1.1.3
 [8] magrittr_2.0.3         grid_4.3.0             blob_1.2.4             jsonlite_1.8.4         Matrix_1.6-2           ggrepel_0.9.3          ggeffects_1.2.2
[15] ggthemes_4.2.4         purrr_1.0.1            fansi_1.0.4            tidytable_0.10.1       box_1.1.3              scales_1.2.1           cli_3.6.1
[22] rlang_1.1.0            bit64_4.0.5            munsell_0.5.0          splines_4.3.0          withr_2.5.0            tools_4.3.0            nloptr_2.0.3
[29] odbc_1.3.4             minqa_1.2.5            dplyr_1.1.2            colorspace_2.1-0       ggplot2_3.4.2          boot_1.3-28.1          reticulate_1.34.0.9000
[36] png_0.1-8              vctrs_0.6.2            R6_2.5.1               lifecycle_1.0.3        stringr_1.5.0          bit_4.0.5              MASS_7.3-58.4
[43] pkgconfig_2.0.3        pillar_1.9.0           gtable_0.3.3           data.table_1.14.8      glue_1.6.2             Rcpp_1.0.10            tibble_3.2.1
[50] tidyselect_1.2.0       rstudioapi_0.14        farver_2.1.1           nlme_3.1-162           compiler_4.3.0         distributional_0.3.2
```

cboettig commented 9 months ago

@Maxxen: @eitsupi pointed out to me that the DuckDB builds for Windows other than the R one use the MSVC ABI, and only the R build uses the GNU ABI. The other R extensions are just being built by a GitHub Action: https://github.com/duckdb/duckdb/blob/a55f89cd9e956b3e575532e058c230461799ac64/.github/workflows/R.yml#L29-L69. Would it be possible to use that as a basis for building the spatial extension for R Windows users as well? (I'd offer to do a PR, but given that it needs S3 secrets it might be easier for someone from the duckdb team to do so; I'm also not sure which repo this would belong to.)

tiernanmartin commented 9 months ago

I noticed that https://github.com/duckdb/duckdb/pull/10204 was merged, so I thought I'd check to see whether the spatial extension worked with the nightly build. Unfortunately, I am still unable to install the spatial extension (see the reprex below). Am I correct to think that https://github.com/duckdb/duckdb/pull/10204 should have solved this issue?

Reprex

``` r
# remove.packages("duckdb")
install.packages('duckdb', repos=c('https://duckdb.r-universe.dev', 'https://cloud.r-project.org'))
#> Installing package into 'C:/Users/tiern/AppData/Local/R/win-library/4.3'
#> (as 'lib' is unspecified)
#> package 'duckdb' successfully unpacked and MD5 sums checked
#>
#> The downloaded binary packages are in
#> C:\Users\tiern\AppData\Local\Temp\RtmpiwZSeV\downloaded_packages
packageVersion("duckdb")
#> [1] '0.9.2.1'
con = DBI::dbConnect(duckdb::duckdb(), ":memory:")
DBI::dbExecute(con, "INSTALL spatial;")
#> Error: rapi_execute: Failed to run query
#> Error: HTTP Error: Failed to download extension "spatial" at URL "http://extensions.duckdb.org/2414840843/windows_amd64_rtools/spatial.duckdb_extension.gz"
#> Extension "spatial" is an existing extension.
#>
#> Are you using a development build? In this case, extensions might not (yet) be uploaded.
DBI::dbGetQuery(con, "FROM duckdb_extensions()") |> tibble::tibble() |> print(n=Inf)
#> # A tibble: 22 × 6
#>    extension_name   loaded installed install_path description            aliases
#>
#>  1 arrow            FALSE  FALSE     ""           A zero-copy data inte…
#>  2 autocomplete     FALSE  FALSE     ""           Adds support for auto…
#>  3 aws              FALSE  FALSE     ""           Provides features tha…
#>  4 azure            FALSE  FALSE     ""           Adds a filesystem abs…
#>  5 excel            FALSE  FALSE     ""           Adds support for Exce…
#>  6 fts              FALSE  FALSE     ""           Adds support for Full…
#>  7 httpfs           FALSE  FALSE     ""           Adds support for read…
#>  8 iceberg          FALSE  FALSE     ""           Adds support for Apac…
#>  9 icu              FALSE  FALSE     ""           Adds support for time…
#> 10 inet             FALSE  FALSE     ""           Adds support for IP-r…
#> 11 jemalloc         FALSE  FALSE     ""           Overwrites system all…
#> 12 json             FALSE  FALSE     ""           Adds support for JSON…
#> 13 motherduck       FALSE  FALSE     ""           Enables motherduck in…
#> 14 mysql_scanner    FALSE  FALSE     ""           Adds support for conn…
#> 15 parquet          TRUE   TRUE      "(BUILT-IN)" Adds support for read…
#> 16 postgres_scanner FALSE  FALSE     ""           Adds support for conn…
#> 17 spatial          FALSE  FALSE     ""           Geospatial extension …
#> 18 sqlite_scanner   FALSE  FALSE     ""           Adds support for read…
#> 19 substrait        FALSE  FALSE     ""           Adds support for the …
#> 20 tpcds            FALSE  FALSE     ""           Adds TPC-DS data gene…
#> 21 tpch             FALSE  FALSE     ""           Adds TPC-H data gener…
#> 22 visualizer       FALSE  FALSE     ""           Creates an HTML-based…
```

Created on 2024-01-23 with [reprex v2.0.2](https://reprex.tidyverse.org)

Session info

``` r
sessionInfo()
#> R version 4.3.2 (2023-10-31 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 11 x64 (build 22631)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=English_United States.utf8
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> time zone: America/Los_Angeles
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#>
#> loaded via a namespace (and not attached):
#>  [1] vctrs_0.6.4       cli_3.6.1         knitr_1.42        rlang_1.1.1
#>  [5] xfun_0.39         DBI_1.1.3         purrr_1.0.2       styler_1.10.2
#>  [9] glue_1.6.2        htmltools_0.5.5   fansi_1.0.5       rmarkdown_2.21
#> [13] R.cache_0.16.0    evaluate_0.21     tibble_3.2.1      fastmap_1.1.1
#> [17] yaml_2.3.7        lifecycle_1.0.3   duckdb_0.9.2-1    compiler_4.3.2
#> [21] fs_1.6.2          pkgconfig_2.0.3   rstudioapi_0.14   R.oo_1.25.0
#> [25] R.utils_2.12.3    digest_0.6.33     utf8_1.2.4        reprex_2.0.2
#> [29] pillar_1.9.0      magrittr_2.0.3    R.methodsS3_1.8.2 tools_4.3.2
#> [33] withr_2.5.2
```

samansmink commented 9 months ago

@tiernanmartin in that PR only the sqlite extension was enabled because I didn't get any of the others to compile easily on the windows rtools environment. It should probably be possible to get most extensions working, including spatial, but we haven't gotten around to it yet!
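To make the failing URL from the reprex above easier to read, here is a small Python sketch that splits it into its path segments. The helper `describe_extension_url` and the `vX.Y.Z` release-tag pattern are assumptions for illustration; the three-segment layout is inferred from the URLs quoted in this thread.

```python
import re
from urllib.parse import urlparse

# Assumption: release builds use a tag such as "v0.9.1" as the first segment.
RELEASE_TAG = re.compile(r"^v\d+\.\d+\.\d+$")


def describe_extension_url(url: str) -> dict:
    """Split an extension URL into version, platform and extension name."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3:
        raise ValueError(f"unexpected extension URL layout: {url}")
    version, platform, filename = parts
    return {
        "version": version,
        "platform": platform,
        "extension": filename.split(".", 1)[0],
        "is_release": bool(RELEASE_TAG.match(version)),
    }


failing = ("http://extensions.duckdb.org/2414840843/"
           "windows_amd64_rtools/spatial.duckdb_extension.gz")
print(describe_extension_url(failing))
```

For the reprex URL, the first segment is a build identifier rather than a release tag, which matches the "Are you using a development build?" hint in the error message.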

krlmlr commented 7 months ago

Chances are this works better now (with duckdb 0.10.0), can you confirm?

carlopi commented 7 months ago

As hinted by @krlmlr: the spatial extension is available on ALL platforms for duckdb v0.10.0.

This is due to combined work from @samansmink (setting up infrastructure) and @Maxxen (adapting spatial to the tooling).

The windows_amd64_rtools build of spatial for v0.10.1 has not been uploaded (yet), since it needs some extra manual steps, but it should come out in the next few days (and the manual steps are also going away).

carlopi commented 7 months ago

duckdb_spatial is now available on all 10 supported platforms for v0.10.0 and v0.10.1. A PR is up to have the rtools builds generated by the main CI as well, so it will always be up.

I think this can be closed.
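Anyone who wants to confirm availability for a given release could probe the download URLs directly. The following Python sketch assumes the URL layout seen earlier in this thread and assumes the server answers HEAD requests; `head_ok` and `spatial_availability` are hypothetical helper names. Only the two Windows platform names mentioned in this issue are used.

```python
import urllib.request
from typing import Callable, Dict, Iterable


def head_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request for the URL succeeds with HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and timeouts are all OSError subclasses.
        return False


def spatial_availability(version: str, platforms: Iterable[str],
                         probe: Callable[[str], bool] = head_ok) -> Dict[str, bool]:
    """Map each platform to whether its spatial build appears downloadable."""
    base = "http://extensions.duckdb.org"  # layout assumed from this thread
    return {
        platform: probe(f"{base}/{version}/{platform}/spatial.duckdb_extension.gz")
        for platform in platforms
    }


# Offline demonstration with a stub probe; omit `probe` to query the server.
stub = lambda url: "_rtools" not in url
print(spatial_availability("v0.9.1", ["windows_amd64", "windows_amd64_rtools"], probe=stub))
```

Passing the probe as a parameter keeps the check testable without network access; the stub here simply mimics the v0.9.1 situation described at the top of the issue.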

krlmlr commented 7 months ago

@carlopi: Can you please point me to that PR?