duckdb / duckdb-r

The duckdb R package
https://r.duckdb.org/

Duckdb 1.0.0-1 doesn't compile on shinyapps.io #203

Closed actuarial-lonewolf closed 1 month ago

actuarial-lonewolf commented 4 months ago

Hi,

I've been juggling duckdb across various computers and recently learned that duckdb disk files are not compatible between package versions. For example, a *.duckdb file created under version 0.7.1 can't be read by version 1.0.0, and vice versa. I've had previous successful deployments using duckdb 0.7.1.
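One possible way around the storage-format incompatibility is DuckDB's EXPORT DATABASE and IMPORT DATABASE statements, which write and read a directory of schema scripts and data files instead of the versioned *.duckdb file. The following is only a minimal sketch and not something proposed in this thread: the helper names and paths are hypothetical, the export has to run under the old package version and the import under the new one, and it assumes the exported schema is still accepted by the newer version.

```r
# Hypothetical helpers: build the SQL for a version-independent dump.
export_sql <- function(dir) sprintf("EXPORT DATABASE '%s' (FORMAT PARQUET)", dir)
import_sql <- function(dir) sprintf("IMPORT DATABASE '%s'", dir)

# Run under the OLD duckdb version: dump the database to a directory.
export_db <- function(db_file, dir) {
  con <- DBI::dbConnect(duckdb::duckdb(), dbdir = db_file)
  on.exit(DBI::dbDisconnect(con, shutdown = TRUE))
  DBI::dbExecute(con, export_sql(dir))
}

# Run under the NEW duckdb version: load the dump into a fresh database file.
import_db <- function(db_file, dir) {
  con <- DBI::dbConnect(duckdb::duckdb(), dbdir = db_file)
  on.exit(DBI::dbDisconnect(con, shutdown = TRUE))
  DBI::dbExecute(con, import_sql(dir))
}

# Not run; file and directory names are placeholders:
# export_db("old.duckdb", "dump_dir")   # with duckdb 0.7.1 installed
# import_db("new.duckdb", "dump_dir")   # with duckdb 1.0.0 installed
```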

I decided to end this nonsense of version incompatibility, upgrade all my Windows software to R 4.4.1, and reinstall all packages at their latest versions.

Local Installation

install.packages("shiny",dependencies = TRUE)
install.packages("duckdb",dependencies = TRUE)
install.packages("rsconnect",dependencies = TRUE)

Deployment

When I deploy my app to shinyapps.io, the deployment crashes during the phase where shinyapps compiles duckdb 1.0.0-1.

rsconnect::deployApp(appDir = 'C:\\test_duckdb\\', appName = "test_duckdb")

── Preparing for deployment ───────────────────────────────────────────────────────────────────────────────────────────────────
✔ Re-deploying "test_duckdb" using "server: shinyapps.io / username: XXXXXXX"
ℹ Looking up application with id "12361084"...
✔ Found application https://XXXXXXXXX.shinyapps.io/test_duckdb/
ℹ Bundling 1 file: app.R
ℹ Capturing R dependencies with renv
✔ Found 32 dependencies
✔ Created 19,166b bundle
ℹ Uploading bundle...
✔ Uploaded bundle with id 8902836
── Deploying to server ────────────────────────────────────────────────────────────────────────────────────────────────────────
Waiting for task: 1440397383
  building: Building image: 10813793
  building: Fetching packages
  building: Building package: duckdb

It runs for 20 minutes until the error log shows up; see the Log section at the end.


Reprex

Using the default Shiny app.R template and adding library(duckdb):

library(shiny)
library(duckdb)

# Define UI for application that draws a histogram
ui <- fluidPage(

    # Application title
    titlePanel("Old Faithful Geyser Data"),

    # Sidebar with a slider input for number of bins 
    sidebarLayout(
        sidebarPanel(
            sliderInput("bins",
                        "Number of bins:",
                        min = 1,
                        max = 50,
                        value = 30)
        ),

        # Show a plot of the generated distribution
        mainPanel(
           plotOutput("distPlot")
        )
    )
)

# Define server logic required to draw a histogram
server <- function(input, output) {

    output$distPlot <- renderPlot({
        # generate bins based on input$bins from ui.R
        x    <- faithful[, 2]
        bins <- seq(min(x), max(x), length.out = input$bins + 1)

        # draw the histogram with the specified number of bins
        hist(x, breaks = bins, col = 'darkgray', border = 'white',
             xlab = 'Waiting time to next eruption (in mins)',
             main = 'Histogram of waiting times')
    })
}

# Run the application 
shinyApp(ui = ui, server = server)

sessionInfo

sessionInfo() in local Rstudio:

R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=French_Canada.utf8 LC_CTYPE=French_Canada.utf8 LC_MONETARY=French_Canada.utf8 LC_NUMERIC=C
[5] LC_TIME=French_Canada.utf8

time zone: America/Toronto
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rsconnect_1.3.1

loaded via a namespace (and not attached):
[1] jsonlite_1.8.8 renv_1.0.7 dplyr_1.1.4 compiler_4.4.1 promises_1.3.0 tidyselect_1.2.1
[7] Rcpp_1.0.12 later_1.3.2 fastmap_1.2.0 readxl_1.4.3 mime_0.12 readr_2.1.5
[13] R6_2.5.1 generics_0.1.3 curl_5.2.1 classInt_0.4-10 sf_1.0-16 tibble_3.2.1
[19] units_0.8-5 openssl_2.2.0 shiny_1.8.1.1 DBI_1.2.3 pillar_1.9.0 tzdb_0.4.0
[25] rlang_1.1.4 utf8_1.2.4 httpuv_1.6.15 cli_3.6.3 magrittr_2.0.3 shinyBS_0.61.1
[31] class_7.3-22 digest_0.6.36 grid_4.4.1 xtable_1.8-4 rstudioapi_0.16.0 askpass_1.2.0
[37] hms_1.1.3 lifecycle_1.0.4 vctrs_0.6.5 KernSmooth_2.23-24 proxy_0.4-27 glue_1.7.0
[43] duckdb_1.0.0-1 cellranger_1.1.0 fansi_1.0.6 e1071_1.7-14 purrr_1.0.2 tools_4.4.1
[49] pkgconfig_2.0.3 htmltools_0.5.8.1

Log

... lots of packages being installed ...
[2024-07-24T17:32:42.374107970+0000] Installing R package: xfun (0.45)
  • installing to library ‘/usr/lib/R’
  • installing binary package ‘xfun’ ...
  • DONE (xfun)
[2024-07-24T17:32:43.069962156+0000] Installing R package: xtable (1.8-4)
  • installing to library ‘/usr/lib/R’
  • installing binary package ‘xtable’ ...
  • DONE (xtable)
[2024-07-24T17:32:43.786417543+0000] Installing R package: yaml (2.3.9)
  • installing to library ‘/usr/lib/R’
  • installing binary package ‘yaml’ ...
  • DONE (yaml)
[2024-07-24T17:32:44.520825469+0000] Building R package: duckdb (1.0.0-1)
/mnt/packages/build /mnt
g++ -std=gnu++17 -I"/opt/R/4.4.1/lib/R/include" -DNDEBUG -Iinclude -I../inst/include -DDUCKDB_DISABLE_PRINT -DDUCKDB_R_BUILD -Iduckdb/src/include -Iduckdb/third_party/concurrentqueue -Iduckdb/third_party/fast_float -Iduckdb/third_party/fastpforlib -Iduckdb/third_party/fmt/include -Iduckdb/third_party/fsst -Iduckdb/third_party/httplib -Iduckdb/third_party/hyperloglog -Iduckdb/third_party/jaro_winkler -Iduckdb/third_party/jaro_winkler/details -Iduckdb/third_party/libpg_query -Iduckdb/third_party/libpg_query/include -Iduckdb/third_party/lz4 -Iduckdb/third_party/mbedtls -Iduckdb/third_party/mbedtls/include -Iduckdb/third_party/mbedtls/library -Iduckdb/third_party/miniz -Iduckdb/third_party/pcg -Iduckdb/third_party/re2 -Iduckdb/third_party/skiplist -Iduckdb/third_party/tdigest -Iduckdb/third_party/utf8proc -Iduckdb/third_party/utf8proc/include -Iduckdb/third_party/yyjson/include -Iduckdb/extension/parquet/include -Iduckdb/third_party/parquet -Iduckdb/third_party/thrift -Iduckdb/third_party/lz4 -Iduckdb/third_party/snappy -Iduckdb/third_party/zstd/include -Iduckdb/third_party/mbedtls -Iduckdb/third_party/mbedtls/include -I../inst/include -Iduckdb -DDUCKDB_EXTENSION_PARQUET_LINKED -DDUCKDB_BUILD_LIBRARY -I/usr/local/include
    ... never ending duckdb related stuff ... until ...

b/third_party/zstd/compress/hist.o duckdb/third_party/zstd/compress/huf_compress.o duckdb/third_party/zstd/compress/zstd_compress.o duckdb/third_party/zstd/compress/zstd_compress_literals.o duckdb/third_party/zstd/compress/zstd_compress_sequences.o duckdb/third_party/zstd/compress/zstd_compress_superblock.o duckdb/third_party/zstd/compress/zstd_double_fast.o duckdb/third_party/zstd/compress/zstd_fast.o duckdb/third_party/zstd/compress/zstd_lazy.o duckdb/third_party/zstd/compress/zstd_ldm.o duckdb/third_party/zstd/compress/zstd_opt.o duckdb/third_party/lz4/lz4.o -L/opt/R/4.4.1/lib/R/lib -lR
## End Task Log ###############################################################################################################
Error: Unhandled Exception: child_task=1440384354 child_task_status=error: Unhandled Exception: 599

Any tips?

krlmlr commented 3 months ago

Thanks. Compiling duckdb on shinyapps.io isn't likely to succeed. The binaries on https://p3m.dev/client/ are up to date though:

https://p3m.dev/client/#/repos/cran/packages/duckdb/overview?search=duckdb#package-details

actuarial-lonewolf commented 3 months ago

FYI, my solution was (unfortunately) to downgrade back to 0.7.1, which does compile on shinyapps.io.

krlmlr commented 3 months ago

Thanks for the heads-up.

actuarial-lonewolf commented 3 months ago

I don't understand why it was closed. It is still a significant issue, especially given the file-reading incompatibility between package versions.

krlmlr commented 3 months ago

Sure, let's discuss.

actuarial-lonewolf commented 1 month ago

Hi, I noticed duckdb is now at version 1.1.0 (Sept 25, 2024). I have tried once again to deploy an app on shinyapps.io, using the updated 1.1.0 duckdb alongside the generic web shiny app template (as in the first post).

Deployment hangs at the "Building package: duckdb" step. After more than 30 minutes, it crashes.

Error: Unhandled Exception: child_task=1465514956 child_task_status=error: Unhandled Exception: 599

FYI, I have an issue open with Posit, but it hasn't gained much traction there either.

hezibu commented 1 month ago

I have the same problem as well.

krlmlr commented 1 month ago

Thanks. Can you configure Posit Package Manager as the repository before deploying to shinyapps.io?

hezibu commented 1 month ago

If that is done by options(repos = c(CRAN = "https://packagemanager.posit.co/cran/latest")) then I can confirm it still doesn't work.

Searching for similar issues in the past brings up this answer from a Posit employee:

The issue with mzR is not one of the time limit, but that while building the package we stop seeing the output of the build, and thus don't see when it completes, and therefore hit the timeout. We have not yet found the root cause, and do not have an estimate for when a solution might be in place. We apologize for the inconvenience this causes and will be sure to announce when we have solved the problem.

gaborcsardi commented 1 month ago

If that is done by options(repos = c(CRAN = "https://packagemanager.posit.co/cran/latest")) then can confirm it still doesn't work.

Is it still compiling duckdb from source? You probably need to set the User-Agent header, e.g.

options(HTTPUserAgent = sprintf("R/%s R (%s)", getRversion(), paste(getRversion(), R.version["platform"], R.version["arch"], R.version["os"])))

Cf. https://docs.posit.co/rspm/admin/serving-binaries/#binary-user-agents

Also, the correct repo address (for Jammy if that's what you are using) is:

options(repos = c(CRAN = "https://packagemanager.posit.co/cran/__linux__/jammy/latest"))

(For other distros see https://packagemanager.posit.co/client/#/ and click on 'Setup'.)
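Put together, the two settings above might look like the sketch below, run before any package installation. The helper names ppm_linux_repo and ppm_user_agent are hypothetical and not part of any package, and the "jammy" code name is an assumption about the build image. Whether shinyapps.io picks up these options while restoring the deployment's packages is not established here.

```r
# Hypothetical helper: URL of the PPM repository that serves Linux binaries
# for a given distribution code name.
ppm_linux_repo <- function(distro = "jammy") {
  sprintf("https://packagemanager.posit.co/cran/__linux__/%s/latest", distro)
}

# Hypothetical helper: User-Agent string in the form given above, which
# lets PPM recognise an R client that is asking for binary packages.
ppm_user_agent <- function() {
  sprintf(
    "R/%s R (%s)",
    getRversion(),
    paste(getRversion(), R.version["platform"], R.version["arch"], R.version["os"])
  )
}

# Apply both settings for the current R session.
options(
  repos = c(CRAN = ppm_linux_repo("jammy")),
  HTTPUserAgent = ppm_user_agent()
)
```

Keeping the URL and the header in small functions makes it easy to switch the distro code name if the server turns out to run a different release.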

krlmlr commented 1 month ago

Just to clarify, Gábor -- which component would set the HTTPUserAgent option? IIUC, this needs to happen on shinyapps.io, while restoring the deployment's execution environment.

The log shows that some packages are installed as binary (search for *binary* in the output), but duckdb apparently is not. I also wonder why compiling succeeds with 0.7.1 but not with more recent versions. On the other hand, it looks like compilation shouldn't happen in the first place.
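The search for binary installs can be automated with a few lines of base R. This is a rough sketch that assumes the log uses the same line formats as the excerpt in the first post ("installing binary package" for binaries, "Building R package:" for source builds); classify_installs is a hypothetical helper name.

```r
# Classify packages mentioned in a deployment log as installed from a
# binary or built from source, based on the line formats in the log excerpt.
classify_installs <- function(log_lines) {
  pkg_from <- function(pattern) {
    # Keep only the matching part of each line, then extract the package name.
    hits <- regmatches(log_lines, regexpr(pattern, log_lines))
    unique(sub(pattern, "\\1", hits))
  }
  list(
    # Skip any quote characters that precede the package name.
    binary = pkg_from("installing binary package [^A-Za-z0-9]*([A-Za-z0-9.]+)"),
    source = pkg_from("Building R package: ([A-Za-z0-9.]+)")
  )
}

# Sample lines copied from the log excerpt in the first post.
log_lines <- c(
  "Installing R package: yaml (2.3.9)",
  "  • installing binary package ‘yaml’ ...",
  "Building R package: duckdb (1.0.0-1)"
)
res <- classify_installs(log_lines)
```

For these three sample lines, res$binary contains yaml and res$source contains duckdb. On a full log, readLines() on a saved copy would supply log_lines.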

Can someone please post a full deployment log?

Perhaps a workaround: Does it work on Connect Cloud (https://connect.posit.cloud/)?

gaborcsardi commented 1 month ago

I am not sure where you'd set this. I saw you were already setting the repos option and just wanted to point out that it might not be enough.

But if some packages are being installed from binaries already, then all is good on that front. Well, except that duckdb is installed from source.

Is there a duckdb binary available from PPM, btw? Maybe the PPM build fails as well, that's why it is being installed from source.

If you can really set the repos option, then I suggest you set it to a PPM snapshot (that has a binary duckdb), because the latest repo might not have binaries for recently published packages.
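One way to see which duckdb version a given snapshot lists is to query its package index. This is a minimal sketch, assuming date-based snapshot URLs of the form built below. The helper names are hypothetical, and the date is a placeholder, not a snapshot known to contain a duckdb binary. The index reports only the listed version; it does not confirm that a binary rather than a source package will be served.

```r
# Hypothetical helper: URL of a dated PPM snapshot for a Linux distribution.
ppm_snapshot_repo <- function(date, distro = "jammy") {
  sprintf("https://packagemanager.posit.co/cran/__linux__/%s/%s", distro, date)
}

# Version of a package listed in a repository index, or NA if it is absent.
repo_version <- function(pkg, repo) {
  idx <- utils::available.packages(repos = repo, type = "source")
  if (pkg %in% rownames(idx)) unname(idx[pkg, "Version"]) else NA_character_
}

# Not run (needs network access); "2024-10-01" is a placeholder date:
# repo_version("duckdb", ppm_snapshot_repo("2024-10-01"))
```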

As for why 0.7.1 does not fail: the new version either simply fails to build on this platform or requires more memory to compile.

actuarial-lonewolf commented 1 month ago

Hi all, Thanks for your help on this issue. I bring good news. Posit has fixed the issue when deploying to shinyapps.io.

I have tested it with duckdb 1.1.0 and it works!