njtierney / geotargets

Targets extensions for geospatial data
https://njtierney.github.io/geotargets/

tests with multiple workers #49

Closed Aariq closed 2 months ago

Aariq commented 3 months ago

I think it's important to add tests running pipelines with multiple workers, since that is when the marshaling/unmarshaling of R objects comes into play.
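
As a rough illustration of what such a test could look like, here is a minimal sketch of a pipeline run with two workers. It assumes the crew package is installed and uses `crew::crew_controller_local()` as the controller; the target name `lux` and the choice of terra's bundled `lux.shp` example file are illustrative assumptions, not something taken from the geotargets test suite.

```r
library(targets)

tar_dir({ # run everything in a temporary directory
  tar_script({
    library(targets)
    library(geotargets)
    # Assumption: two local crew workers are enough to make targets
    # send the SpatVector between processes (marshal/unmarshal).
    tar_option_set(
      controller = crew::crew_controller_local(workers = 2)
    )
    list(
      tar_terra_vect(
        lux, # illustrative target name
        terra::vect(system.file("ex", "lux.shp", package = "terra"))
      )
    )
  })

  tar_make()

  # The stored target should read back as a usable SpatVector.
  lux <- tar_read(lux)
  stopifnot(inherits(lux, "SpatVector"))
  stopifnot(nrow(lux) > 0)
})
```

In a testthat file the same body could sit inside `targets::tar_test()`, with the `stopifnot()` calls swapped for expectations.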

Aariq commented 2 months ago

E.g., in #45, tests without multiple workers pass, but tests with multiple workers fail.

njtierney commented 2 months ago

@Aariq I couldn't see tests in #45 with multiple workers?

njtierney commented 2 months ago

https://github.com/njtierney/demo-geotargets/blob/main/_targets.R#L57-L67 demonstrates this failure.

Running:

tar_make(country_shapes)

I get

Attaching package: ‘arrow’

The following object is masked from ‘package:utils’:

    timestamp

Loading required package: terra
terra 1.7.71
Attaching package: ‘terra’

The following object is masked from ‘package:arrow’:

    buffer

▶ dispatched target some_countries
● completed target some_countries [1.555 seconds]
▶ dispatched branch country_shapes_3f2db3d69162e956
▶ dispatched branch country_shapes_e283ec3d863c309b
▶ dispatched branch country_shapes_22eb5cc8ead488b2
● completed branch country_shapes_3f2db3d69162e956 [44.895 seconds]
▶ dispatched branch country_shapes_727718e808a32bd7
● completed branch country_shapes_e283ec3d863c309b [44.658 seconds]
▶ dispatched branch country_shapes_863d9f12a8c14c21
● completed branch country_shapes_22eb5cc8ead488b2 [45.125 seconds]
▶ dispatched branch country_shapes_32e7a6f900e1d4b6
● completed branch country_shapes_863d9f12a8c14c21 [53.13 seconds]
● completed branch country_shapes_32e7a6f900e1d4b6 [53.899 seconds]
✖ errored branch country_shapes_727718e808a32bd7
✖ errored pipeline [1.987 minutes]
Warning message:
[writeVector] nothing to write 
Error:
! Error running targets::tar_make()
Error messages: targets::tar_meta(fields = error, complete_only = TRUE)
Debugging guide: https://books.ropensci.org/targets/debugging.html
How to ask for help: https://books.ropensci.org/targets/help.html
Last error message:
    _store_ missing files: _targets/objects/country_shapes_727718e808a32bd7
Last error traceback:
    No traceback available.

This is perhaps a slightly contrived use of `pattern`, since the function cgaz_countries is already vectorised, but I think it should still work in principle?

Aariq commented 2 months ago

> https://github.com/njtierney/demo-geotargets/blob/main/_targets.R#L57-L67 demonstrates this failure.

Does this still fail with targets 1.7.0?

njtierney commented 2 months ago

Yes, sorry, I should have taken the time to write out a proper reprex:

library(targets)
tar_dir({ # tar_dir() runs code from a temporary directory.
  tar_script({
    library(geotargets)
    # from hypertidy/sds
    library(sds)
    library(terra)
    ## demonstration using many countries and multiple workers
    cgaz_country <- function(country_name){
      cgaz_source <- CGAZ()
      cgaz_query <- CGAZ_sql(country_name)

      v <- vect(
        x = cgaz_source,
        query = cgaz_query
      )

      v
    }
    list(
      tar_target(
        some_countries,
        countrycode::codelist$iso3c[1:6]
      ),

      tar_terra_vect(
        country_shapes,
        cgaz_country(some_countries),
        pattern = map(some_countries)
      )
    )
  })

  tar_make(country_shapes)
  tar_load(country_shapes)
})
#> terra 1.7.71
#> ▶ dispatched target some_countries
#> ● completed target some_countries [0.021 seconds]
#> ▶ dispatched branch country_shapes_3f2db3d69162e956
#> ● completed branch country_shapes_3f2db3d69162e956 [35.993 seconds]
#> ▶ dispatched branch country_shapes_e283ec3d863c309b
#> ● completed branch country_shapes_e283ec3d863c309b [28.566 seconds]
#> ▶ dispatched branch country_shapes_22eb5cc8ead488b2
#> ● completed branch country_shapes_22eb5cc8ead488b2 [27.886 seconds]
#> ▶ dispatched branch country_shapes_727718e808a32bd7
#> ✖ errored branch country_shapes_727718e808a32bd7
#> ✖ errored pipeline [2.284 minutes]
#> Warning message:
#> [writeVector] nothing to write
#> Error:
#> ! Error running targets::tar_make()
#> Error messages: targets::tar_meta(fields = error, complete_only = TRUE)
#> Debugging guide: https://books.ropensci.org/targets/debugging.html
#> How to ask for help: https://books.ropensci.org/targets/help.html
#> Last error message:
#>     _store_ missing files: _targets/objects/country_shapes_727718e808a32bd7
#> Last error traceback:
#>     No traceback available.

Created on 2024-04-24 with reprex v2.1.0

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29)
#>  os       macOS Sonoma 14.3.1
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Australia/Hobart
#>  date     2024-04-24
#>  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  backports     1.4.1   2021-12-13 [1] CRAN (R 4.3.0)
#>  base64url     1.4     2018-05-14 [1] CRAN (R 4.3.0)
#>  callr         3.7.6   2024-03-25 [1] CRAN (R 4.3.1)
#>  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.1)
#>  codetools     0.2-20  2024-03-31 [2] CRAN (R 4.3.1)
#>  data.table    1.15.4  2024-03-30 [1] CRAN (R 4.3.1)
#>  digest        0.6.35  2024-03-11 [1] CRAN (R 4.3.1)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.1)
#>  fansi         1.0.6   2023-12-08 [1] CRAN (R 4.3.1)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.0)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.1)
#>  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.3.1)
#>  igraph        2.0.3   2024-03-13 [1] CRAN (R 4.3.1)
#>  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.1)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.1)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.0)
#>  processx      3.8.4   2024-03-16 [1] CRAN (R 4.3.1)
#>  ps            1.7.6   2024-01-18 [1] CRAN (R 4.3.1)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.0)
#>  R.cache       0.16.0  2022-07-21 [2] CRAN (R 4.3.0)
#>  R.methodsS3   1.8.2   2022-06-13 [2] CRAN (R 4.3.0)
#>  R.oo          1.26.0  2024-01-24 [2] CRAN (R 4.3.1)
#>  R.utils       2.12.3  2023-11-18 [2] CRAN (R 4.3.1)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
#>  reprex        2.1.0   2024-01-11 [2] CRAN (R 4.3.1)
#>  rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.1)
#>  rmarkdown     2.26    2024-03-05 [1] CRAN (R 4.3.1)
#>  rstudioapi    0.16.0  2024-03-24 [1] CRAN (R 4.3.1)
#>  secretbase    0.4.0   2024-04-04 [1] CRAN (R 4.3.1)
#>  sessioninfo   1.2.2   2021-12-06 [2] CRAN (R 4.3.0)
#>  styler        1.10.3  2024-04-07 [2] CRAN (R 4.3.1)
#>  targets     * 1.7.0   2024-04-17 [1] CRAN (R 4.3.1)
#>  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.3.0)
#>  tidyselect    1.2.1   2024-03-11 [1] CRAN (R 4.3.1)
#>  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.1)
#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.1)
#>  withr         3.0.0   2024-01-16 [1] CRAN (R 4.3.1)
#>  xfun          0.43    2024-03-25 [1] CRAN (R 4.3.1)
#>  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.1)
#>
#>  [1] /Users/nick/Library/R/arm64/4.3/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

Aariq commented 2 months ago

That reprex doesn't run multiple workers, so I think it is off topic. A particular branch errors in a way that suggests there may be no data for that particular country. I have the start of some tests with multiple workers. I can open a draft PR later today.
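
One way to check the "no data for that country" guess would be to fail early inside the target's command rather than at write time. The helper below is a hypothetical sketch (`stop_if_empty()` is not part of geotargets or terra); it relies only on `nrow()`, which works for data frames and for terra `SpatVector` objects alike.

```r
# Hypothetical helper: stop with a clear message when a query
# returned zero rows/features, otherwise pass the object through.
stop_if_empty <- function(x, label) {
  if (nrow(x) == 0) {
    stop("no features returned for '", label, "'", call. = FALSE)
  }
  x
}

# Demonstration with plain data frames standing in for query results.
non_empty <- stop_if_empty(data.frame(id = 1), "ABW")
msg <- tryCatch(
  stop_if_empty(data.frame(), "XXX"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

In the reprex above, the last line of `cgaz_country()` could become `stop_if_empty(v, country_name)`, so an empty branch would report which country had no features instead of the `[writeVector] nothing to write` warning followed by a missing-file error.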