Open goldingn opened 7 months ago
As a workaround, @njtierney, is there any way to not set a device via compute_options in calculate?
Never mind - I can work around it for now by forcing it to compute on GPU! Should have thought before sending :)
It looks like TFP uses different Poisson sampling algorithms on GPU vs CPU. If we determine this is only an issue in Poisson, we could manually set the sampling method for the greta Poisson distribution to a safer one.
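To make the "safer sampling method" idea concrete, here is a minimal sketch in plain Python, assuming the goal is simply that an invalid rate never reaches the sampling loop. The function name safe_poisson is hypothetical and this is not greta's or TFP's code. It uses Knuth's multiplication method as a stand-in, which is only reasonable for small rates, and it makes no claim about which algorithm TFP uses on either device.

```python
import math
import random


def safe_poisson(rate, rng=random):
    """Draw one Poisson sample, or return NaN if the rate is invalid.

    Hypothetical helper for illustration only: the guard runs before any
    sampling loop, so a NaN, infinite, or negative rate never enters it.
    """
    if math.isnan(rate) or math.isinf(rate) or rate < 0:
        return math.nan

    # Knuth's multiplication method: multiply uniforms together until the
    # product drops to exp(-rate) or below. Only suitable for small rates,
    # because exp(-rate) underflows to 0.0 for very large ones.
    threshold = math.exp(-rate)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1
```

With this guard, safe_poisson(float("nan")) returns NaN straight away rather than running the sampler. A tensor version would need to mask invalid rates elementwise, which is not shown here.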
Ah glad to hear that forcing compute on GPU worked! But also that's a bit spooky that GPU vs CPU causes some strange behaviour.
This is causing me issues in a model using greta.dynamics - when the dynamics lead to a numerical overflow, the model hangs forever.
That model code is too convoluted to post here, but I've narrowed it down to an interaction between TFP's sampling, NaNs in tensors, and the tf$device context used in calculate. This may be a TensorFlow or TFP issue, but it would be great if we could find a solution or at least a workaround.
Here's a reprex using greta code with a custom op. Note the op itself is not the problem, it's just a way to reliably create NaNs in underlying tensors. greta should ideally be robust to NaNs sneaking into tensorflow code:
Created on 2024-03-03 with reprex v2.0.2
Through some painful debugging (including a bisection search through the various levels of code called by calculate, with each negative borking my session), I've managed to create the following equivalent reprex using tensorflow code (note: loading greta first, despite there being no greta code, to make sure we have the same TF and TFP installations loaded):
Created on 2024-03-03 with reprex v2.0.2
The tf$device stuff is used in the greta tf2-poke-tf-fun branch here: https://github.com/greta-dev/greta/blob/tf2-poke-tf-fun/R/calculate.R#L163
The rest of the TF code matches that called by the dag inside calculate.
Here's my session info:
> devtools::session_info()
─ Session info ───────────────────────────────────────
 setting  value
 version  R version 4.2.2 (2022-10-31)
 os       macOS Ventura 13.1
 system   aarch64, darwin20
 ui       RStudio
 language (EN)
 collate  en_AU.UTF-8
 ctype    en_AU.UTF-8
 tz       Australia/Perth
 date     2024-03-03
 rstudio  2023.06.2+561 Mountain Hydrangea (desktop)
 pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────
 package     * version    date (UTC) lib source
 base64enc     0.1-3      2015-07-28 [1] CRAN (R 4.2.0)
 cachem        1.0.7      2023-02-24 [1] CRAN (R 4.2.0)
 callr         3.7.3      2022-11-02 [1] CRAN (R 4.2.0)
 cli           3.6.1      2023-03-23 [1] CRAN (R 4.2.0)
 clipr         0.8.0      2022-02-22 [1] CRAN (R 4.2.0)
 coda          0.19-4     2020-09-30 [1] CRAN (R 4.2.0)
 codetools     0.2-18     2020-11-04 [1] CRAN (R 4.2.2)
 crayon        1.5.2      2022-09-29 [1] CRAN (R 4.2.0)
 devtools      2.4.5      2022-10-11 [1] CRAN (R 4.2.0)
 digest        0.6.31     2022-12-11 [1] CRAN (R 4.2.0)
 ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.2.0)
 evaluate      0.20       2023-01-17 [1] CRAN (R 4.2.0)
 fansi         1.0.4      2023-01-22 [1] CRAN (R 4.2.0)
 fastmap       1.1.1      2023-02-24 [1] CRAN (R 4.2.0)
 fs            1.6.1      2023-02-06 [1] CRAN (R 4.2.0)
 future        1.33.0     2023-07-01 [1] CRAN (R 4.2.0)
 globals       0.16.2     2022-11-21 [1] CRAN (R 4.2.0)
 glue          1.6.2      2022-02-24 [1] CRAN (R 4.2.0)
 greta       * 0.4.3.9000 2023-11-13 [1] local
 here          1.0.1      2020-12-13 [1] CRAN (R 4.2.0)
 hms           1.1.2      2022-08-19 [1] CRAN (R 4.2.0)
 htmltools     0.5.4      2022-12-07 [1] CRAN (R 4.2.0)
 htmlwidgets   1.6.1      2023-01-07 [1] CRAN (R 4.2.0)
 httpuv        1.6.9      2023-02-14 [1] CRAN (R 4.2.0)
 jsonlite      1.8.4      2022-12-06 [1] CRAN (R 4.2.0)
 knitr         1.42       2023-01-25 [1] CRAN (R 4.2.0)
 later         1.3.0      2021-08-18 [1] CRAN (R 4.2.0)
 lattice       0.22-5     2023-10-24 [1] CRAN (R 4.2.0)
 lifecycle     1.0.3      2022-10-07 [1] CRAN (R 4.2.0)
 listenv       0.9.0      2022-12-16 [1] CRAN (R 4.2.0)
 magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.2.0)
 Matrix        1.6-1      2023-08-14 [1] CRAN (R 4.2.2)
 memoise       2.0.1      2021-11-26 [1] CRAN (R 4.2.0)
 mime          0.12       2021-09-28 [1] CRAN (R 4.2.0)
 miniUI        0.1.1.1    2018-05-18 [1] CRAN (R 4.2.0)
 parallelly    1.34.0     2023-01-13 [1] CRAN (R 4.2.0)
 pillar        1.9.0      2023-03-22 [1] CRAN (R 4.2.0)
 pkgbuild      1.4.0      2022-11-27 [1] CRAN (R 4.2.0)
 pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.2.0)
 pkgload       1.3.2      2022-11-16 [1] CRAN (R 4.2.0)
 png           0.1-8      2022-11-29 [1] CRAN (R 4.2.0)
 prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.2.0)
 processx      3.8.0      2022-10-26 [1] CRAN (R 4.2.0)
 profvis       0.3.8      2023-05-02 [1] CRAN (R 4.2.0)
 progress      1.2.2      2019-05-16 [1] CRAN (R 4.2.0)
 promises      1.2.0.1    2021-02-11 [1] CRAN (R 4.2.0)
 ps            1.7.2      2022-10-26 [1] CRAN (R 4.2.0)
 purrr         1.0.1      2023-01-10 [1] CRAN (R 4.2.0)
 R6            2.5.1      2021-08-19 [1] CRAN (R 4.2.0)
 Rcpp          1.0.11     2023-07-06 [1] CRAN (R 4.2.0)
 remotes       2.4.2      2021-11-30 [1] CRAN (R 4.2.0)
 reprex        2.0.2      2022-08-17 [1] CRAN (R 4.2.0)
 reticulate    1.28       2023-01-27 [1] CRAN (R 4.2.0)
 rlang         1.1.1      2023-04-28 [1] CRAN (R 4.2.0)
 rmarkdown     2.20       2023-01-19 [1] CRAN (R 4.2.0)
 rprojroot     2.0.3      2022-04-02 [1] CRAN (R 4.2.0)
 rstudioapi    0.14       2022-08-22 [1] CRAN (R 4.2.0)
 sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.2.0)
 shiny         1.7.4      2022-12-15 [1] CRAN (R 4.2.0)
 stringi       1.7.12     2023-01-11 [1] CRAN (R 4.2.0)
 stringr       1.5.0      2022-12-02 [1] CRAN (R 4.2.0)
 tensorflow  * 2.11.0     2022-12-19 [1] CRAN (R 4.2.0)
 tfruns        1.5.1      2022-09-05 [1] CRAN (R 4.2.0)
 tibble        3.2.1      2023-03-20 [1] CRAN (R 4.2.0)
 urlchecker    1.0.1      2021-11-30 [1] CRAN (R 4.2.0)
 usethis       2.1.6      2022-05-25 [1] CRAN (R 4.2.0)
 utf8          1.2.3      2023-01-31 [1] CRAN (R 4.2.0)
 vctrs         0.6.2      2023-04-19 [1] CRAN (R 4.2.0)
 whisker       0.4.1      2022-12-05 [1] CRAN (R 4.2.0)
 withr         2.5.0      2022-03-03 [1] CRAN (R 4.2.0)
 xfun          0.37       2023-01-31 [1] CRAN (R 4.2.0)
 xtable        1.8-4      2019-04-21 [1] CRAN (R 4.2.0)
 yaml          2.3.7      2023-01-23 [1] CRAN (R 4.2.0)

 [1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library

─ Python configuration ───────────────────────────────
 python:         /Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2/bin/python
 libpython:      /Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2/lib/libpython3.8.dylib
 pythonhome:     /Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2:/Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2
 version:        3.8.15 | packaged by conda-forge | (default, Nov 22 2022, 08:49:06) [Clang 14.0.6 ]
 numpy:          /Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2/lib/python3.8/site-packages/numpy
 numpy_version:  1.23.2
 tensorflow:     /Users/nick/Library/r-miniconda-arm64/envs/greta-env-tf2/lib/python3.8/site-packages/tensorflow

 NOTE: Python version was forced by use_python function

──────────────────────────────────────────────────────