stefan-m-lenz / JuliaConnectoR

A functionally oriented interface for calling Julia from R

converting dataframes doesn't work #22

Closed joelnitta closed 1 year ago

joelnitta commented 1 year ago

I see there have been a couple of other issues opened and closed on this so sorry if it's a repeat, but I can't get loading dataframes from julia to R to work. Here is an example:

library(JuliaConnectoR)

Pkg <- juliaImport("Pkg")
#> Starting Julia ...

Pkg$add(Pkg$PackageSpec(name = "DataFrames"))

DataFrames <- juliaImport("DataFrames")

df_j <- DataFrames$DataFrame(A=1:3, B=5:7, fixed=1)

# julia representation
df_j
#> <Julia object of type DataFrames.DataFrame>
#> 3×3 DataFrame
#>  Row │ A      B      fixed
#>      │ Int64  Int64  Float64
#> ─────┼───────────────────────
#>    1 │     1      5      1.0
#>    2 │     2      6      1.0
#>    3 │     3      7      1.0

# read into R: doesn't look like a data.frame
juliaGet(df_j)
#> $columns
#> $columns[[1]]
#> [1] 1 2 3
#> 
#> $columns[[2]]
#> [1] 5 6 7
#> 
#> $columns[[3]]
#> [1] 1 1 1
#> 
#> attr(,"JLTYPE")
#> [1] "Vector{AbstractVector}"
#> 
#> $colindex
#> $colindex$lookup
#> $colindex$lookup$keys
#> $colindex$lookup$keys[[1]]
#> A
#> 
#> $colindex$lookup$keys[[2]]
#> B
#> 
#> $colindex$lookup$keys[[3]]
#> fixed
#> 
#> attr(,"JLTYPE")
#> [1] "Vector{Symbol}"
#> 
#> $colindex$lookup$values
#> $colindex$lookup$values[[1]]
#> [1] 1
#> 
#> $colindex$lookup$values[[2]]
#> [1] 2
#> 
#> $colindex$lookup$values[[3]]
#> [1] 3
#> 
#> 
#> attr(,"JLTYPE")
#> [1] "Dict{Symbol, Int64}"
#> 
#> $colindex$names
#> $colindex$names[[1]]
#> A
#> 
#> $colindex$names[[2]]
#> B
#> 
#> $colindex$names[[3]]
#> fixed
#> 
#> attr(,"JLTYPE")
#> [1] "Vector{Symbol}"
#> 
#> attr(,"JLTYPE")
#> [1] "DataFrames.Index"
#> 
#> attr(,"JLTYPE")
#> [1] "DataFrames.DataFrame"

# try to convert to dataframe
as.data.frame(juliaGet(df_j))
#> Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : object 'A' not found

Created on 2023-02-07 with reprex v2.0.2

Session info
``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       macOS Big Sur ... 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Asia/Tokyo
#>  date     2023-02-07
#>  pandoc   2.19.2 @ /usr/local/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package        * version date (UTC) lib source
#>  P cli              3.6.0   2023-01-09 [?] CRAN (R 4.2.0)
#>  P digest           0.6.31  2022-12-11 [?] CRAN (R 4.2.0)
#>  P evaluate         0.20    2023-01-17 [?] CRAN (R 4.2.0)
#>  P fastmap          1.1.0   2021-01-25 [?] CRAN (R 4.2.0)
#>  P fs               1.6.1   2023-02-06 [?] CRAN (R 4.2.1)
#>  P glue             1.6.2   2022-02-24 [?] CRAN (R 4.2.0)
#>  P htmltools        0.5.4   2022-12-07 [?] RSPM (R 4.2.1)
#>  P JuliaConnectoR * 1.1.1   2022-01-26 [?] CRAN (R 4.2.0)
#>  P knitr            1.42    2023-01-25 [?] CRAN (R 4.2.0)
#>  P lifecycle        1.0.3   2022-10-07 [?] CRAN (R 4.2.0)
#>  P magrittr         2.0.3   2022-03-30 [?] CRAN (R 4.2.0)
#>  P purrr            1.0.1   2023-01-10 [?] CRAN (R 4.2.0)
#>  P R.cache          0.16.0  2022-07-21 [?] RSPM (R 4.2.0)
#>  P R.methodsS3      1.8.2   2022-06-13 [?] CRAN (R 4.2.0)
#>  P R.oo             1.25.0  2022-06-12 [?] CRAN (R 4.2.0)
#>  P R.utils          2.12.2  2022-11-11 [?] CRAN (R 4.2.0)
#>  P reprex           2.0.2   2022-08-17 [?] CRAN (R 4.2.0)
#>  P rlang            1.0.6   2022-09-24 [?] CRAN (R 4.2.0)
#>  P rmarkdown        2.20    2023-01-19 [?] CRAN (R 4.2.0)
#>    sessioninfo      1.2.2   2021-12-06 [3] CRAN (R 4.2.0)
#>  P styler           1.9.0   2023-01-15 [?] CRAN (R 4.2.0)
#>  P vctrs            0.5.2   2023-01-23 [?] CRAN (R 4.2.1)
#>  P withr            2.5.0   2022-03-03 [?] CRAN (R 4.2.0)
#>  P xfun             0.37    2023-01-31 [?] CRAN (R 4.2.0)
#>  P yaml             2.3.7   2023-01-23 [?] CRAN (R 4.2.0)
#>
#>  [1] /Users/joelnitta/repos/biogeo_julia/renv/library/R-4.2/x86_64-apple-darwin17.0
#>  [2] /Users/joelnitta/repos/biogeo_julia/renv/sandbox/R-4.2/x86_64-apple-darwin17.0/84ba8b13
#>  [3] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#>
#>  P ── Loaded and on-disk path mismatch.
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

stefan-m-lenz commented 1 year ago

The problem is that juliaGet translates the Julia DataFrame into a nested R list that mirrors its internal fields (columns, colindex), and that list cannot be converted into an R data frame. Instead, call as.data.frame directly on df_j to get the R data frame.

> as.data.frame(df_j)
  A B fixed
1 1 5     1
2 2 6     1
3 3 7     1
joelnitta commented 1 year ago

Thanks @stefan-m-lenz!

However, I am still getting an error with a different data frame:

> juliaEval('using JLD2')
> juliaEval('load_object("geog_df.jld2")')
<Julia object of type DataFrame>
19×5 DataFrame
 Row │ tipnames                  K    O    M    H
     │ Any                       Any  Any  Any  Any
─────┼──────────────────────────────────────────────
   1 │ P_mariniana_Kokee2        1    0    0    0
   2 │ P_mariniana_Oahu          0    1    0    0
   3 │ P_mariniana_MauiNui       0    0    1    0
   4 │ P_hawaiiensis_Makaopuhi   0    0    0    1
   5 │ P_wawraeDL7428            1    0    0    0
   6 │ P_kaduana_PuuKukuiAS      0    0    1    0
   7 │ P_mauiensis_PepeAS        0    0    1    0
   8 │ P_hawaiiensis_WaikamoiL1  0    0    1    0
  ⋮  │            ⋮               ⋮    ⋮    ⋮    ⋮
  13 │ P_greenwelliae07          1    0    0    0
  14 │ P_greenwelliae907         1    0    0    0
  15 │ P_grandiflora_Kal2        1    0    0    0
  16 │ P_hobdyi_Kuia             1    0    0    0
  17 │ P_hexandra_K1             1    0    0    0
  18 │ P_hexandra_M              1    0    0    0
  19 │ P_hexandra_Oahu           0    1    0    0
                                      4 rows omitted
> as.data.frame(juliaEval('load_object("geog_df.jld2")'))
Error: Evaluation in Julia failed.
Original Julia error message:
StackOverflowError:
Stacktrace:
 [1] r_compatible_type(t::Type{Any}) (repeats 79984 times)
   @ Main.RConnector ~/Library/Caches/org.R-project.R/R/renv/cache/v5/R-4.2/x86_64-apple-darwin17.0/JuliaConnectoR/1.1.1/68962adf163c6be3674eb6128b7b74a0/JuliaConnectoR/Julia/handling_dataframes.jl:52
> 
stefan-m-lenz commented 1 year ago

A column of type Any can't be converted to a column of an R data frame. You have to use an element type that is also usable in R data frames. That this scenario results in a StackOverflowError is of course not optimal; it would certainly be better if a proper error message were thrown. In any case, you can solve your problem by using a type like Bool, Int or Float64 in Julia instead of Any.
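
A rough sketch of that suggestion, not taken from the thread: the table `df_any`, its column names and its values are made up to mimic the Any-typed columns above, and the snippet assumes that Julia and DataFrames are installed. Each column is narrowed to a concrete element type on the Julia side before the conversion is attempted.

```r
library(JuliaConnectoR)

juliaEval("using DataFrames")

# Hypothetical stand-in for the loaded table: every column has element type Any.
juliaEval('df_any = DataFrame(tipnames = Any["sp1", "sp2"], K = Any[1, 0], O = Any[0, 1]);')

# Narrow each column to a concrete element type on the Julia side.
juliaEval('df_any.tipnames = String.(df_any.tipnames);')
juliaEval('df_any.K = Int.(df_any.K);')
juliaEval('df_any.O = Int.(df_any.O);')

# With concrete column types, as.data.frame can translate the columns.
df_r <- as.data.frame(juliaEval("df_any"))
```

For a table loaded from a file, the same broadcasted conversions would be applied to the loaded object, provided every value in a column really is of the target type.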

joelnitta commented 1 year ago

I see, thanks! Not my julia code, but I will suggest that to the maintainer.

I'll leave this open as a reminder about the error handling for this problem, but feel free to close if you don't intend to fix that.

stefan-m-lenz commented 1 year ago

Thanks, I've fixed it now in the master branch:

> juliaEval("using DataFrames")
> x <- juliaEval('DataFrame(A = ["A", 1, 2], B = 5:7)')
> x
<Julia object of type DataFrame>
3×2 DataFrame
 Row │ A    B
     │ Any  Int64
─────┼────────────
   1 │ A        5
   2 │ 1        6
   3 │ 2        7
> as.data.frame(x)
 Error: Evaluation in Julia failed.
Original Julia error message:
Column type "Any" cannot be translated to a type that can be used in an R data frame
...