stefan-m-lenz / JuliaConnectoR

A functionally oriented interface for calling Julia from R

DataFrame conversion issue #16

Closed stefan-m-lenz closed 2 years ago

stefan-m-lenz commented 2 years ago

I'm very new to Julia (and therefore to JuliaConnectoR), so I'm not sure whether this is a bug or expected behavior. It seems that as.data.frame() is very particular about the kind of Tables.jl table it supports. For example, I've had to set Julia DataFrame columns to the Float64 and String types to avoid errors about being unable to convert type Any. Similarly, it appears to matter how the table is created. See below for a reprex.

# This doesn't work
test_table1 <- juliaLet('
    table(x, z)',
    x = 1:10,
    z = 21:30
)
as.data.frame(test_table1)

# This works
test_table2 <- juliaEval('table(DataFrame(x = 1:10, y = 21:30))')
as.data.frame(test_table2)

r$> test_table1
<Julia object of type IndexedTable{StructArrays.StructVector{Tuple{Int64, Int64}, Tuple{Vector{Int64}, Vector{Int64}}, Int64}}>
Table with 10 rows, 2 columns:

r$> test_table2
<Julia object of type IndexedTable{StructArrays.StructVector{NamedTuple{(:x, :y), Tuple{Int64, Int64}}, NamedTuple{(:x, :y), Tuple{Vector{Int64}, Vector{Int64}}}, Int64}}>
Table with 10 rows, 2 columns:

Apologies if I'm missing something due to my lack of experience with Julia, and thanks for your work with this package - it's great to be able to use R when it's necessary.

Originally posted by @arnold-c in https://github.com/stefan-m-lenz/JuliaConnectoR/issues/1#issuecomment-1146333953

stefan-m-lenz commented 2 years ago

@arnold-c Thanks for reporting this. In the case of test_table1, the column names in the table are of type Int64, while in test_table2 they are of type Symbol. I didn't consider that column names in Julia can have different types when I wrote the translation mechanism. In R, column names are always of type character. I am not sure what the best way to handle this is. One possibility would be to simply convert the integers to strings. I need to think about whether this could cause any other problems or inconsistencies in the API design. Then I could handle this case, which I hadn't thought of before.
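The idea of converting non-character column names to strings can be illustrated with a minimal base-R sketch. This is not the code used in JuliaConnectoR; the helper name `columns_to_data_frame` and its arguments are hypothetical and only show the coercion step on the R side, assuming the columns have already arrived as a list of equally long vectors.

```r
# Hypothetical helper, NOT part of JuliaConnectoR: builds a data.frame from
# a list of equally long column vectors plus a vector of column names that
# may be of any atomic type (integer positions, strings, ...).
columns_to_data_frame <- function(columns, col_names) {
  stopifnot(length(columns) == length(col_names))
  # optional = TRUE keeps R from rewriting names into syntactic ones
  df <- as.data.frame(columns, optional = TRUE)
  # R column names are always character, so coerce whatever was supplied
  names(df) <- as.character(col_names)
  df
}

# Columns identified by integer position, similar to test_table1
df <- columns_to_data_frame(list(1:10, 21:30), c(1L, 2L))
names(df)

# Names that are already strings pass through unchanged
df2 <- columns_to_data_frame(list(1:10, 21:30), c("x", "y"))
names(df2)
```

Columns named this way have to be accessed by string, for example `df[["2"]]`, because `df[[2]]` indexes by position rather than by name.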

As for the conversion of type Any, I am not sure how to improve this. Do you have an example where you used the type Any?

stefan-m-lenz commented 2 years ago

The issue is now fixed in the master branch. It is now possible to translate tables whose column names are not of type Symbol. When translating them via as.data.frame, the names are converted to Symbols.