Open MilesCranmer opened 1 month ago
As you can see in the documentation, AbstractArray and AbstractDict are implicitly converted to wrapper objects on the Python call.
In the first case, you should use the pydict function to convert a Julia Dict to a Python dict.
julia> df = pd.DataFrame(pydict(Dict("a" => [1, 2, 3], "b" => [4, 5, 6])))
Python:
   b  a
0  4  1
1  5  2
2  6  3
As in the first case, the second case requires an explicit call to the pylist function.
Thanks, that makes sense! I didn’t see pydict.
So should this be closed or is there anything that can be done automatically?
The issue is that pandas.DataFrame.__init__ explicitly checks if its argument is a dict, and Py(::Dict) is not a dict (it's a juliacall.DictValue). The two options to make this work automatically are:
1. Change PythonCall to automatically convert Dict to a Python dict. I'm not inclined to change this.
2. Change pandas.DataFrame.__init__ to check if the argument is a collections.abc.Mapping instead, which includes both dict and juliacall.DictValue.
I think requiring pylist to do the indexing is a similar issue - it checks for list rather than the more general collections.abc.Sequence, which includes both list and juliacall.VectorValue.
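To make the distinction concrete, here is a minimal Python sketch of the two kinds of check. DictWrapper and VectorWrapper are hypothetical stand-ins written for this illustration; they are not the actual juliacall.DictValue or juliacall.VectorValue classes, and the sketch does not call pandas at all. It only assumes that a wrapper implements the abstract Mapping or Sequence interface without subclassing dict or list.

```python
from collections.abc import Mapping, Sequence


class DictWrapper(Mapping):
    """Hypothetical stand-in for a wrapper such as juliacall.DictValue."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


class VectorWrapper(Sequence):
    """Hypothetical stand-in for a wrapper such as juliacall.VectorValue."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


def describe(obj):
    """Report which concrete and abstract type checks an object passes."""
    return {
        "is_dict": isinstance(obj, dict),
        "is_mapping": isinstance(obj, Mapping),
        "is_list": isinstance(obj, list),
        "is_sequence": isinstance(obj, Sequence),
    }


wrapped_dict = DictWrapper({"a": [1, 2, 3], "b": [4, 5, 6]})
wrapped_vector = VectorWrapper(["a", "b"])

# The wrappers fail the concrete checks but pass the abstract ones.
print(describe(wrapped_dict))
print(describe(wrapped_vector))

# Explicit conversion yields real dict/list objects, which is the role
# pydict and pylist play on the Julia side.
print(dict(wrapped_dict))
print(list(wrapped_vector))
```

A library that tests isinstance(obj, dict) rejects the wrapper even though it behaves like a mapping, which is why the explicit conversion is needed today.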
I think the solutions on the pandas side sound like better options to me. I'm not sure whether they have some edge cases that prevent the checks from being more general... Like maybe some collections.abc.Sequence acting as a single key?
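One way to see why a blanket Sequence check could be ambiguous is that tuples and strings are also sequences, yet both are commonly used as single keys (pandas, for instance, treats a tuple as one key when columns form a MultiIndex). The sketch below uses only the standard library and a plain dict lookup as a stand-in for column selection; it is an illustration of the concern, not a description of how pandas actually dispatches.

```python
from collections.abc import Sequence

candidates = {
    "list": ["a", "b"],
    "tuple": ("a", "b"),
    "str": "ab",
}

# For each candidate: (passes the list check, passes the Sequence check).
results = {
    name: (isinstance(value, list), isinstance(value, Sequence))
    for name, value in candidates.items()
}
print(results)

# A tuple and a string each work as one hashable key in a lookup table,
# so "is a Sequence" alone cannot mean "select several keys".
table = {("a", "b"): "tuple key", "ab": "string key"}
print(table[candidates["tuple"]])
print(table[candidates["str"]])
```

Any generalised check would therefore need to keep excluding at least strings and tuples, which may be part of why the narrower list check exists.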
cross-posted here: https://github.com/pandas-dev/pandas/issues/58803
Affects: PythonCall
Describe the bug
I have been trying to use pandas from PythonCall.jl and just wanted to document a few different calls that do not directly translate to Julia. I guess this might just mean we need a PythonPandas package to translate calls, but I wonder if there are any missing methods that could be implemented to fix things automatically.
First, the preamble for this:
pandas.DataFrame:
Using a similar syntax to Python:
which results in the following dataframe:
i.e., it seems to have a single column named "0" and rows for a and b.
If I instead write this as a vector of pairs, I get:
I suppose this one makes sense.
I was able to get it working with the following syntax instead:
So, selecting a single column works:
but multiple columns does not:
I got around this by inserting a pylist call: