JuliaPy / PythonCall.jl

Python and Julia in harmony.
https://juliapy.github.io/PythonCall.jl/stable/
MIT License

pyconvert_add_rule() ignored if cache has been populated #336

Closed hhaensel closed 9 months ago

hhaensel commented 1 year ago

Currently, if I define a new rule from the beginning, everything works fine:

using PythonCall
using DataFrames

pdf = pytable(DataFrame(:a => 1:2, :b => 2:3))

PythonCall.pyconvert_rule_pandasdataframe(::Type{DataFrame}, x::Py) = DataFrame(PyTable(x))
PythonCall.pyconvert_add_rule("pandas.core.frame:DataFrame", DataFrame, PythonCall.pyconvert_rule_pandasdataframe, PythonCall.PYCONVERT_PRIORITY_ARRAY)

pyconvert(Any, pdf)

results in

2×2 DataFrame       
 Row │ a      b     
     │ Int64  Int64 
─────┼──────────────
   1 │     1      2
   2 │     2      3

However, if I run a conversion once before the rule is defined, the conversion rules are cached and the rule added afterwards is not used. So far, that is expected. But even if I empty the rule cache, the new rule is still not applied.

using PythonCall
using DataFrames

pdf = pytable(DataFrame(:a => 1:2, :b => 2:3))

pyconvert(Any, pdf)

PythonCall.pyconvert_rule_pandasdataframe(::Type{DataFrame}, x::Py) = DataFrame(PyTable(x))
PythonCall.pyconvert_add_rule("pandas.core.frame:DataFrame", DataFrame, PythonCall.pyconvert_rule_pandasdataframe, PythonCall.PYCONVERT_PRIORITY_ARRAY)

empty!(PythonCall.PYCONVERT_RULES_CACHE)

pyconvert(Any, pdf)

results in

2×2 PyPandasDataFrame
   a  b
0  1  2
1  2  3

The reason seems to be that the @generated function pyconvert_rules_cache does not recompute its value after the cache has been emptied:

julia> PythonCall.pyconvert_rules_cache(Any)
Dict{Ptr{PythonCall.C.PyObject}, Vector{Function}} with 1 entry:
  Ptr{PyObject} @0x00000219a6cf69e0 => [#54, #54, #54, #54, #54, #54, #54, #54, #54, #54, #54]

If I omit the @generated in front of pyconvert_rules_cache, the new rules are considered.
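
The behaviour described above is consistent with how Julia treats generated functions: the generator body runs when a method specialization is compiled, and a non-expression value it returns is embedded in the compiled code as a constant. The following is a minimal, self-contained sketch of that effect. It does not use PythonCall, and the names RULES_CACHE, rules_cache_generated, and rules_cache_plain are hypothetical stand-ins for the internals mentioned in this issue.

```julia
# Hypothetical stand-in for PythonCall.PYCONVERT_RULES_CACHE:
# an outer Dict keyed by target type, holding one inner Dict per type.
const RULES_CACHE = Dict{Type,Dict{Int,Vector{Symbol}}}()

# Generated variant: the inner Dict returned by the generator body is
# embedded in the compiled method for each T.
@generated rules_cache_generated(::Type{T}) where {T} =
    get!(Dict{Int,Vector{Symbol}}, RULES_CACHE, T)

# Plain variant: the lookup in RULES_CACHE happens on every call.
rules_cache_plain(::Type{T}) where {T} =
    get!(Dict{Int,Vector{Symbol}}, RULES_CACHE, T)

# Populate the inner cache for Any through the generated function.
inner = rules_cache_generated(Any)
inner[1] = [:old_rule]

# Emptying the outer cache removes the key Any, but not the entries
# stored in the inner Dict that the generated method still holds.
empty!(RULES_CACHE)

stale = rules_cache_generated(Any)  # expected to be the same object as `inner`
fresh = rules_cache_plain(Any)      # a newly created, empty inner Dict

println("generated variant returns the old inner Dict: ", stale === inner)
println("entries seen by the generated variant: ", length(stale))
println("entries seen by the plain variant:     ", length(fresh))
```

In this sketch the generated variant keeps handing out the old inner Dict, which matches the one-entry Dict shown in the REPL output above, while the plain variant picks up the emptied cache at the cost of a Dict lookup on every call.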

If this is by design in order to gain performance, it's perhaps worth mentioning in the docs of pyconvert_add_rule(). Otherwise removing @generated in convert.jl could be a solution.

hhaensel commented 1 year ago

Another possibility would be to offer a reset function:

function pyconvert_reset_cache!()
    empty!(PYCONVERT_RULES_CACHE)
    Core.eval(@__MODULE__, quote
        @generated pyconvert_rules_cache(::Type{T}) where {T} = get!(Dict{C.PyPtr, Vector{Function}}, PYCONVERT_RULES_CACHE, T)
    end)
end
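
A possible reason this workaround succeeds, sketched without PythonCall: re-evaluating the method definition replaces the previously compiled method, so the generator body runs again on the next call and creates a fresh entry in the emptied cache. The names DEMO_CACHE, demo_rules_cache, and demo_reset_cache! are hypothetical; only the pattern mirrors the function above.

```julia
# Hypothetical stand-in for the outer rules cache.
const DEMO_CACHE = Dict{Type,Dict{Int,Symbol}}()

@generated demo_rules_cache(::Type{T}) where {T} = get!(Dict{Int,Symbol}, DEMO_CACHE, T)

function demo_reset_cache!()
    empty!(DEMO_CACHE)
    # Redefine the generated function so the previously compiled code,
    # which holds the old inner Dict, is no longer used.
    Core.eval(@__MODULE__, quote
        @generated demo_rules_cache(::Type{T}) where {T} = get!(Dict{Int,Symbol}, DEMO_CACHE, T)
    end)
    return nothing
end

# Populate the inner Dict through the generated function.
before = demo_rules_cache(Any)
before[1] = :old_rule

demo_reset_cache!()

# Called from top level, this dispatches to the redefined method.
after = demo_rules_cache(Any)
println("fresh, empty Dict after reset: ", after !== before && isempty(after))
```

Because the redefinition goes through eval, it takes effect for calls made from top level afterwards. Code that is already running when the reset happens may keep seeing the old method until it returns to top level.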

cjdoris commented 10 months ago

Hi, I'll take a look. AFAIR calling pyconvert_add_rule should reset caches already, but maybe one of the later-stage caches is not being cleared.

hhaensel commented 10 months ago

Hi, just read that you are looking into this. Meanwhile I have defined two functions for myself:

function pyconvert_reset_cache!()
    empty!(PythonCall.PYCONVERT_RULES_CACHE)
    Core.eval(PythonCall, quote
        @generated pyconvert_rules_cache(::Type{T}) where {T} = get!(Dict{C.PyPtr, Vector{Function}}, PYCONVERT_RULES_CACHE, T)
    end)
end

function pyconvert_reset!()
    empty!(PythonCall.PYCONVERT_RULES)
    empty!(PythonCall.PYCONVERT_EXTRATYPES)
    pyconvert_reset_cache!()
    PythonCall.init_pyconvert()
end

github-actions[bot] commented 9 months ago

Thank you for taking the time to report this issue! However, it has been marked as stale because there has been no activity for 30 days. As with many open source projects, the maintainers work for free and do not always have enough time to address all issues. For questions, please check the documentation, other issues, or consider asking elsewhere such as Stack Overflow or Julia Discourse. For bugs, please try to find a fix yourself and consider opening a pull request. If you're still stuck, please leave a comment below. If there is no further activity in 7 days, this issue will be automatically closed.