Closed mkitti closed 2 years ago
Julia's PermutedDimsArray and StrideSubArray can transmitted to Python/Numpy without copying by taking advantage of numpy.transpose and Python's slice. The strategy is to create a view of the parent array and then use Numpy to perform similar view based operations on the array as was done in Julia
That shouldn't be necessary. You should be able to create a NumPy array directly from any array with strided data, just by broadening the signature of this method.
Julia's PermutedDimsArray and StrideSubArray can transmitted to Python/Numpy without copying by taking advantage of numpy.transpose and Python's slice. The strategy is to create a view of the parent array and then use Numpy to perform similar view based operations on the array as was done in Julia
That shouldn't be necessary. You should be able to create a NumPy array directly from any array with strided data, just by broadening the signature of this method.
Great. That will make things easier. I think we will need to modify some additional methods to dispatch to that method.
I'm not sure if I understand the dispatching with the code below in conversions.jl. It looks like we should be able to direct any AbstractArray
with the method stride
to the code you mention. Currently, such an array gets dispatched to:
PyObject(A::AbstractArray)
array2py(A::AbstractArray)
array2py(A::AbstractArray{<:Any, N}, dim::Integer, i::CartesianIndex{N})
Rather this needs to be directed to the NpyArray
method you mentioned.
Perhaps we should dispatch using a Holy trait:
PyObject(A::AbstractArray)
py2array(A::AbstractArray, isstrided::Val{true})
NpyArray(A::AbstractArray, revdims::Bool))
Basically, this would mean that PyCall would try to convert any AbstractArray
that has the method strided
to a NumPy array rather than relying on StridedArray
to dispatch.
Is changing PyObject
dispatch in this way what you had in mind?
I'm not sure we want to introduce a trait. Just changing this signature to PyObject(a::AbstractArray{T}) where T<:PYARR_TYPES
(and similarly for NpyArray
, which should probably be npyarray
) should suffice to duck-type it — any type that doesn't have strides
and a pointer conversion will fall back to array2py
.
Merging #876 (d04dc61) into master (ead312f) will increase coverage by
0.33%
. The diff coverage is80.64%
.
@@ Coverage Diff @@
## master #876 +/- ##
==========================================
+ Coverage 67.63% 67.97% +0.33%
==========================================
Files 20 20
Lines 1965 1992 +27
==========================================
+ Hits 1329 1354 +25
- Misses 636 638 +2
Flag | Coverage Δ | |
---|---|---|
unittests | 67.97% <80.64%> (+0.33%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files | Coverage Δ | |
---|---|---|
src/PyCall.jl | 69.56% <ø> (ø) |
|
src/gc.jl | 85.00% <62.50%> (-15.00%) |
:arrow_down: |
src/conversions.jl | 64.33% <84.21%> (+0.95%) |
:arrow_up: |
src/numpy.jl | 77.33% <100.00%> (+2.66%) |
:arrow_up: |
src/pyinit.jl | 82.82% <100.00%> (+0.17%) |
:arrow_up: |
src/pyarray.jl | 71.23% <0.00%> (+0.68%) |
:arrow_up: |
src/pyoperators.jl | 87.50% <0.00%> (+4.16%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update ead312f...d04dc61. Read the comment docs.
I was trying to sort out the Core.TypeofVararg
(see https://github.com/JuliaLang/julia/pull/38136 ) issue with the test on line 53. Turns out that the AOT CI script needing an overhaul now that some issues have been resolved. Previously it was always testing master.
@stevengj Let me know if you would like to split the slice
and PySlice
code into another PR. I could also rebase this if you would like.
I removed the slice code from this pull request. If we're interested, I can submit that as a distinct pull request. The diff is now smaller.
What’s the status on this? I would find it useful, so I would be willing to commit some time if necessary.
Hi @stevengj , I was going through my old PRs, and I was wonder where we were on this pull request to expand no-copy conversion of Julia arrays to NumPy arrays? Is there anything you would like to change before merging?
@stevengj bump
For those who need this sooner rather than later, here's a version I created for a Discourse post:
julia> using PyCall
julia> A = rand(Int, 256);
julia> pytypeof( PyObject(reinterpret(UInt64, A)) )
PyObject <class 'list'>
julia> pytypeof( PyObject(@view(A[1:100])) )
PyObject <class 'list'>
julia> module PyCallHack
import PyCall: NpyArray, PYARR_TYPES, @npyinitialize, npy_api, npy_type
import PyCall: @pycheck, NPY_ARRAY_ALIGNED, NPY_ARRAY_WRITEABLE, pyembed
import PyCall: PyObject, PyPtr
const HACKED_ARRAYS = Union{SubArray{T}, Base.ReinterpretArray{T}, Base.ReshapedArray{T}, Base.PermutedDimsArray{T}} where T <: PYARR_TYPES
function NpyArray(a::HACKED_ARRAYS{T}, revdims::Bool) where T <: PYARR_TYPES
@npyinitialize
size_a = revdims ? reverse(size(a)) : size(a)
strides_a = revdims ? reverse(strides(a)) : strides(a)
p = @pycheck ccall(npy_api[:PyArray_New], PyPtr,
(PyPtr,Cint,Ptr{Int},Cint, Ptr{Int},Ptr{T}, Cint,Cint,PyPtr),
npy_api[:PyArray_Type],
ndims(a), Int[size_a...], npy_type(T),
Int[strides_a...] * sizeof(eltype(a)), a, sizeof(eltype(a)),
NPY_ARRAY_ALIGNED | NPY_ARRAY_WRITEABLE,
C_NULL)
return PyObject(p, a)
end
pyembed(po::PyObject, jo::HACKED_ARRAYS) = pyembed(po, jo.parent)
end
Main.PyCallHack
julia> pytypeof( PyObject(reinterpret(UInt64, A)) )
PyObject <class 'numpy.ndarray'>
julia> pytypeof( PyObject(@view(A[1:100])) )
PyObject <class 'numpy.ndarray'>
@stevengj bump
I've initiated the registration of https://github.com/mkitti/NumPyArrays.jl which brings most of the functionality of this PR to a new type, NumPyArray
. NumPyArray
is just a wrapper of PyArray
but has a constructor for an AbstractArray
.
julia> using PyCall, NumPyArrays
julia> rA = reinterpret(Int8, rand(UInt8, 4,4))
4×4 reinterpret(Int8, ::Matrix{UInt8}):
115 -4 -20 -32
-43 13 -62 -36
-84 -37 91 -109
114 -40 9 73
julia> pytypeof(PyObject(rA))
PyObject <class 'list'>
julia> nprA = NumPyArray(rA)
4×4 NumPyArray{Int8, 2}:
115 -4 -20 -32
-43 13 -62 -36
-84 -37 91 -109
114 -40 9 73
julia> pytypeof(nprA)
PyObject <class 'numpy.ndarray'>
julia> sum(nprA)
-52
julia> pointer(rA) == pointer(nprA)
true
julia> np = pyimport("numpy")
PyObject <module 'numpy' from 'C:\\Users\\kittisopikulm\\.julia\\conda\\3\\lib\\site-packages\\numpy\\__init__.py'>
julia> np.sum(nprA)
-52
julia> PyObject(rA).dtype
ERROR: KeyError: key :dtype not found
Stacktrace:
[1] __getproperty(o::PyObject, s::Symbol)
@ PyCall ~\.julia\packages\PyCall\BD546\src\PyCall.jl:307
[2] getproperty(o::PyObject, s::Symbol)
@ PyCall ~\.julia\packages\PyCall\BD546\src\PyCall.jl:312
[3] top-level scope
@ REPL[33]:1
julia> PyObject(nprA).dtype
PyObject dtype('int8')
julia> py"""
def add_one(A):
return A + 1
"""
julia> py_add_one = py"add_one"
PyObject <function add_one at 0x00000000014BFF70>
julia> py_add_one(nprA)
4×4 Matrix{Int8}:
116 -3 -19 -31
-42 14 -61 -35
-83 -36 92 -108
115 -39 10 74
julia> py_add_one(rA)
ERROR: PyError ...
julia> py_add_one(PyObject(rA))
ERROR: PyError ...
I intend on closing this pull request on January 11th.
Closing this. See https://github.com/mkitti/NumPyArrays.jl if interested.
StridedArray
s can currently be converted to NumPy arrays with PyCall.jl. However, other types ofAbstractArray
also have astride
method. In particular, views such asSubArray
,ReshapedArray
,ReinterpretArray
, andPermutedDimsArray
implementstride
but are not included inStridedArray
.This pull request:
PyCall.NpyArray
to accept anyAbstractArray
PyObject
.To support slicing, a new typePySlice
is created.