JuliaPy / PyCall.jl

Package to call Python functions from the Julia language
MIT License
1.47k stars 190 forks source link

Convert to sparse #204

Open edljk opened 9 years ago

edljk commented 9 years ago

Is there a way to convert a PyObject containing a sparse matrix to a Julia sparse Matrix type ?

@pyimport scipy.sparse as sp
Id = sp.eye(4)
Idj = convert(SparseMatrixCSC{Float64,Int64},Id)

ERROR: MethodError: `convert` has no method matching convert(::Type{SparseMatrixCSC{Float64,Int64}}, ::PyCall.PyObject)
This may have arisen from a call to the constructor SparseMatrixCSC{Float64,Int64}(...),
since type constructors fall back to convert methods.
Closest candidates are:
  call{T}(::Type{T}, ::Any)
  convert{Tv,Ti,TvS,TiS}(::Type{SparseMatrixCSC{Tv,Ti}}, ::SparseMatrixCSC{TvS,TiS})
  convert{Tv}(::Type{SparseMatrixCSC{Tv,Int64}}, ::Base.SparseMatrix.CHOLMOD.Sparse{Tv})
stevengj commented 9 years ago

No, right now PyCall does not know about the SciPy sparse-matrix formats.

stevengj commented 9 years ago

(You could write the conversion routine yourself on top of other PyCall functions, of course.)

edljk commented 9 years ago

Thank you Steven. I am new in Julia. Finally I used the (I guess very inefficient) steps

@pyimport scipy.sparse as sp
Apy = sp.rand(30,30,0.3)
IA, JA, SA = sp.find(Apy)
IA, JA = convert(Array{Int64},IA+1), convert(Array{Int64},JA+1)
Aju = sparse(IA,JA,SA)
stevengj commented 9 years ago

@edljk, glad you got it working. The sparse function works even if you omit the convert statement and just do sparse(IA+1,JA+1,SA), but it gives a sparse matrix with 32-bit indices. You can get 64-bit indices (on a 64-bit machine) without making an extra copy by:

const scipy_sparse_find = pyimport("scipy.sparse")["find"]
function mysparse(Apy::PyObject)
    IA, JA, SA = scipy_sparse_find(Apy)
    return sparse(Int[i+1 for i in IA], Int[i+1 for i in JA], SA)
end

(This is in Julia 0.4. To get code that works with Julia 0.3 too, you need to do pycall(scipy_sparse_find, PyAny, Apy).)

stevengj commented 9 years ago

To get much more efficient than this, I think one would have to write specialized code depending on the SciPy sparse-matrix format. e.g. for CSC or CSR storage you can probably do better.

nsaphra commented 8 years ago

I had the same exact need, and I ended up with this gist.

@pyimport scipy.sparse as pysparse

jlmat2pymat(S::SparseMatrixCSC) =
    pysparse.csc_matrix((S.nzval, S.rowval .- 1, S.colptr .- 1), shape=size(S))

pymat2jlmat(S::PyObject) =
    SparseMatrixCSC(S[:m], S[:n], S[:indptr] .+ 1, S[:indices] .+ 1, S[:data])

I'd like to just add conversion for sparse matrices as a feature for PyCall.jl, but I'm struggling to wrap my head around where it would fit into the existing structure. Could I just add these wrappers to numpy.jl and add it as a conversion step to SparseMatrixCSC?

stevengj commented 8 years ago

@nsaphra, thanks for looking into this. You would add something like:

PyObject(S::SparseMatrixCSC) =
    pyimport("scipy.sparse")["csc_matrix"]((S.nzval, S.rowval .- 1, S.colptr .- 1), shape=size(S))

convert(::Type{SparseMatrixCSC}, o::PyObject) =
    SparseMatrixCSC(o[:m], o[:n], o[:indptr] .+ 1, p[:indices] .+ 1, o[:data])

though it might be nice to add more checks.

Datseris commented 7 years ago

I just encountered this and I also needed the functions. Is this still open? If the code to be used is just what was given in @stevengj last comment, could this be just added and it is fine? Or there are more things necessary that I am missing? I am more than glad to do the PR, since it will take only 5 minutes (if this is indeed resolved).

EDIT: The functions provided by @nsaphra 's post do not work for me. My object is:

PyObject <1909x1909 sparse matrix of type '<class 'numpy.complex128'>'
    with 5616 stored elements in COOrdinate format>

and it doesn't have keys n or m...

Second edit: The original idea @edljk works, i.e.:

const scipy_sparse_find = pyimport("scipy.sparse")["find"]
function mysparse(Apy::PyObject)
    IA, JA, SA = scipy_sparse_find(Apy)
    return sparse(Int[i+1 for i in IA], Int[i+1 for i in JA], SA)
end

(as modified by @stevengj )

stevengj commented 7 years ago

A PR seems reasonable, although it should be structured so as not to create a dependency on scipy unless these functions are actually used. (i.e. the module should be imported lazily).

Datseris commented 7 years ago

Could you hint me how to do this? I've never do this "lazily import a module" before.

What I know is that I can use Requires with @require Juno do_stuff,,, but of course this couldn't work for python dependencies.

stevengj commented 7 years ago

Lazily importing a Python module:

const scipysparse_ = PyNULL() # will be initialized to scipy.sparse module when needed
scipysparse() = scipysparse_.o == C_NULL ? copy!(scipysparse_, pyimport_conda("scipy.sparse", "scipy")) : scipysparse_

and then use scipysparse() whenever you want a reference to the scipy.sparse module.

matteoacrossi commented 6 years ago

Are there any news on this?