Open jakubMitura14 opened 2 years ago

Hello, I have a CuPy CUDA array and I want to pass it into Julia as is. A CUDA array is essentially just a pointer to device memory, so it should be possible. From the CUDA.jl side I know it is possible, as I have a comment: "for passing data the other way around you can use unsafe_wrap(CuArray, ...) to create a CUDA.jl array from a device pointer you get from Python". Still, I cannot make it work. Does anybody have a working example?

What I was trying
I'm no GPU expert, but you should be able to use the CUDA array interface (https://numba.pydata.org/numba-doc/dev/cuda/cuda_array_interface.html) to get a pointer to the data.
PythonCall does something similar (https://github.com/cjdoris/PythonCall.jl/blob/main/src/pywrap/PyArray.jl) to wrap Python objects that implement the NumPy array interface.
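A minimal sketch of that approach, assuming a contiguous Float32 tensor: read the integer device pointer out of `__cuda_array_interface__` on the Python side, then wrap it with `unsafe_wrap` on the Julia side. The `wrap_devptr` helper is made up for illustration and is not part of CUDA.jl or PythonCall; treat the whole thing as untested.

```python
import torch
from juliacall import Main as jl

jl.seval("using CUDA")

t = torch.ones((4, 4), dtype=torch.float32, device="cuda")
iface = t.__cuda_array_interface__  # dict with 'data', 'shape', 'typestr', ...
ptr, _read_only = iface["data"]     # integer device pointer

# wrap_devptr is a hypothetical helper, defined here only for illustration
jl.seval("""
function wrap_devptr(ptr::Integer, dims...)
    p = CuPtr{Float32}(UInt(ptr))
    # own = false: Julia must not free memory that PyTorch owns
    unsafe_wrap(CuArray, p, dims; own = false)
end
""")

# dims are reversed because PyTorch is row-major and CUDA.jl is column-major
ja = jl.wrap_devptr(ptr, *reversed(iface["shape"]))
```

Note that the tensor `t` has to stay alive on the Python side for as long as the wrapped array is in use, since Julia does not own the memory.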
Thanks, I made progress with your hints: I changed it into a Numba CUDA array, but it is still very slow even though the function does nothing. Just passing the argument takes 3633 milliseconds. (I am working on CUDA-accelerated segmentation metrics, and on data of the same size the whole calculation takes around 50 ms.)
```python
import numba
import numpy as np
import torch
from statistics import median
import timeit
from juliacall import Main as jl
from numba import cuda

jl.seval("using Pkg")
jl.seval("""Pkg.add("CUDA")""")
jl.seval("""Pkg.add("PythonCall")""")
jl.seval("""using CUDA""")
jl.seval("""using PythonCall""")
jl.seval("""CUDA.allowscalar(true)""")
jl.seval("""print(sum(CUDA.ones(3,3,3)))""")  # works

# no-op Julia function, used to measure pure call overhead
jl.seval("""function bb(arrGold)
end""")


def print_hi(name):
    t1 = torch.tensor(np.ones((512, 512, 800))).to(torch.device("cuda"))
    # zero-copy view of the torch tensor as a Numba device array
    numbaArray = cuda.as_cuda_array(t1)
    jl.bb(numbaArray)

    def forBenchPymia():
        numba.cuda.synchronize()
        jl.bb(numbaArray)
        numba.cuda.synchronize()

    num_runs = 1
    num_repetitions = 1  # 2
    ex_time = timeit.Timer(forBenchPymia).repeat(
        repeat=num_repetitions,
        number=num_runs)
    res = median(ex_time) * 1000
    print("bench")
    print(res)


# t = torch.cuda.ByteTensor([2, 22, 222])
# c = cupy.asarray(t)
# c_bits = cupy.unpackbits(c)
# t_bits = torch.as_tensor(c_bits, device="cuda")
# print(t_bits.view(-1, 8))

# Press the green button in the gutter to run the script.
if __name__ == '__main__':
    print_hi('PyCharm')
```
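One way to narrow down where those 3633 ms go is to time the no-op call with arguments of different types: if a plain scalar is fast but the device array is slow, the cost is in juliacall's conversion of the argument rather than in anything GPU-related. An untested sketch along those lines:

```python
import timeit
import numpy as np
import torch
from numba import cuda
from juliacall import Main as jl

jl.seval("function bb(x) end")  # no-op, measures pure call overhead

t1 = torch.tensor(np.ones((512, 512, 800))).to(torch.device("cuda"))
candidates = {
    "plain int": 1,
    "torch tensor": t1,
    "numba device array": cuda.as_cuda_array(t1),
}
for name, obj in candidates.items():
    dt = timeit.timeit(lambda: jl.bb(obj), number=10) / 10
    print(f"{name}: {dt * 1000:.1f} ms per call")
```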
https://github.com/pabloferz/DLPack.jl might be of interest!
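For anyone landing here later, a minimal sketch of what DLPack.jl from juliacall might look like, based on the `DLPack.wrap(obj, to_dlpack)` entry point shown in the package README; the exact API may differ between versions, so verify against the README. The `sum_shared` helper is defined here purely for illustration:

```python
import torch
from torch.utils.dlpack import to_dlpack
from juliacall import Main as jl

jl.seval("""import Pkg; Pkg.add("DLPack")""")
jl.seval("using DLPack, PythonCall")

# sum_shared is a hypothetical helper, not part of DLPack.jl
jl.seval("""
function sum_shared(t, to_dlpack)
    x = DLPack.wrap(t, to_dlpack)  # zero-copy CuArray view of the tensor
    return sum(x)
end
""")

t = torch.ones((512, 512, 800), device="cuda")
print(jl.sum_shared(t, to_dlpack))
```

Since DLPack shares the underlying buffer, the wrap should be zero-copy, and mutations on the Julia side would be visible to the PyTorch tensor.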
Thanks!!
This issue has been marked as stale because it has been open for 30 days with no activity. If the issue is still relevant then please leave a comment, or else it will be closed in 7 days.
This issue has been closed because it has been stale for 7 days. If it is still relevant, please re-open it.