fjarri / reikna

Pure Python GPGPU library
http://reikna.publicfields.net/
MIT License

Scan doesn't seem to work #31

Closed by robertmaxton42 6 years ago

robertmaxton42 commented 6 years ago

I can't seem to get Scan to work out of the box:

import numpy as np
import reikna.cluda as cluda
from reikna.cluda import Snippet
from reikna.algorithms import Scan, Predicate, predicate_sum

api = cluda.cuda_api()
thrt = api.Thread.create()

test = np.random.randint(0,256, size=(256, 100), dtype=np.ubyte)
devtest = thrt.to_device(test)
out = thrt.array((256, 100), dtype=np.ubyte)
axsum = np.cumsum(test, 0)
scan = Scan(devtest, predicate_sum(np.ubyte), axes=0, exclusive=False)
scankern = scan.compile(thrt)
scankern(out, devtest)

just gets me the contents of devtest, unchanged, in out.

fjarri commented 6 years ago

Are you sure that the result is exactly the same as devtest, and doesn't just consist of byte-sized numbers? Because that's actually the expected behaviour.

Note that np.cumsum automatically promotes the result to uint64 to accommodate the sums. Scan keeps the results in the same dtype as the input, that is, ubyte, so the sums wrap around modulo 256. If you do the same with numpy:

axsum = np.cumsum(test, 0).astype(np.ubyte)

The contents of out should be exactly the same as axsum, or, at least, they are on my machine. Please tell me if it is not the case for you.
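
The dtype difference can be checked on the CPU with NumPy alone. The following is a minimal sketch, not part of the original report: it assumes a seeded random generator in place of np.random.randint, and the names promoted, wrapped and same_dtype are illustrative only. It shows that casting the promoted cumulative sum back to ubyte gives the same values as accumulating in ubyte from the start, which is the reference to compare a ubyte Scan output against.

```python
import numpy as np

# Same shape and dtype as in the report; the seed is only for reproducibility.
rng = np.random.default_rng(0)
test = rng.integers(0, 256, size=(256, 100), dtype=np.ubyte)

# Default behaviour: NumPy accumulates in a wider unsigned integer type.
promoted = np.cumsum(test, axis=0)

# Casting back to ubyte keeps only the low 8 bits, i.e. the sums modulo 256.
wrapped = promoted.astype(np.ubyte)

# Accumulating directly in ubyte wraps around at every step instead.
same_dtype = np.cumsum(test, axis=0, dtype=np.ubyte)

print(promoted.dtype, wrapped.dtype)
print(np.array_equal(wrapped, same_dtype))
print(np.array_equal(wrapped, promoted % 256))
```

Either wrapped or same_dtype can serve as the expected value when testing a ubyte scan; the promoted array cannot, because most of its entries exceed 255.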

robertmaxton42 commented 6 years ago

.... How did I miss that? I even left a note to myself to remember not to allow windows above a certain size precisely to avoid this in release, but didn't think of it when I was writing test code...

Thanks for catching that. Sorry to bother you.

fjarri commented 6 years ago

No worries, it is always possible I made some stupid mistake and missed it with tests :)