Closed jaycedowell closed 2 years ago
Yeah, 11 of 17 tests are failing. For what it's worth, the ones that fail are the polmajor=False ones.
I think I have seen these once before, probably on qblocks. I may have a log file somewhere with some details including architectures. But it didn't happen on most of their machines.
Update: I don't have the log file for when romein failed… just fond memories. 😏 Now I wonder if maybe it was on google colab.
After some digging, it looks like the polmajor=False failures are related to this change to bifrost.ndarray that I made. I'm not sure why this is a problem.
More digging shows that the problem is with bifrost.ndarray.copy(). Specifically, it does not check if the array is C contiguous before it copies. It only assumes that it is.
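
For reference, here is a minimal sketch of what a C-contiguity check on shape and strides could look like. It is plain Python and is not Bifrost's code; the helper names c_contiguous_strides and is_c_contiguous are hypothetical and exist only for this illustration.

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides a C-ordered (row-major) array of this shape would have."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_c_contiguous(shape, strides, itemsize):
    """True if the strides match a C-ordered layout.

    Axes of length 1 are skipped because their stride never affects addressing.
    """
    expected = c_contiguous_strides(shape, itemsize)
    return all(dim == 1 or s == e
               for dim, s, e in zip(shape, strides, expected))


# A C-ordered (2, 3) array of 4-byte values has strides (12, 4).
c_ok = is_c_contiguous((2, 3), (12, 4), 4)

# The transposed view of a C-ordered (3, 2) array has shape (2, 3) and
# strides (4, 8), so strides[0] < strides[1] and the check fails.
transposed_ok = is_c_contiguous((2, 3), (4, 8), 4)
```

A copy routine that runs a check like this first can fall back to an element-by-element gather instead of a flat memory copy when the check fails.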
Maybe it's not so clear cut. Adding strides=... to bifrost.ndarray does seem to be the right thing to do, but I'm having trouble reproducing the full polmajor=False error with a simple numpy/Bifrost comparison.
No, it is clear what is going on and there are two problems:
1. the assumption that strides[i] > strides[i+1] does not always hold, and
2. bifrost.ndarray.copy() doesn't work the same as numpy.ndarray.copy(). numpy.ndarray.copy() will make things C-ordered if they are not already, and bifrost.ndarray.copy() keeps whatever memory layout was there.
(1) wouldn't be so much of an issue if (2) wasn't also happening.
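
To make the difference between the two copy behaviours concrete, here is a small pure-Python sketch. It models an array as a flat list plus shape and strides (in elements, not bytes). It does not call numpy or Bifrost, and copy_c_ordered is a hypothetical helper written only for this illustration.

```python
from itertools import product


def copy_c_ordered(buffer, shape, strides):
    """Gather elements through `strides` and return them in C order.

    `buffer` is a flat list and `strides` are in elements, not bytes.
    """
    out = []
    for index in product(*(range(n) for n in shape)):
        offset = sum(i * s for i, s in zip(index, strides))
        out.append(buffer[offset])
    return out


# Flat storage of the C-ordered 3x2 array [[0, 1], [2, 3], [4, 5]].
storage = [0, 1, 2, 3, 4, 5]

# Its transpose is a 2x3 view with strides (1, 2), so strides[0] < strides[1].
# A layout-preserving copy just duplicates the buffer as it is stored.
layout_preserving = list(storage)

# A C-ordering copy walks the view index by index and rewrites the data
# so that the result is row-major for the 2x3 shape.
c_ordered = copy_c_ordered(storage, (2, 3), (1, 2))
```

Here layout_preserving is still [0, 1, 2, 3, 4, 5], while c_ordered is [0, 2, 4, 1, 3, 5], the row-major flattening of [[0, 2, 4], [1, 3, 5]]. Code that reads the layout-preserving copy as if it were C-ordered sees the elements in the wrong order, which is how (1) and (2) combine into a failure.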
Closing with the merge of #174.
On the test machine I'm getting a variety of failures on the test_romein suite. I haven't run into these before, so I'm wondering if it has something to do with the GPU/version of CUDA that we are using on the test machine (RTX A4000; arch. 86; CUDA 11.2)?