ledatelescope / bifrost

A stream processing framework for high-throughput applications.

Intermittent `bifrost.linalg` test failures #187

Open jaycedowell opened 1 year ago

jaycedowell commented 1 year ago

Occasionally we see test failures in the self-hosted `bifrost.linalg` suite. Now that I'm looking for one to point to, I cannot find one.

jaycedowell commented 1 year ago

Here's one: https://github.com/ledatelescope/bifrost/pull/167#issuecomment-1152494636

jaycedowell commented 1 year ago

I wonder if this is somehow related to #210. The only places where `BF_STATUS_UNSUPPORTED_SHAPE` can be thrown from a LinAlg call are in `linalg_kernels.cu`:

These are all fairly trivial, though: mostly value checks on the matrix shape. A couple of comparisons of the batch size against the texture memory size can also throw this. It would be nice to know exactly which `BF_STATUS_UNSUPPORTED_SHAPE` check we are hitting.
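One way to find out which check fires would be to have each shape check log its own predicate and source line before returning the status. Below is a minimal host-only sketch of that idea, not the actual Bifrost code: `Status`, `CHECK_OR_RETURN`, `g_last_failure`, `validate_shape`, and the specific shape conditions are all hypothetical names and values made up for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-ins for the real status codes.
enum Status { STATUS_SUCCESS = 0, STATUS_UNSUPPORTED_SHAPE = 1 };

// Remembers the most recent failed check (hypothetical helper).
struct FailureSite {
    std::string expr;
    int line = 0;
};
static FailureSite g_last_failure;

// If `pred` is false, log the predicate text and source location,
// then return `code` from the enclosing function.
#define CHECK_OR_RETURN(pred, code)                               \
    do {                                                          \
        if (!(pred)) {                                            \
            g_last_failure.expr = #pred;                          \
            g_last_failure.line = __LINE__;                       \
            std::fprintf(stderr, "%s:%d: check failed: %s\n",     \
                         __FILE__, __LINE__, #pred);              \
            return (code);                                        \
        }                                                         \
    } while (0)

// Illustrative shape validation; the conditions and limits are invented.
Status validate_shape(long nrow, long ncol, long nbatch, long max_batch) {
    CHECK_OR_RETURN(nrow > 0 && ncol > 0, STATUS_UNSUPPORTED_SHAPE);
    CHECK_OR_RETURN(ncol % 2 == 0, STATUS_UNSUPPORTED_SHAPE);
    CHECK_OR_RETURN(nbatch <= max_batch, STATUS_UNSUPPORTED_SHAPE);
    return STATUS_SUCCESS;
}
```

With something like this in place, a call such as `validate_shape(4, 4, 32, 16)` would still return the same status code, but the stderr line would show that the `nbatch <= max_batch` comparison was the one that failed, which is the information missing from the intermittent failures.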