Benny-Nottonson / Mojo-Marathons

Apache License 2.0

Bug with fma operator #2

Open andresnowak opened 2 months ago

andresnowak commented 2 months ago

There seems to be a bug when using the fma operator in the basic_matmul function. When running it on Linux (Pop!_OS at least, Mojo 24.4) I get this error:

assertionError: 24.4375 is not close to 24.453125 with a diff of 0.015625
mojo: error: execution exited with a non-zero result: 1

Changing it to the plain res + b * a makes it work, and on macOS the fma operator passes the tests.

P.S.: The error seems to be related to float16. With float32 and float64 the fma operator works (int16 also works correctly; bfloat16 doesn't), but with float16 I get the error. Maybe an alignment error happens? (I don't know if you want to remove the fma operator for now, at least for the basic_matmul test, and use float32.)
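For context on the numbers in the error message, here is a minimal Python sketch (not Mojo, and not the repository's test code) that emulates float16 rounding with the standard struct module. It shows two things: the reported diff of 0.015625 is exactly the spacing between adjacent float16 values near 24, and a fused multiply-add (one rounding) can legitimately give a different float16 result than a separate multiply and add (two roundings). The helper names to_f16, ulp_f16, mul_add_unfused and mul_add_fused are hypothetical, and the sketch assumes inputs small enough that the intermediate float64 arithmetic is exact. It does not show which platform is right or how Mojo lowers fma on either one.

```python
import math
import struct


def to_f16(x: float) -> float:
    """Round a Python float (binary64) to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def ulp_f16(x: float) -> float:
    """Spacing between adjacent binary16 values near x (normal range only)."""
    _, exponent = math.frexp(abs(x))  # abs(x) = m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 11)     # binary16 has an 11-bit significand


def mul_add_unfused(a: float, b: float, c: float) -> float:
    """Emulate res + b * a in float16: round after the multiply, then after the add."""
    return to_f16(to_f16(a * b) + c)


def mul_add_fused(a: float, b: float, c: float) -> float:
    """Emulate fma in float16: compute a * b + c exactly, then round once.

    Assumption: for these small inputs the float64 product and sum are exact,
    so a single final rounding matches what a true fused operation would do.
    """
    return to_f16(a * b + c)


if __name__ == "__main__":
    # The two values from the assertion are neighbouring float16 numbers.
    print(ulp_f16(24.4375), 24.453125 - 24.4375)

    # Inputs chosen so that rounding the product discards a bit the add needs.
    a, b, c = 1.0009765625, 1.001953125, -1.0  # all exactly representable in float16
    print(mul_add_unfused(a, b, c), mul_add_fused(a, b, c))
```

Since the two reported values are one float16 step apart, a test that compares float16 results with a tolerance tighter than one step in that range would fail whenever the two code paths round differently.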

Benny-Nottonson commented 2 months ago

I would assume this is a numeric stability issue. I was actually thinking about switching the tests to Int16, since I want to encourage less stable algorithms like Strassen. Any thoughts?
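To illustrate why integer tests would accommodate Strassen, here is a small Python sketch of the textbook 2x2 Strassen step next to the naive product. With integer inputs both produce identical results, because integer addition and multiplication are exact; with floats the extra additions and subtractions round differently, which is where the stability concern comes from. The function names are hypothetical, and Python integers do not overflow, so this assumes values small enough that a real Int16 would not wrap.

```python
def matmul_naive_2x2(a, b):
    """Standard 2x2 product using 8 multiplications."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def matmul_strassen_2x2(a, b):
    """One Strassen step: 7 multiplications, at the cost of more additions."""
    m1 = (a[0][0] + a[1][1]) * (b[0][0] + b[1][1])
    m2 = (a[1][0] + a[1][1]) * b[0][0]
    m3 = a[0][0] * (b[0][1] - b[1][1])
    m4 = a[1][1] * (b[1][0] - b[0][0])
    m5 = (a[0][0] + a[0][1]) * b[1][1]
    m6 = (a[1][0] - a[0][0]) * (b[0][0] + b[0][1])
    m7 = (a[0][1] - a[1][1]) * (b[1][0] + b[1][1])
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(matmul_naive_2x2(a, b))
    print(matmul_strassen_2x2(a, b))
```

An exact-equality check is therefore enough for integer inputs, whereas float tests need a tolerance that accounts for the different rounding paths.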

andresnowak commented 2 months ago

Hmm, I really don't know. If the idea of the competition is AI workloads, then one would want a stable algorithm; if the idea is more general, then maybe. The thing is, an algorithm that isn't stable isn't really good for what these things would be used for, so I think a stable algorithm is better.

As for the fma error, I think it is something else, because the error is on the order of 1e-2 and it only happens on Linux (at least on my machine; if it happens on Linux in general, I would consider it a very big error) and only with float16 and bfloat16. I just don't know how to report the error (because I think the error is in the implementation), or maybe I can report it by just linking to this repo.