-
Hi. I’m a high school Go and AI enthusiast, and I am working on a KataGo-related project in the context of a three-year high school science research class. I would greatly appreciate assistance with…
-
**Motivation**
Fused multiply–add (FMA) is a floating-point operation that computes a·b + c in one step, with a single rounding. FMA can speed up and improve the accuracy of many computations: dot product, mat…
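A small illustration of the single-rounding property. Real FMA is a hardware instruction (exposed as C's `fma()`, or `math.fma` in Python 3.13+); the sketch below emulates a correctly rounded FMA with exact rational arithmetic, using hypothetical example values chosen so the effect is visible:

```python
from fractions import Fraction

def fma_emulated(a, b, c):
    # Emulate a correctly rounded fused multiply-add: compute a*b + c
    # exactly in rational arithmetic, then round once to the nearest double.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

x = 1.0 + 2.0**-30              # exactly representable in a double
p = x * x                       # one rounding: the 2**-60 term is lost
naive = x * x - p               # two separate roundings: cancels to 0.0
fused = fma_emulated(x, x, -p)  # recovers the exact rounding error of x*x

print(naive)  # 0.0
print(fused)  # 2**-60
```

This is the classic "two-product" trick: `fma(x, x, -p)` yields the exact rounding error of the product, the building block of compensated dot products and double-double arithmetic.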
-
I see you are getting markedly slow results with Caffe-Greentea. Which backends are you using, and do you know if they are the best available?
In Fabian Tschopp's (@naibaf7) tech report http://ar…
-
https://arxiv.org/abs/1801.01671
-
Hi, I'm from sd.cpp. I am very interested in the Winograd convolution algorithm you mentioned, and I'd like to know how it is progressing. I wonder why it's no longer on the sd.cpp to-do list.
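For reference, the core of the algorithm is small: Winograd's F(2,3) produces two outputs of a 3-tap filter with 4 multiplications instead of the direct method's 6. A minimal 1-D sketch (not sd.cpp code, just the textbook transform):

```python
def winograd_f23(d, g):
    """Winograd F(2,3): two outputs of a 3-tap FIR from a 4-sample tile,
    using 4 multiplications instead of the direct method's 6."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (can be precomputed once per filter).
    G0 = g0
    G1 = (g0 + g1 + g2) / 2
    G2 = (g0 - g1 + g2) / 2
    G3 = g2
    # Input transform and the 4 multiplications.
    m1 = (d0 - d2) * G0
    m2 = (d1 + d2) * G1
    m3 = (d2 - d1) * G2
    m4 = (d1 - d3) * G3
    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]

# Matches direct correlation: y[i] = sum_k d[i+k] * g[k]
d, g = [1.0, 2.0, 3.0, 4.0], [0.5, 1.0, -1.0]
direct = [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]
print(winograd_f23(d, g), direct)  # [-0.5, 0.0] [-0.5, 0.0]
```

The 2-D F(2x2, 3x3) variant used for conv layers nests this transform over tiles; the multiplication savings is what makes it attractive for 3x3 convolutions, at the cost of extra transforms and some numerical headroom.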
-
The atomic convolutions currently take in an entire protein-ligand complex. This makes working with them very gnarly since proteins get large (tens of thousands of atoms easily). Since the coordinates…
-
## 🐛 Bug
Quantized convolutions with dilation and groups are 1–2 orders of magnitude slower than their float32 counterparts.
With the following script, the float32 conv2d runs in ~25ms while the quantize…
-
#### Description:
The FFTW library can be used seamlessly as a backend for Eigen's FFT implementation, which makes everything simple to use with Stan. Using FFTW has proven to yield huge speedups in our m…
-
This generalization will require the following changes:
- [x] add support for new scale(s): fragmentation function scale. Requires changing the order structure, which is related to . This is being …
-
It would be nice to have FFT-based convolution supported in mlx. FFT-based convolution shows much better performance for large images/arrays and kernels. The FFT building blocks are already support…
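To illustrate what FFT-based convolution buys: by the convolution theorem, linear convolution becomes an elementwise product in the frequency domain, turning the direct O(N·K) cost into O(N log N). A minimal pure-Python sketch of the idea (mlx's own FFT primitives would play this role on-device; nothing here is mlx API):

```python
import cmath

def fft(a):
    """Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(a):
    # Inverse transform via conjugation: ifft(a) = conj(fft(conj(a))) / n
    n = len(a)
    return [x.conjugate() / n for x in fft([x.conjugate() for x in a])]

def fft_convolve(x, h):
    """Linear convolution of x and h via zero-padded FFTs."""
    m = len(x) + len(h) - 1
    n = 1
    while n < m:          # pad to the next power of two >= output length
        n *= 2
    X = fft(list(x) + [0.0] * (n - len(x)))
    H = fft(list(h) + [0.0] * (n - len(h)))
    y = ifft([a * b for a, b in zip(X, H)])
    return [v.real for v in y[:m]]

print(fft_convolve([1, 2, 3, 4], [1, 0, -1]))  # ~ [1, 2, 2, 2, -3, -4]
```

The crossover versus direct (or Winograd) convolution comes when the kernel is large relative to the transform overhead, which is exactly the "large images/arrays and kernels" regime described above.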