-
#2731 introduced a mechanism to optimize some of the kernel launch parameters.
However, there are some functions throughout the code base that rely on empirical measurements to select different alg…
-
### Describe the bug
Hi,
Sorry to bother you. I recently updated FLA on an 8xH100 machine, and it now gives new errors during autotuning with fla.ops.simple_gla.chunk that were not present previo…
-
With OpenCL and our current format interface, very slow formats and/or weak devices may lead to situations where we "can't" run at optimal work size because the total duration of each crypt_all() call…
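A minimal sketch of the trade-off described above, not the project's actual code: it assumes the duration of one `crypt_all()` call grows roughly linearly with the work size, and it halves the size until the estimated duration fits a time budget. The function name `pick_work_size` and the parameters `per_item_seconds`, `optimal_size`, and `max_call_seconds` are hypothetical and used only for illustration.

```python
def pick_work_size(per_item_seconds, optimal_size, max_call_seconds):
    """Halve the work size until the estimated call duration fits the budget.

    Assumption: one call takes about size * per_item_seconds (linear model).
    All names here are hypothetical, not part of any real API.
    """
    size = optimal_size
    # Stop at 1 so a very slow format still gets a usable (if tiny) work size.
    while size > 1 and size * per_item_seconds > max_call_seconds:
        size //= 2
    return size


if __name__ == "__main__":
    # A slow format: the "optimal" size would take about 4 s per call,
    # so the sketch settles on a smaller size that fits a 1 s budget.
    print(pick_work_size(0.001, 4096, 1.0))
```

With a fast format the optimal size already fits and is returned unchanged; with a slow one the result falls below the optimum, which is the situation the paragraph above describes.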
-
### Checkboxes
- [X] I agree to follow our [Code of Conduct](https://github.com/fdm-monster/fdm-monster/blob/develop/CODE_OF_CONDUCT.md).
- [X] I have verified no [other issues](https://github.com/fd…
-
I'm not sure whether to include an external crystal (xtal) or rely on the internal capacitive oscillator. I don't know how much wear the internal oscillator gets. The answer may be in the datasheet; I need to check it.
-
The current instructions at https://wiki.fome.tech/How-to-edit-wiki/#how-to-test-your-changes are incomplete or wrong.
See https://github.com/FOME-Tech/wiki/pull/142#issuecomment-1705470095 for details.
-
Platforms: linux, slow
This test was disabled because it is failing in CI. See [recent examples](https://hud.pytorch.org/flakytest?name=test_precompilations&suite=TestMaxAutotune&limit=100) and the…
-
**Hello! I used auto-gptq to quantize the `llama-2-7b-instruct` model to `llama-2-7b-instruct-4bit-128g`, and I tried to compare the speed between them. But the result is very strange. The storage of the qu…
-
Hi @Celebio,
The latest commit added a `seed` option to args (https://github.com/facebookresearch/fastText/blob/4aca28c60569e5596b21bc054d01fdfe64a1cc37/src/args.h#L54).
But I couldn't find it in…
-
**System information**
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux CentOS 8/Windows 11
- TensorFlow version and how it was installed (source or binary): 2.6.0 binary
- TensorFl…