-
With Epod, we now have MLP emulators for the growth function that are quite fast.
Do we want to include them in the official pmwd code, and if so, how? Ideally one would like to keep a way of having both, M…
-
The PCC on kv_cache and output is below 0.99. To resolve this, the goal is to move the matmuls in attention, MLP, and lm_head to HiFi2.
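
For reference, a minimal sketch of how a PCC (Pearson correlation coefficient) check against the 0.99 threshold could be computed. The function name `pcc` and the handling of constant tensors are assumptions made for illustration, not the project's own comparison helper.

```python
import numpy as np

def pcc(expected, actual):
    """Pearson correlation coefficient between two tensors, compared as flat vectors."""
    a = np.asarray(expected, dtype=np.float64).ravel()
    b = np.asarray(actual, dtype=np.float64).ravel()
    # Center both vectors so the result is a correlation, not a raw dot product.
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0.0:
        # At least one tensor has zero variance, so the correlation is undefined.
        # Assumption: report 1.0 only when the two tensors are identical.
        return 1.0 if np.array_equal(expected, actual) else 0.0
    return float((a * b).sum() / denom)

# Hypothetical usage: compare a reference output with a lower-precision one.
reference = np.linspace(-1.0, 1.0, 64).reshape(8, 8)
candidate = reference.astype(np.float16)
passes = pcc(reference, candidate) >= 0.99
```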
-
## Description
The KAN (Kolmogorov-Arnold Network) model from the pykan library currently only supports two-dimensional input tensors (batch_size x hid_dim). A `RuntimeError` is raised when att…
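
A common workaround for a layer that accepts only (batch_size x hid_dim) input is to flatten the leading dimensions, apply the layer, and restore the shape afterwards. The sketch below uses NumPy and a hypothetical stand-in `toy_layer`; it does not call pykan, and `apply_2d_only` is an illustrative name rather than part of any library.

```python
import numpy as np

def apply_2d_only(fn, x):
    """Apply fn, which accepts only (batch, hid_dim) arrays, to x of shape (..., hid_dim)."""
    lead_shape = x.shape[:-1]
    flat = x.reshape(-1, x.shape[-1])      # collapse all leading dims into one batch dim
    out = fn(flat)                         # fn sees a plain 2D array
    return out.reshape(*lead_shape, out.shape[-1])  # restore the leading dims

def toy_layer(x2d):
    # Hypothetical stand-in for a model that rejects anything but 2D input.
    if x2d.ndim != 2:
        raise RuntimeError("expected input of shape (batch_size, hid_dim)")
    weights = np.ones((x2d.shape[1], 3))
    return x2d @ weights

x = np.ones((2, 5, 4))                     # (batch, seq_len, hid_dim)
y = apply_2d_only(toy_layer, x)            # shape (2, 5, 3)
```

The same pattern carries over to PyTorch tensors, where `reshape` has the same semantics.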
-
I am using `flax.linen.remat` on a module that has a `train` flag (used to check if the model is training). I'm using `static_argnums` on that flag, but am still getting a `ConcretizationTypeError` on…
-
### Did you check docs and existing issues?
- [x] I have read all the docs
- [x] I have searched the existing issues
### Version Information
```
>>> python -V
Python 3.7.12
```
```bash
mmpo…
-
Thank you for your work. Have you ever compared this with other ways of classifying the face ID, such as feature distance, rather than training an MLP classifier?
As we know, an MLP classifier is difficult to expand when th…
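
To make the feature-distance alternative concrete, here is a minimal sketch of matching a query embedding against a gallery by cosine similarity. The names `identify` and `normalize`, the 0.5 threshold, and the toy 2-D embeddings are assumptions for illustration only; real face embeddings would come from the feature extractor.

```python
import numpy as np

def normalize(v):
    """Scale vectors to unit length along the last axis."""
    v = np.asarray(v, dtype=np.float64)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def identify(query, gallery, labels, threshold=0.5):
    """Return the label of the most similar gallery embedding, or None if below threshold."""
    sims = normalize(gallery) @ normalize(query)   # cosine similarity to each identity
    best = int(np.argmax(sims))
    return labels[best] if sims[best] >= threshold else None

# Toy gallery with one embedding per identity.
gallery = [[1.0, 0.0], [0.0, 1.0]]
labels = ["alice", "bob"]
first = identify([0.9, 0.1], gallery, labels)

# Enrolling a new identity only appends a row; nothing is retrained.
gallery.append([-1.0, 0.0])
labels.append("carol")
second = identify([-1.0, 0.05], gallery, labels)
```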
-
https://github.com/karpathy/minGPT/blob/37baab71b9abea1b76ab957409a1cc2fbfba8a26/mingpt/model.py#L42
Why do we need an additional linear transformation after the MHA and before the MLP when the dim…
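
One way to look at this, assuming the question concerns the output projection applied after the heads are concatenated: without it, each head's result stays in its own slice of channels, and the projection is what lets every output channel combine information from all heads. The NumPy sketch below illustrates that with random stand-in values; the shapes and the names `heads` and `w_o` are hypothetical and not taken from minGPT.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, n_head, head_dim = 4, 2, 3
n_embd = n_head * head_dim

# Stand-ins for per-head attention outputs, each of shape (seq_len, head_dim).
heads = [rng.normal(size=(seq_len, head_dim)) for _ in range(n_head)]
w_o = rng.normal(size=(n_embd, n_embd))    # output projection weights (n_embd -> n_embd)

concat = np.concatenate(heads, axis=-1)    # head i occupies channels [i*head_dim, (i+1)*head_dim)
projected = concat @ w_o                   # every output channel mixes all heads

# Perturb only the second head and see which channels change.
perturbed = [heads[0], heads[1] + 1.0]
concat_delta = np.concatenate(perturbed, axis=-1) - concat
projected_delta = np.concatenate(perturbed, axis=-1) @ w_o - projected
```

Here `concat_delta` is nonzero only in the second head's channels, while `projected_delta` is nonzero in every channel, even though the input and output dimensions are the same.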
-
Hi,
Are you using a projection network with the following dimensions: resnet (output) -> 512 -> 2048 -> 2048?
If that is the case, I am curious to know why you decided to do it like this a…
-
**Description:** During conversations for R3/MLPs, it was determined that it would be helpful to have documentation for the reasoning behind each proposed component. This will be used during conversat…
-
**System information**
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow.js): true
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Arch Linux…