-
**Describe the bug**
Pythia model outputs don't exactly match those of the Hugging Face Transformers implementation.
**Code example**
```
def check_similarity_with_hf_model(tl_model, hf_model, atol, promp…
-
Hi Xiyuan,
Here is the supplement to the paper list (NIPS 2023 and ICML 2023):
NIPS23-ForecastPFN: Synthetically-Trained Zero-Shot Forecasting
NIPS23-Neural Lad: A Neural Latent Dynamics Fram…
-
### Is there an existing request for this?
- [X] I have searched the existing requests
### Forums discussion
https://forum.freecadweb.org/viewtopic.php?f=19&t=66805&p=577475#p577475
### Subproject…
-
Thank you for your very interesting work!
One thing I really wish you would take into consideration is a runtime-storable version of `lease`, i.e., one with a "non-temporary lifetime".
Currently rust …
-
Hello, everything works fine when I run on a single GPU, but the following problem occurs when I run on multiple GPUs. I hope you can help.
File "trainer.py", line 611, in <module>
main()
File "trainer.py", line 605, in main
trainer.fit(model)
File "/home/work/anaconda3/envs/wxp_torch/lib/…
-
Let's use this as a space to collect issues about other codes:
What other codes do we want to interface with?
What purposes will they serve in this project and what modules do they belong in?
How d…
-
### Feature request
Flash Attention 2 is a library that provides attention operation kernels for faster and more memory-efficient inference and training: https://github.com/Dao-AILab/flash-attentio…
-
Hey Stuart,
I'm just in the process of printing the parts you've designed for the structure of the scanner.
I was wondering if you had any ideas on how you could integrate bracketing to enable c…
-
I have a project that requires very high accuracy when measuring an object. The object measures about 1000 mm, and the desired measurement error is below 0.1 % (under 1 mm) at 1000 mm.
For that, I'm testing first the…
-
### 🐛 Describe the bug
I understand that this error comes from the Flash Attention software stack, but there seems to be no related issue except for https://github.com/Dao-AILab/flash-attention/issues/…