-
Hi team,
We are currently adapting our training environment to use the fused attention functions. In one of our training setups, we work with batch size one and concatenate multiple documents along …
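Packing several documents into a single batch-one sequence usually requires tracking per-document boundaries so the fused attention kernel does not attend across documents. As a minimal sketch (the function name and the varlen-style offset convention are illustrative assumptions, not our actual training code), the cumulative-offset bookkeeping looks like:

```python
# Sketch: build cumulative sequence-length offsets for documents packed
# along the sequence dimension with batch size one. Varlen-style fused
# attention entry points typically consume offsets of this shape; the
# exact kernel API may differ (assumption, not the actual setup).

def build_cu_seqlens(doc_lengths):
    """Return [0, l0, l0+l1, ...] marking document boundaries
    inside the packed sequence."""
    offsets = [0]
    for length in doc_lengths:
        offsets.append(offsets[-1] + length)
    return offsets

# Three documents of lengths 5, 3, and 8 packed into one sequence of 16.
offsets = build_cu_seqlens([5, 3, 8])
print(offsets)  # [0, 5, 8, 16]
```

Each adjacent pair of offsets then delimits one document's tokens inside the packed sequence.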
-
It is very early, I agree, but this is just to bring the issue to Valve's attention. With the recent update that dropped, an [SDK](https://github.com/ValveSoftware/source-sdk-2013) update might be in order fo…
-
### System Info
```
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|--------------------…
```
-
### Description
With the `0.4.35` release, the flash attention kernel hits an out-of-memory error during compilation for long-sequence-length inputs. It fails while compiling the reference attention implementation for cos…
-
For some reason, notifications for activity on this plugin were muted. No worries, I've fixed that, and thank you all for your attention!
I will be fixing things as they are reported... many thanks to all the reporters and contrib…
-
I think the feed should show the number of forecasters. This is very useful for interpreting a forecast you see in the feed without having to click into it, and it is also useful for admins to assess which que…
-
```
~/MusePose# accelerate launch train_stage_2.py --config configs/train/stage2.yaml
The following values were not passed to `accelerate launch` and had defaults used instead:
`--num_processes`…
```
-
Had an issue with my thumb drive while using bootqt and wasn't paying attention. Bootqt ended up wiping my entire main drive without so much as a warning prompt that it was attempting to clear the act…
-
I would like to report a potential bug I found, which I previously submitted via email on November 1st. Since I haven’t received a response, I wanted to follow up here in case the email was missed or …
-
### System Info
transformers == 4.45
torch == 2.4.1 + cu118
accelerate == 1.0.1
### Who can help?
_No response_
### Information
- [ ] The official example scripts
- [ ] My own modif…