-
torch version: 2.5.0.dev20240616+cu121
python version: python 3.8
I ran the llama example with `torchrun --nproc-per-node 2 pippy_llama.py` and got an error:
```
Loading checkpoint shards: 100%|███…
-
I tried to follow the instructions in the README.
I created an **L4-type GPU** (_based on the Ada Lovelace architecture, as mentioned by you_) via a Google Cloud Workbench notebook with a lot of storage and ra…
-
Hi, I downloaded and ran your program, and got a training error as above. I have no GPU, so I changed the setup to fp16 = 'false' (xlnet left as your demo choice).
What's the problem?
Darrell…
-
# Summary of the Issue
Hi, thanks for all your effort spent on this repo.
I am trying to set up a working example of your `demo.ipynb` in Google Colaboratory, with some custom settings.
For example, mo…
-
Problem: When resuming the training of a BERT model with the Hugging Face Trainer from a checkpoint, the loss value increases again in the second run, even though the checkpoint is loaded correctly an…
-
- ICDAR 2021
- https://icdar2021.org
- Accepted papers are available
- https://ieeexplore.ieee.org/search/searchresult.jsp?newsearch=true&queryText=ICDAR%202021
- query = ICDAR 2021
…
-
### Version
2.3.0+Win64
### Operating system type + version
Windows 10
### 3D printer brand / version + firmware version (if known)
Prusa I3 MKS+
### Behavior
Rotation and Scale controls a…
-
Hi @PanQiWei and @qwopqwop200
I have encountered a strange bug that is specific to group_size = 1024 + desc_act=False + CUDA inference.
Last night I did a bunch of quantisations, covering all p…
-
When selecting this in `accelerate config`:
```
Do you wish to optimize your script with torch dynamo?[yes/NO]:yes
--------------------------------------------------------------------------------…
-
**Describe the bug**
For some reason the following code gives me the error: `RuntimeError: The specified pointer resides on host memory and is not registered with any CUDA device.`
The first cal…
La1c updated
4 months ago