-
If I create an ONNX file with this sample script and [input.txt](https://github.com/Xilinx/finn-base/files/8809937/input.txt):
```python
import torch
import torch.nn as nn
import torch.nn.functi…
-
### 🔖 Feature description
We've seen marketing from Unsloth claiming that optimized Triton kernels for various operations can significantly improve both the speed and memory efficiency of fine-tuning LoRA a…
-
I am currently attempting to port a Llama-like model architecture from pure PyTorch to TransformerEngine's PyTorch classes.
However, I have been unable to obtain identical results in certain cases.…
-
I ran the following command in the terminal:
`CUDA_VISIBLE_DEVICES=6 python main.py --base configs/stable-diffusion/v1-inference.yaml --gpus=1`
but it printed the following output:
```
Global seed set to 23
Runni…
-
When I use train.py from CompVis, this bug appears, and I don't know how to solve it. Can anyone help me? Thanks!
angogh_painting" --train_size 200
Global seed set to 23
Running on GPUs 0…
-
Hi all,
I found that Adam-mini 1.0.1 cannot run across 4 shards; it throws an exception related to tensor reshaping:
```
File "/opt/conda/lib/python3.10/site-packages/adam_mini/adam_m…
-
## Instructions To Reproduce the Issue:
I installed Detectron2 and attempted to train the ViTDet base model from the documentation provided here: https://github.com/facebookresearch/detectron2/tree…
-
### System Info
4x NVIDIA H100, TensorRT-LLM backend 0.9.0
### Who can help?
@Tracin
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
-…
-
Hi, thanks for sharing your code. In the paper, you implemented a baseline called "BERT+MLP", reaching a **76.2** F1 score. But when I use the same architecture, I cannot get the same result. Di…
-
### System Info
- GPU: 4090 * 4
- TensorRT-LLM: v0.8.0
- CUDA Version: 12.3
- NVIDIA-SMI 545.29.06
### Who can help?
_No response_
### Information
- [X] The official example scripts
…