-
Hi, this is great work, and thanks for releasing the code!
I found that there is no dropout in the Llama models, and I wonder whether this is a deliberate design choice? I could also have missed it, but I …
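(For reference, a minimal check, assuming the Hugging Face `transformers` implementation of Llama is the one in question: its config exposes an `attention_dropout` field that defaults to 0.0, so no dropout is applied unless it is explicitly overridden.)

```python
# Minimal sketch, assuming the Hugging Face transformers Llama implementation.
from transformers import LlamaConfig

cfg = LlamaConfig()
print(cfg.attention_dropout)  # 0.0 by default, so dropout is effectively off
```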
-
**Describe the bug**
When loading TinyLlama or Llama-3-8B with `dtype=int4`, the model structure looks like this:
```
LlamaForCausalLM(
(model): LlamaModel(
(embed_tokens): Embedding(128256, 4096)
…
```
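(A hypothetical reproduction sketch, for context: the checkpoint name and the 4-bit loading path via bitsandbytes below are assumptions, since the report's `dtype=int4` option may come from a different loader.)

```python
# Hypothetical repro sketch, assuming transformers + bitsandbytes 4-bit loading.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",  # assumed checkpoint name
    quantization_config=bnb,
)
print(model)  # quantized Linear layers should print as Linear4bit
```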
-
Hello,
We have noticed some unexpected behavior when fine-tuning a Llama 3 model on 1 GPU compared with fine-tuning the same model on the same dataset with 2 GPUs in parallel mode. See the attached te…
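(One common cause of such differences, offered here as an assumption since the report is truncated: data-parallel training multiplies the effective batch size, which changes the optimization trajectory unless the learning rate or per-device batch size is adjusted. Illustrative arithmetic only, with assumed values:)

```python
# Effective batch size under data parallelism (all values are assumptions).
per_device_batch = 8
grad_accum_steps = 4
for num_gpus in (1, 2):
    effective = per_device_batch * grad_accum_steps * num_gpus
    print(f"{num_gpus} GPU(s): effective batch size = {effective}")
```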
-
From the support inbox, plus these lines:
> The stock card was never used before because I've only worked with the w/delay since I bought it in October, but I've tried with another one and the issue is still…
-
The dropout layers cause the spikes (1) to scale to 1.42... What? When I remove the dropout it's OK:
`spk_conv1 = self.dropout1(spk_conv1)`
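(That factor is almost certainly inverted dropout at work: PyTorch scales the surviving activations by 1/(1-p) during training, and 1/(1-0.3) ≈ 1.4286, so a dropout probability around 0.3, an assumed value here, would produce exactly this scaling. Check what `dropout1` was constructed with. A minimal demonstration:)

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.3)  # assumed p; 1 / (1 - 0.3) ≈ 1.4286
x = torch.ones(8)
drop.train()
print(drop(x))  # surviving entries are scaled to ~1.4286, dropped ones are 0
drop.eval()
print(drop(x))  # identity in eval mode: all ones
```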
-
Hello, I wrote an ONNX export script that only exports unet.model. However, after export the result is not saved in a single model.onnx file; instead, model.onnx only stores the graph structure, while the weights are saved as scattered separate files. Is this expected?
导出代码如下:
```
# =============================== Build operators
import onnxscript
## Assuming you use opset18
fr…
```
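(For what it's worth, this is expected behavior when the exported model exceeds protobuf's 2 GB limit: ONNX keeps the graph in model.onnx and offloads the weights as external data files. If a single external weights file is acceptable, the scattered tensors can be consolidated; a sketch, where the file paths are assumptions:)

```python
# Consolidate scattered external weight files into one .data file.
# Paths are assumptions; adjust to the actual export directory.
import onnx

model = onnx.load("unet/model.onnx", load_external_data=True)
onnx.save_model(
    model,
    "unet/model_consolidated.onnx",
    save_as_external_data=True,
    all_tensors_to_one_file=True,
    location="model_consolidated.onnx.data",  # single weights file
    size_threshold=1024,
    convert_attribute=True,
)
```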
-
Hi @oguiza,
Thanks again for the tsai library.
I'm having an issue with the TSTPlus model. The dropout for the fully connected head (fc_dropout) seems to have no effect on the training. I can set …
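(One quick way to check whether the value actually reaches the model: a sketch, assuming tsai's TSTPlus accepts these constructor arguments and that the dummy shapes below are replaced with real ones.)

```python
# Sanity-check sketch; c_in, c_out, and seq_len are placeholder values.
from tsai.models.TSTPlus import TSTPlus

model = TSTPlus(c_in=3, c_out=2, seq_len=128, fc_dropout=0.5)
print(model.head)  # a Dropout(p=0.5) should appear in the printed head
```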
-
In PyTorch, one can call `model.train()` to turn all dropout layers on for training, and one can call `model.eval()` to turn all dropout layers off for evaluation.
For the transformer implementatio…
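(For concreteness, a minimal sketch of that toggle in plain PyTorch:)

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

model.train()             # dropout active: units zeroed at random,
                          # survivors scaled by 1 / (1 - p) = 2
print(model[1].training)  # True

model.eval()              # dropout disabled: pass-through
print(model[1].training)  # False
```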
-
Hi @joschif !
Thank you for this helpful package :) I have a couple of questions!
To start, I will just say: I have already tested a lot of parameters on my multiome dataset and can use the def…
-
### Checklist
- [X] I'm reporting a site feature request
- [X] I've verified that I'm running yt-dlp version **2021.12.27**. ([update instructions](https://github.com/yt-dlp/yt-dlp#update))
- [X] I'v…