-
```python
batch_size, seq_length = 64, 5000
sel = Tensor.randint(batch_size, high=X_train.shape[0]-seq_length)
X = X_train[sel:sel+seq_length]
```
Doesn't work
```python
batch_size, seq_lengt…
-
Trying to repro the chess SAE training:
```
python circuits/sae_training/chess_sae_trainer.py --save_dir=/tmp/sae_debug
```
After modifying this line to pass the `meta.pkl` from `circuits/reso…
-
I'm trying to finetune **Mistral-Nemo-Base-2407** with a `text` dataset of long inputs. Usually, the SFTTrainer truncates them to fit the specified context size.
However, I get an error when using…
-
If I try to train a model with 7 stems for example, I get:
RuntimeError: The size of tensor a (2) must match the size of tensor b (7) at non-singleton dimension 1
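
This message is the usual broadcasting failure: in an element-wise operation, two sizes in the same dimension must either be equal or one of them must be 1. Below is a minimal pure-Python sketch of that rule, not the project's code. The helper name `broadcast_mismatch` and the example shapes are hypothetical, since the issue does not state the actual tensor shapes.

```python
def broadcast_mismatch(shape_a, shape_b):
    """Return (dim, size_a, size_b) for a non-broadcastable dimension, or None.

    Shapes are right-aligned and the shorter one is padded with 1s,
    following the standard broadcasting rule.
    """
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    # Check from the trailing dimension backwards.
    for dim in range(ndim - 1, -1, -1):
        if a[dim] != b[dim] and a[dim] != 1 and b[dim] != 1:
            return dim, a[dim], b[dim]
    return None


# Hypothetical shapes: one tensor sized for 2 stems, the other for 7.
print(broadcast_mismatch((8, 2, 1024), (8, 7, 1024)))
# A size of 1 in that dimension would broadcast without error.
print(broadcast_mismatch((8, 1, 1024), (8, 7, 1024)))
```

A mismatch at dimension 1 with sizes 2 and 7 would be consistent with some tensor still being sized for 2 stems while the rest of the model expects 7, but that is only one possible reading of the error.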
-
## Background
`backendtrain/ops/*Layer` has extra (auxiliary) tensors used for `backward()`.
For example,
https://github.com/Samsung/ONE/blob/60683ad7293d2a18a7939dc49bcea77d8e09b352/runtime/…
-
Following the notebook file, I tried to construct a DMPNN model for tox21.
For some reason, training takes way longer than the PyTorch implementation in chemprop.
So I tried to parallelize it in mul…
-
Hi. Something (possibly not unsloth) changed between July and now.
I am getting an unexpected OOM error trying to do a LoRA finetune. This worked before, but is now barfing.
Looked at #338, but not…
-
https://arxiv.org/pdf/2108.00089.pdf
-
Trying to run train.py and it doesn't work. It looks like the neural network architecture is incompatible with some constraints. Is a specific version of the torch library required?
```
[2024-10-04 17:46:10,588…
-
### Software environment
```Markdown
- paddlepaddle:
- paddlepaddle-gpu: 3.0.0b1
- paddlenlp: https://github.com/ZHUI/PaddleNLP/tree/sci/benchmark
```
### Duplicate issues
- [X] I have searched the existing issues
### Error descr…