Open jcrangel opened 7 months ago
I ran the following command:
python train_mamba_with_context.py --model state-spaces/mamba-130m \
    --data_path data/Mamba-Fine-Tune/squad_train.jsonl \
    --output models/mamba-130m-context \
    --num_epochs 10
The first logged loss looks reasonable, but from the next logging step on the loss is exactly zero:
{'loss': 2.9325, 'learning_rate': 0.0004995433789954337, 'epoch': 0.01}
{'loss': 0.0, 'learning_rate': 0.0004990867579908676, 'epoch': 0.02}
{'loss': 0.0, 'learning_rate': 0.0004986301369863013, 'epoch': 0.03}
{'loss': 0.0, 'learning_rate': 0.0004981735159817352, 'epoch': 0.04}
After that, the model does not train. I have tried smaller learning rates, with the same result.
Hi, I also encountered the same problem. Have you found a solution? Thank you.
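For anyone trying to narrow this down, it may help to pin down exactly which logging step the loss first collapses at, so that the batch around that step can be inspected. Below is a minimal sketch that scans log lines in the format shown above and reports the first record with a loss of exactly 0.0. The function name `first_zero_loss` is hypothetical and not part of the training script, and the sketch assumes each log line is a plain Python dict literal like the ones in the original post. It only locates the collapse; it does not explain the cause.

```python
import ast


def first_zero_loss(log_lines):
    """Return (index, record) for the first log line whose loss is exactly 0.0.

    Assumes each non-empty line is a Python dict literal such as
    {'loss': 2.9325, 'learning_rate': 0.0005, 'epoch': 0.01}.
    Returns None if no such line is found.
    """
    for index, line in enumerate(log_lines):
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = ast.literal_eval(line)  # safe parsing of the dict literal
        if record.get("loss") == 0.0:
            return index, record
    return None


# Log lines copied from the report above.
logs = [
    "{'loss': 2.9325, 'learning_rate': 0.0004995433789954337, 'epoch': 0.01}",
    "{'loss': 0.0, 'learning_rate': 0.0004990867579908676, 'epoch': 0.02}",
    "{'loss': 0.0, 'learning_rate': 0.0004986301369863013, 'epoch': 0.03}",
]

hit = first_zero_loss(logs)
if hit is not None:
    index, record = hit
    print(f"loss first hits 0.0 at log line {index}, epoch {record['epoch']}")
```

With the logs from this issue, the collapse is already at the second logging step, which suggests looking at the very first batches rather than at something that builds up over training.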