Oxen-AI / mamba-dive

This is the code that went into our practical dive using Mamba for information extraction.

zero loss when training #2

Open · jcrangel opened this issue 7 months ago

jcrangel commented 7 months ago

I have executed

python train_mamba_with_context.py --model state-spaces/mamba-130m \
   --data_path data/Mamba-Fine-Tune/squad_train.jsonl \
   --output models/mamba-130m-context \
   --num_epochs 10

But soon after, the loss drops to zero:

{'loss': 2.9325, 'learning_rate': 0.0004995433789954337, 'epoch': 0.01}                                            
{'loss': 0.0, 'learning_rate': 0.0004990867579908676, 'epoch': 0.02}                                               
{'loss': 0.0, 'learning_rate': 0.0004986301369863013, 'epoch': 0.03}                                               
{'loss': 0.0, 'learning_rate': 0.0004981735159817352, 'epoch': 0.04}    

After that, the model does not train at all. I have also experimented with a smaller learning rate, with the same result.
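
For anyone debugging this: a loss of exactly 0.0 usually points at one of two things, either the fp16 logits have overflowed to NaN/inf, or every label in the batch is masked to -100 so the cross-entropy has nothing to average over. Below is a minimal sanity-check sketch along those lines; the inspect_batch helper is illustrative and not code from this repo, and it assumes mamba_ssm's MambaLMHeadModel plus the GPT-NeoX tokenizer that the state-spaces checkpoints use:

    import torch
    from transformers import AutoTokenizer
    from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

    # Illustrative sketch: run one batch and check the two usual causes
    # of a reported loss of exactly 0.0.
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
    model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-130m", device="cuda")

    def inspect_batch(input_ids: torch.Tensor, labels: torch.Tensor) -> None:
        # 1) If every label is -100, the cross-entropy averages over zero
        #    tokens, and many training loops then report the loss as 0.0.
        n_supervised = (labels != -100).sum().item()
        print(f"supervised tokens in batch: {n_supervised}")

        # 2) fp16 overflow shows up as non-finite logits, which the loss
        #    computation can collapse to 0.0 or NaN.
        with torch.no_grad():
            logits = model(input_ids.to("cuda")).logits
        print("all logits finite:", torch.isfinite(logits).all().item())

If the supervised-token count is 0, the label masking in the data preparation is the place to look; if the logits are non-finite, switching the trainer from fp16 to bf16 (or full fp32) is a common workaround.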

lzw-lzw commented 6 months ago

Hi, I also encountered the same problem. Have you found a solution? Thank you.