jzhang38 / EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
Apache License 2.0

May I see your wandb report while training? #34

Open fahadh4ilyas opened 1 month ago

fahadh4ilyas commented 1 month ago

I'm currently training llama3 and the loss and perplexity results seem stagnant. May I see your loss and perplexity results from when you trained llama2?
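
For reference, perplexity is conventionally the exponential of the mean per-token cross-entropy loss (natural log base), so if the loss curve is flat the perplexity curve will be flat as well. A minimal sketch of that standard relationship (the function name and example value here are illustrative, not from the EasyContext code):

```python
import math

def perplexity(mean_ce_loss: float) -> float:
    """Perplexity as the exponential of the mean cross-entropy loss
    (per token, natural log base) — the usual convention in LM training logs."""
    return math.exp(mean_ce_loss)

print(perplexity(2.0))  # ~7.389
```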