jzhang38 / EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
Apache License 2.0
529 stars, 33 forks

How to generate autoregressively? #33

Open yileld opened 1 month ago

yileld commented 1 month ago

In eval_needle.py, the preds are passed through gather() and undo_extract_local() to reconstruct the predictions for the whole sequence, and the predicted token is then selected using prompt_length. In autoregressive generation I only need the next token. Can I get that predicted token directly, without calling gather() and undo_extract_local()?
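
To make the question concrete, here is a toy sketch in plain Python of what I think is happening. It assumes a zigzag layout in which the sequence is cut into 2 * world_size chunks and rank r holds chunks r and 2 * world_size - 1 - r. That layout is my assumption, and `extract_local_toy` and `undo_extract_local_toy` are hypothetical stand-ins written for illustration, not the functions used in eval_needle.py.

```python
def extract_local_toy(seq, rank, world_size):
    """Hypothetical zigzag shard: rank r keeps chunks r and 2*world_size-1-r."""
    n_chunks = 2 * world_size
    assert len(seq) % n_chunks == 0, "toy version needs an evenly divisible length"
    size = len(seq) // n_chunks
    chunks = [seq[i * size:(i + 1) * size] for i in range(n_chunks)]
    return chunks[rank] + chunks[n_chunks - 1 - rank]


def undo_extract_local_toy(gathered, world_size):
    """Invert the toy shard: reorder rank-major gathered data into sequence order."""
    n_chunks = 2 * world_size
    size = len(gathered) // n_chunks
    chunks = [None] * n_chunks
    for rank in range(world_size):
        local = gathered[rank * 2 * size:(rank + 1) * 2 * size]
        chunks[rank] = local[:size]                 # first half is chunk r
        chunks[n_chunks - 1 - rank] = local[size:]  # second half is the mirrored chunk
    return [tok for chunk in chunks for tok in chunk]


world_size = 4
prompt_length = 16
full_preds = list(range(prompt_length))  # stand-in for per-position predictions

# Each rank only sees its own shard of the predictions.
shards = [extract_local_toy(full_preds, r, world_size) for r in range(world_size)]

# Full path: concatenate the shards in rank order, then restore sequence order.
gathered = [p for shard in shards for p in shard]
restored = undo_extract_local_toy(gathered, world_size)
next_token_full_path = restored[prompt_length - 1]

# Shortcut (only valid under the assumed layout): the final position sits at the
# end of rank 0's shard, so no gather is needed to read it.
next_token_shortcut = shards[0][-1]
```

Under these assumptions both paths return the same prediction, so only rank 0 would need to read its last local position and then send the chosen token to the other ranks. If the real sharding differs, or if the prompt is padded so that prompt_length - 1 is not the final position, the local index would have to be worked out from the actual layout instead.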