Closed — leseb closed this pull request 1 week ago
Note: Links to docs will display an error until the docs builds have been completed.
There is 1 currently active SEV. If your PR is affected, please view it below:
As of commit b41532608d0f1648285d7794eda4331b1cfb297f with merge base 46977645de6e9e29e58fada7d600c1930ed6f67b: 💚 Looks good so far! There are no failures yet. 💚
This comment was automatically generated by Dr. CI and updates every 15 minutes.
PR looks good, but can you share an example of when this would get triggered (i.e. when are we seeing NaN via manually kill)?
$ python3.10 torchchat.py chat llama3.1
NumExpr defaulting to 12 threads.
PyTorch version 2.6.0.dev20241002 available.
lm_eval is not installed, GPTQ may not be usable
Using device=mps
Loading model...
Time to load model: 15.06 seconds
-----------------------------------------------------------
Starting Interactive Chat
Entering Chat Mode. Will continue chatting back and forth with the language model until the models max context length of 8192 tokens is hit or until the user says /bye
Do you want to enter a system prompt? Enter y for yes and anything else for no.
User: /bye
Exiting Chat.
Average tokens/sec (total): nan
Average tokens/sec (first token): nan
Average tokens/sec (next tokens): nan
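The transcript above answers the reviewer's question: when the user exits immediately (e.g. with `/bye`) no tokens are ever generated, so the per-token timing averages are computed over an empty sample and come out as NaN. A minimal sketch of the kind of guard this PR describes is below; the helper name `print_perf_stats` and the dict shape are hypothetical, and torchchat's actual implementation may differ:

```python
import math

def print_perf_stats(averages: dict[str, float]) -> None:
    # Hypothetical helper: skip the perf summary entirely when the
    # averages are NaN, which happens if the chat session ended before
    # any tokens were generated (mean over an empty sample).
    if any(math.isnan(v) for v in averages.values()):
        return
    for name, value in averages.items():
        print(f"Average tokens/sec ({name}): {value:.2f}")
```

With this guard, a session that generated tokens still prints its summary, while an immediately-exited session prints nothing instead of three `nan` lines.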
b4153260 fix: do not print perf stat when NaN
commit b41532608d0f1648285d7794eda4331b1cfb297f
Author: Sébastien Han <seb@redhat.com>
Date:   Thu Nov 14 11:04:47 2024 +0100