I1106 18:30:12.872196 138308608845376 warmup_utils.py:108] ---------Prefill engine 0 compiled for prefill length 512.---------
2024-11-06 18:30:12,872 - root - INFO - ---------Prefill engine 0 compiled for prefill length 512.---------
2024-11-06 18:30:13,011 - root - INFO - ---------Prefill engine 0 compiled for prefill length 64.---------
I1106 18:30:13.011122 138310185887296 warmup_utils.py:108] ---------Prefill engine 0 compiled for prefill length 64.---------
I1106 18:30:16.064429 138308617238080 warmup_utils.py:108] ---------Prefill engine 0 compiled for prefill length 256.---------
2024-11-06 18:30:16,064 - root - INFO - ---------Prefill engine 0 compiled for prefill length 256.---------
...
2024-11-06 18:30:32,060 - root - INFO - ---------Generate params 0 loaded.---------
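
Each warmup message above appears twice, apparently once per log handler (an absl-style line and a root-logger line). As a minimal sketch for checking which prefill lengths were compiled, the message text can be matched and de-duplicated. The function name `compiled_prefill_lengths` is hypothetical and not part of jetstream-pytorch; the pattern assumes the message wording stays exactly as shown in these logs.

```python
import re

# Matches the message text only, so both log formats shown above are handled.
# Assumption: the wording "Prefill engine N compiled for prefill length M." is stable.
PREFILL_RE = re.compile(r"Prefill engine (\d+) compiled for prefill length (\d+)\.")


def compiled_prefill_lengths(log_text):
    """Return {engine_index: sorted list of compiled prefill lengths}."""
    seen = {}
    for match in PREFILL_RE.finditer(log_text):
        engine = int(match.group(1))
        length = int(match.group(2))
        # A set absorbs the duplicate line emitted by the second log handler.
        seen.setdefault(engine, set()).add(length)
    return {engine: sorted(lengths) for engine, lengths in seen.items()}
```

Feeding it the captured server output gives one entry per engine, which makes it easy to confirm that every expected prefill bucket was warmed up before sending requests.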
curl --request POST --header "Content-type: application/json" -s localhost:8000/generate --data '{
"prompt": "What are the top 5 programming languages",
"max_tokens": 200
}'
{
"response": " for for data science in 2023?\n\n1. Python\n2. R\n3. SQL\n4. Java\n5. Scala\n\n**Note:** The order is based on popularity and demand in the data science industry in 2023."
}
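
The same request can be sketched in Python using only the standard library. This assumes the server is reachable at `http://localhost:8000` and returns a JSON object with a `response` field, as in the output above; the helper names `build_generate_request` and `generate` are hypothetical and not part of jetstream-pytorch.

```python
import json
import urllib.request


def build_generate_request(prompt, max_tokens, base_url="http://localhost:8000"):
    """Build a POST request equivalent to the curl call above."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, max_tokens=200, base_url="http://localhost:8000", timeout=60):
    """Send the request and return the text of the "response" field.

    Assumption: the reply is a JSON object shaped like the one shown above.
    """
    request = build_generate_request(prompt, max_tokens, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        payload = json.loads(reply.read().decode("utf-8"))
    return payload["response"]
```

With the server running, `generate("What are the top 5 programming languages")` should return the generated text as a string.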
Adds the --enable_model_warmup flag, per https://github.com/AI-Hypercomputer/jetstream-pytorch/pull/187.