Open mandalrajiv opened 4 weeks ago
@chensuyue - Can you please take a look and advise?
For the connect issue, it usually means the service requires more time to start; you can check "docker logs xxx" to confirm.
I haven't seen that random-output issue in the CI test. Did you use the latest GenAIComps code to build the microservice? You may need a --no-cache for the docker build.
Also inviting @letonghan to give some comments.
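The no-cache rebuild plus a fresh look at the container logs could be sketched as below. This is only a sketch: the image tag opea/llm-tgi:latest is an assumption, and the container name tgi-gaudi-server is taken from later in this thread.

```shell
# Sketch: force a clean rebuild so stale GenAIComps code is not baked into the image,
# then re-check the container logs. Image tag and container name are assumptions.
cat > rebuild_and_check.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail

# Rebuild without reusing cached layers.
docker build --no-cache -t opea/llm-tgi:latest .

# Recreate the container from the new image (the exact run/compose command
# depends on your deployment), then confirm startup in the logs:
docker logs --tail 50 tgi-gaudi-server
EOF

# Syntax-check the helper script before running it on the Gaudi host.
bash -n rebuild_and_check.sh && echo "rebuild_and_check.sh looks OK"
```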
I will check the docker logs and respond back.
I cloned the latest repo today before starting the test.
Do I need to rebuild the docker image with --no-cache?
I checked the docker logs with the command "docker logs tgi-gaudi-server". In the logs, I see a successful response. Pasting docker logs below.
2024-05-29T21:45:58.893307Z WARN text_generation_router: router/src/main.rs:260: Invalid hostname, defaulting to 0.0.0.0
2024-05-29T22:02:08.339627Z INFO generate{parameters=GenerateParameters { best_of: None, temperature: None, repetition_penalty: None, top_k: None, top_p: None, typical_p: None, do_sample: true, max_new_tokens: Some(64), return_full_text: None, stop: [], truncate: None, watermark: false, details: false, decoder_input_details: false, seed: None, top_n_tokens: None } total_time="4.937955875s" validation_time="294.95µs" queue_time="78.34µs" inference_time="4.937582726s" time_per_token="77.14973ms" seed="Some(6240944214322899185)"}: text_generation_router::server: router/src/server.rs:289: Success
I ran the curl command again and still see output that looks like junk.
ubuntu@ip-172-31-90-59:~/GenAIExamples/ChatQnA/tests$ curl http://184.73.148.255:8008/generate -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' -H 'Content-Type: application/json'
{"generated_text":"meals meals cb desperateRED bordersредиarticle groundredit wins� baselineandsент aux fashionpositeatteredFilePathcdachelorЖ beschVisibleWL百list ange Hope babiesaware tissue😳 archae blood glimpsezen dio conspiracyARB investors installation fundamentglobal consistentrad doorwayinalPREceryWI smoothющих Creekment Ralph tambémtmp predicted chronLDround squad"}
@letonghan - Can you please take a look as suggested by @chensuyue
Hi @mandalrajiv, thanks for your response. Here are explanations of and suggestions for your issues:

Issue:
curl: (7) Failed to connect to 172.31.90.59 port 8008 after 0 ms: Connection refused
It takes time to download the model before TGI can start the LLM service, and the download time depends on your network condition. This issue may appear when the model has not been downloaded yet or the service has not started yet.

You can change the model by updating export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3" at line 44 in this script.

Thank you @letonghan. I will try with a different model.
I have used the Intel Neural Chat model for LLM inferencing, and it produces pretty good responses. Not sure why in this case the generated text is not meaningful. Any additional insight you can provide on why that is happening would be immensely helpful. Thanks!!
I am testing the ChatQnA Gen AI example using the Gaudi script at - https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/tests/test_chatqna_on_gaudi.sh
The curl command below throws an error saying connection refused. curl http://172.31.90.59:8008/generate -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' -H 'Content-Type: application/json'
curl: (7) Failed to connect to 172.31.90.59 port 8008 after 0 ms: Connection refused
When I change the curl command to the example below, I see output, but the output is not meaningful. curl http://172.31.90.59:8008/generate \
-X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' -H 'Content-Type: application/json'
{"generated_text":"discussions discussions++++BMconstructionalt))))nelsDataSource gloves<>( diagonal丁 PRO Delta transitions Http tim search restrict analys WiesserValuesљаdashboard도 birthday suppliers trouve될 pilot сте bit友idential ==ometric witnesses Jewaddmem yy Clubminecraftские improvementsstepAbsolute ottobrewheelُ deutscherizioni Af LookFactor participantaching ip grantspicker autumn"}
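As suggested earlier in the thread, the connection-refused error usually just means TGI is still downloading the model. The waiting step could be sketched as a small readiness-check loop; this is only a sketch, where the host and port are taken from this thread and TGI's /health endpoint (which returns 200 once the model is loaded) is assumed to be reachable:

```shell
# Sketch: wait until the TGI server accepts connections before sending
# /generate requests. Adjust TGI_URL for your own setup.
cat > wait_for_tgi.sh <<'EOF'
#!/usr/bin/env bash
TGI_URL="${TGI_URL:-http://172.31.90.59:8008}"

for i in $(seq 1 60); do
  # TGI's /health route responds with 200 once the model is loaded.
  if curl -sf "${TGI_URL}/health" > /dev/null; then
    echo "TGI is ready"
    exit 0
  fi
  echo "waiting for TGI (${i}/60)..."
  sleep 10
done

echo "TGI did not become ready in time" >&2
exit 1
EOF

# Syntax-check the helper script before running it on the host.
bash -n wait_for_tgi.sh && echo "wait_for_tgi.sh looks OK"
```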