NVIDIA / nim-anywhere

Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench
https://www.nvidia.com/en-us/ai/
Apache License 2.0
111 stars 64 forks

RERANKER NIM running locally not working -> chain server error #58

Open sschaber81 opened 1 month ago

sschaber81 commented 1 month ago

I am unable to use the RERANKER NIM locally: running a prompt results in a chain server error. The RERANKER NIM log shows the following:

2024-10-25T13:14:09Z ERROR: root - Uncaught InferenceServerException: [StatusCode.INTERNAL] in ensemble 'nvidia_nv_rerankqa_mistral_4b_v3', Input length 1024 exceeds maximum allowed token size 512; Request: <starlette.requests.Request object at 0x7f07e32b7730>
2024-10-25T13:14:09Z ERROR: root - Uncaught InferenceServerException: [StatusCode.INTERNAL] in ensemble 'nvidia_nv_rerankqa_mistral_4b_v3', Input length 1024 exceeds maximum allowed token size 512; Request: <starlette.requests.Request object at 0x7f07e32b7730>
2024-10-25T13:14:09Z ERROR: root - Invalid request error: in ensemble 'nvidia_nv_rerankqa_mistral_4b_v3', Input length 1024 exceeds maximum allowed token size 512
2024-10-25T13:14:09Z ERROR: root - Invalid request error: in ensemble 'nvidia_nv_rerankqa_mistral_4b_v3', Input length 1024 exceeds maximum allowed token size 512
2024-10-25T13:14:09Z INFO: uvicorn.access - 172.18.0.3:56732 - "POST /v1/ranking HTTP/1.1" 400
2024-10-25T13:14:09Z INFO: uvicorn.access - 172.18.0.3:56732 - "POST /v1/ranking HTTP/1.1" 400
172.18.0.3:56732 - "POST /v1/ranking HTTP/1.1" 400
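The log suggests that the request sent to `/v1/ranking` contained an input of 1024 tokens, while the model accepts at most 512. As an illustration of what a client-side guard might look like, here is a minimal sketch that shortens each passage before the request is built. Everything in it is an assumption rather than the project's actual code: the helper names are hypothetical, the payload shape (`model`, `query.text`, `passages[].text`) and the model ID are assumed, and words are used as a rough, conservative stand-in for tokens (256 words for a 512-token limit), since the real tokenizer is not available here.

```python
# Hypothetical sketch: trim passages so that query + passage stays under a word budget.
# Words only approximate tokens, so the default budget is deliberately conservative.

MAX_WORDS = 256  # assumed heuristic: half of the 512-token limit reported in the log


def truncate_words(text: str, budget: int) -> str:
    """Keep at most `budget` whitespace-separated words of `text`."""
    return " ".join(text.split()[:budget])


def build_ranking_payload(query: str, passages: list[str], model: str,
                          max_words: int = MAX_WORDS) -> dict:
    """Build a request body for a ranking endpoint (payload shape is an assumption)."""
    # Reserve room for the query; the remainder is the budget for each passage.
    passage_budget = max(max_words - len(query.split()), 0)
    return {
        "model": model,
        "query": {"text": query},
        "passages": [{"text": truncate_words(p, passage_budget)} for p in passages],
    }


if __name__ == "__main__":
    payload = build_ranking_payload(
        "what is a reranker",
        ["long passage " * 600],
        model="nvidia/nv-rerankqa-mistral-4b-v3",  # assumed model ID
    )
    print(len(payload["passages"][0]["text"].split()))
```

A guard like this only hides the symptom. If the chain server is chunking documents at 1024 tokens, reducing the chunk size at ingestion time would likely be the cleaner fix.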