Open sschaber81 opened 1 month ago
I am unable to use the reranker NIM locally: running a prompt fails with a chain server error. The reranker NIM log shows the following:

2024-10-25T13:14:09Z ERROR: root - Uncaught InferenceServerException: [StatusCode.INTERNAL] in ensemble 'nvidia_nv_rerankqa_mistral_4b_v3', Input length 1024 exceeds maximum allowed token size 512; Request: <starlette.requests.Request object at 0x7f07e32b7730>
2024-10-25T13:14:09Z ERROR: root - Uncaught InferenceServerException: [StatusCode.INTERNAL] in ensemble 'nvidia_nv_rerankqa_mistral_4b_v3', Input length 1024 exceeds maximum allowed token size 512; Request: <starlette.requests.Request object at 0x7f07e32b7730>
2024-10-25T13:14:09Z ERROR: root - Invalid request error: in ensemble 'nvidia_nv_rerankqa_mistral_4b_v3', Input length 1024 exceeds maximum allowed token size 512
2024-10-25T13:14:09Z ERROR: root - Invalid request error: in ensemble 'nvidia_nv_rerankqa_mistral_4b_v3', Input length 1024 exceeds maximum allowed token size 512
2024-10-25T13:14:09Z INFO: uvicorn.access - 172.18.0.3:56732 - "POST /v1/ranking HTTP/1.1" 400
2024-10-25T13:14:09Z INFO: uvicorn.access - 172.18.0.3:56732 - "POST /v1/ranking HTTP/1.1" 400
172.18.0.3:56732 - "POST /v1/ranking HTTP/1.1" 400
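The log indicates that the request sent to `/v1/ranking` carried a 1024-token input while the model accepts at most 512. One possible client-side workaround, sketched below, is to shorten the query and each passage before building the request body. This is only a sketch under several assumptions: the helper names (`truncate_words`, `build_ranking_payload`) are hypothetical, the payload shape (`model`, `query.text`, `passages[].text`) is assumed rather than taken from this report, and a whitespace word count is only a rough stand-in for the model's real tokenizer, so the default budget of 200 words is a conservative guess, not a guaranteed fit.

```python
from typing import Dict, List

# Limit reported in the log above; the reranker counts tokens, not words.
MAX_TOKENS = 512


def truncate_words(text: str, max_words: int) -> str:
    """Keep at most `max_words` whitespace-separated words of `text`."""
    words = text.split()
    return " ".join(words[:max_words])


def build_ranking_payload(
    query: str,
    passages: List[str],
    model: str,
    max_words: int = 200,  # hypothetical budget, chosen well below MAX_TOKENS
) -> Dict[str, object]:
    """Build a request body for POST /v1/ranking with shortened texts.

    The field names used here are an assumption about the endpoint's schema.
    """
    return {
        "model": model,
        "query": {"text": truncate_words(query, max_words)},
        "passages": [
            {"text": truncate_words(passage, max_words)} for passage in passages
        ],
    }
```

The resulting dictionary can be serialized to JSON and posted to the reranker endpoint with whatever HTTP client the chain server already uses. If the chain server builds the request itself, reducing the chunk size used when splitting documents would have a similar effect.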