bug: Model loading request timeout when uploading documents

Jan version

0.5.8-731

Describe the Bug

Model loading has failed twice with timeout errors. The request to POST http://127.0.0.1:39291/v1/models/start times out consistently. This happens not only with the Llama 8B model but also with the 3B model.

Model loading requests are timing out during initialization with the error:

Request timed out: POST http://127.0.0.1:39291/v1/models/start

Server logs:

20241120 12:01:07.411Z [CORTEX]:: Spawning cortex subprocess...
20241120 12:01:07.411Z [CORTEX]:: Spawn cortex at path: /Users/han/Library/Application Support/Jan-nightly/data/extensions/@janhq/inference-cortex-extension/dist/bin/cortex-server
20241120 12:01:07.412Z [CORTEX]: Engine variant: mac-arm64
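To see where startup stalls, the timestamps in the attached logs can be compared line by line. Below is a minimal sketch, assuming every line in cortex.log and app.log follows the same `YYYYMMDD HH:MM:SS.mmmZ [SOURCE]: message` layout as the excerpt above; `parse_line` and `largest_gap` are hypothetical helper names, not part of Jan or Cortex.

```python
import re
from datetime import datetime, timezone

# Layout assumed from the excerpt: "20241120 12:01:07.411Z [CORTEX]:: message".
# The source tag is followed by one or two colons in the excerpt, so accept both.
LOG_LINE = re.compile(r"^(\d{8} \d{2}:\d{2}:\d{2}\.\d{3})Z \[(\w+)\]:+ ?(.*)$")


def parse_line(line):
    """Return (timestamp, source, message), or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    stamp, source, message = match.groups()
    when = datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f").replace(tzinfo=timezone.utc)
    return when, source, message


def largest_gap(lines):
    """Return (seconds, message_before, message_after) for the longest pause.

    Returns None when fewer than two lines could be parsed.
    """
    entries = [entry for entry in map(parse_line, lines) if entry is not None]
    best = None
    for prev, cur in zip(entries, entries[1:]):
        gap = (cur[0] - prev[0]).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, prev[2], cur[2])
    return best
```

Running `largest_gap(open("cortex.log").read().splitlines())` on the attached log would point at the two messages surrounding the longest pause, which is a reasonable first place to look for the stall.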

Steps to Reproduce

1. Send a document and attempt to load the Llama 3.1 8B Instruct Q4 model to answer.
2. A request is sent to the /v1/models/start endpoint.
3. The request times out after an extended period.
4. An error about the model loading failure is displayed.
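To check whether the timeout also occurs outside the Jan UI, the same endpoint can be called directly and timed. This is a rough sketch only: the URL comes from the error message in this report, but the JSON body (`{"model": ...}`) is an assumption about what /v1/models/start expects, and `build_request` and `time_post` are hypothetical helpers.

```python
import json
import socket
import time
import urllib.error
import urllib.request

# Endpoint taken from the error message in this report.
START_URL = "http://127.0.0.1:39291/v1/models/start"


def build_request(url, payload):
    """Build a JSON POST request. The payload shape is an assumption."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def time_post(url, payload, timeout):
    """Send the POST and return (outcome, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(build_request(url, payload), timeout=timeout) as resp:
            outcome = f"HTTP {resp.status}"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        outcome = f"HTTP {err.code}"
    except urllib.error.URLError as err:
        # Connection-level failure; a timeout may be wrapped in err.reason.
        timed_out = isinstance(err.reason, (TimeoutError, socket.timeout))
        outcome = "timeout" if timed_out else f"error: {err.reason}"
    except (TimeoutError, socket.timeout):
        # Timeout raised while waiting for the response.
        outcome = "timeout"
    return outcome, time.monotonic() - start
```

For example, `time_post(START_URL, {"model": "<model-id>"}, timeout=600)` with a generous timeout would show whether the model eventually starts or the server never responds. The model id placeholder has to be replaced with whatever identifier the server actually uses.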

Screenshots / Logs

cortex.log app.log

OS: macOS (Darwin Kernel Version 23.2.0)
Hardware: Apple M2
Jan Version: v0.5.8-731
Memory: 16GB Total
Model: Llama 3.1 8B Instruct Q4
Cortex Version: v1.0.3-rc5

What is your OS?

[X] MacOS
[ ] Windows
[ ] Linux

janhq / jan

bug: Model loading request timeout when uploading documents #4056