janhq / cortex.cpp

Local AI API Platform
https://cortex.so
Apache License 2.0

bug: Arch Detector Causing Special Characters in GPU Name in `settings.json` #1140

Open Van-QA opened 2 months ago

Van-QA commented 2 months ago

Description: The arch detector is not functioning properly and introduces special characters into the GPU name in the settings.json file. Specifically, the GPU name is written as "NVIDIA RTX A6000\r", with a trailing carriage return (\r) incorrectly included. The issue seems to originate from the GPU architecture detection logic.

Steps to Reproduce:

  1. Run the system and allow it to generate the settings.json file.
  2. Observe the settings.json file, particularly the GPU section:
    "nvidia_driver": {
       "exist": true,
       "version": "560.76",
       "name": "NVIDIA RTX A6000\r"
    }
  3. Note the presence of the special character (\r) in the GPU name.

Expected Result: The settings.json file should contain a clean, correctly formatted GPU name without any special characters.

Actual Result: The settings.json file contains a GPU name with an unintended special character (\r), likely due to a flaw in the arch detector logic.
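
A minimal sketch of how a trailing \r can end up in a parsed name and how trimming avoids it. This is an illustration only: the thread does not show where the GPU name is read from, so the raw string, the CRLF line ending, and the helper name `parseGpuName` are all assumptions.

```typescript
// Hypothetical raw output of a GPU query on a system that emits CRLF line endings.
const rawOutput: string = "NVIDIA RTX A6000\r\n";

// Splitting on "\n" alone leaves the "\r" attached to the name.
const naiveName: string = rawOutput.split("\n")[0] ?? "";

// Hypothetical helper: split on either line ending, then trim stray whitespace.
function parseGpuName(output: string): string {
  const firstLine = output.split(/\r?\n/)[0] ?? "";
  return firstLine.trim();
}

const cleanName: string = parseGpuName(rawOutput);

console.log(JSON.stringify({ name: naiveName })); // {"name":"NVIDIA RTX A6000\r"}
console.log(JSON.stringify({ name: cleanName })); // {"name":"NVIDIA RTX A6000"}
```

Trimming at the point where the value is parsed keeps the stored name clean regardless of which line ending the source uses.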

Reported by: cbai970 https://discord.com/channels/1107178041848909847/1277272158367776828/1277291486198894657

cbai970 commented 2 months ago

I'm subscribing to this. When you get a solution, I'll retest on that same platform.

cbai970 commented 2 months ago

I'm happy to help move this forward, but I don't know the full .js tree and how it gets created to the point where we hit this bug. I've been reading through as much Tensorflow and Tensorflow-LLM documentation as I can, but I'm not entirely sure.

If I could have someone tell me how we end up getting to "index.js" in the spawning of the tree, I'll do my best to post what I find (if it helps). I am under the assumption that this function ends up calling a "compile from source against Arch(itecture)" thread, but maybe I'm wrong.

Give me a little, friends, and I'll give you a lot :) P.S. I am a former defect research engineer...

louis-jan commented 2 months ago

Root cause: The extension detects only GPU models that start with 30 or 40, which is incorrect; it should cover other cases (e.g. Axx).
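
To illustrate the kind of check described here, the sketch below is a hypothetical reconstruction and not the extension's actual code. The function name `detectArchByPrefix`, the regular expression, and the returned labels are assumptions made for the example.

```typescript
// Hypothetical prefix-based detection: only models starting with "30" or "40" are recognized.
function detectArchByPrefix(gpuName: string): string | undefined {
  // Pull out a model token such as "3090", "4090" or "A6000".
  const match = gpuName.match(/\b([A-Z]?\d{4})\b/);
  const model = match?.[1] ?? "";
  if (model.startsWith("30")) return "ampere";
  if (model.startsWith("40")) return "ada";
  // Every other model (for example "A6000") falls through undetected.
  return undefined;
}

console.log(detectArchByPrefix("NVIDIA GeForce RTX 3090")); // ampere
console.log(detectArchByPrefix("NVIDIA GeForce RTX 4090")); // ada
console.log(detectArchByPrefix("NVIDIA RTX A6000"));        // undefined
```

A check shaped like this recognizes the consumer 30 and 40 series but returns nothing for workstation cards whose model starts with a letter.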

louis-jan commented 2 months ago

Is a possible fix coming from cortex-cpp? cc @imtuyethan @Van-QA

cbai970 commented 2 months ago

> Root cause: The extension detects only GPU models that start with 30 or 40, which is incorrect; it should cover other cases (e.g. Axx).

I would go with CUDA compute capability levels, unless there are additional architectural reasons not to; I could not find any.
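
The compute-capability idea could be sketched roughly as follows. This is an assumption-laden illustration: the input line is modeled on what `nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader` prints on recent drivers (availability of the `compute_cap` field depends on the driver version), and the names `archFromComputeCap`, `parseGpuLine` and `GpuInfo` are hypothetical. The mapping covers only a few architecture families.

```typescript
// Hypothetical record for one detected GPU.
interface GpuInfo {
  name: string;
  computeCap: string;
  arch: string | undefined;
}

// Map a CUDA compute capability string such as "8.6" to an architecture family.
function archFromComputeCap(computeCap: string): string | undefined {
  const [majorStr, minorStr] = computeCap.trim().split(".");
  const major = Number(majorStr);
  const minor = Number(minorStr ?? "0");
  if (!Number.isInteger(major) || !Number.isInteger(minor)) return undefined;
  if (major === 7 && minor === 5) return "turing";
  if (major === 8 && minor === 9) return "ada";
  if (major === 8) return "ampere"; // 8.0, 8.6, 8.7
  if (major === 9) return "hopper";
  return undefined; // unknown or unsupported capability
}

// Parse one CSV line of the assumed form "<name>, <compute_cap>", trimming
// each field so stray "\r" or spaces never reach the stored values.
function parseGpuLine(line: string): GpuInfo {
  const [name = "", computeCap = ""] = line.split(",").map((field) => field.trim());
  return { name, computeCap, arch: archFromComputeCap(computeCap) };
}

console.log(parseGpuLine("NVIDIA RTX A6000, 8.6\r\n"));
```

Keying on compute capability instead of the marketing name means a card like the RTX A6000 (compute capability 8.6) resolves to the same family as the RTX 30 series without any name parsing.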

I was actually trying to figure out how to code this myself, but (again) I'm not a programmer and I'm kind of afraid of breaking stuff. Also, this all gets autogenerated when Jan first executes (and regenerates when you reset, because I definitely tested that), so I'm not sure what root this all comes from.