Ai00-X / ai00_server

A localized open-source AI server that is better than ChatGPT.
https://ai00-x.github.io/ai00_server/
MIT License

Server Startup Issue #110

Open ItsCRC opened 2 months ago

ItsCRC commented 2 months ago

Hi All,

I am trying to run cargo run --release and I get the following:

    Finished release profile [optimized] target(s) in 0.21s
     Running target/release/ai00_server
2024-05-04T04:35:17.007Z INFO [ai00_server] reading config assets/configs/Config.toml...
2024-05-04T04:35:17.008Z INFO [ai00_server::middleware] ModelInfo { version: V5, num_layer: 32, num_emb: 2560, num_hidden: 8960, num_vocab: 65536, num_head: 40, time_mix_adapter_size: 0, time_decay_adapter_size: 0, }
2024-05-04T04:35:17.008Z INFO [ai00_server::middleware] type: SafeTensors
2024-05-04T04:35:17.013Z WARN [wgpu_hal::gles::egl] EGL_MESA_platform_surfaceless not available. Using default platform
2024-05-04T04:35:17.027Z WARN [wgpu_hal::gles::egl] No config found!
2024-05-04T04:35:17.027Z WARN [wgpu_hal::gles::egl] No config found!
2024-05-04T04:35:17.048Z INFO [ai00_server] server started at 0.0.0.0:65530 with tls
2024-05-04T04:35:17.071Z ERROR [ai00_server::middleware] reload model failed: failed to request device

Currently, embed_device is set to Cpu in Config.toml. I also have an Nvidia 22C (24GB) GPU, and it gives the same error when I set it to Gpu. Result of lsb_release -a:

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.6 LTS
Release:        18.04
Codename:       bionic

Config.toml:

[model]
embed_device = "Cpu"                                       # Device to put the embed tensor ("Cpu" or "Gpu").
max_batch = 8                                              # The maximum batches that are cached on GPU.
model_name = "RWKV-5-World-3B-v2-20231113-ctx4096.st" # Name of the model.
model_path = "assets/models"                               # Path to the folder containing all models.
quant = 32                                                  # Layers to be quantized.
quant_type = "Int8"                                        # Quantization type ("Int8" or "NF4").
stop = ["\n\n"]                                            # Additional stop words in generation.
token_chunk_size = 128                                     # Size of token chunk that is inferred at once. For high end GPUs, this could be 64 or 128 (faster).

Can anyone help? Thanks

cryscan commented 2 months ago

Ai00 requires Vulkan. Does your GPU support Vulkan, and do you have the Vulkan driver installed? You can check with vulkaninfo.
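For example, a quick check from a shell looks something like this (assuming Ubuntu with apt; the package name and exact output are assumptions, and on 18.04 the utility may ship in vulkan-utils rather than vulkan-tools):

# Install the vulkaninfo utility if it is missing (assumed package name).
sudo apt install vulkan-tools    # on Ubuntu 18.04 this may be: sudo apt install vulkan-utils

# If the Vulkan loader can see the GPU, it shows up as a deviceName entry.
vulkaninfo | grep -i devicename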

ItsCRC commented 2 months ago

I just checked. Vulkan drivers are not installed. Let me install and then come back.

ItsCRC commented 1 month ago

@cryscan I have an Nvidia A10G card, and the driver version is 515.65.01 with CUDA 11.7. How can I install the Vulkan drivers? I can't seem to find a good article on installing them.

cgisky1980 commented 1 month ago

https://developer.nvidia.com/vulkan-driver
https://docs.nvidia.com/grid/13.0/grid-vgpu-user-guide/index.html
https://browser.geekbench.com/vulkan-benchmarks
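For what it's worth, the proprietary NVIDIA Linux driver (515.65.01 included) already ships its own Vulkan ICD, so on top of a working driver you usually only need the distribution's Vulkan loader and tools. A rough sketch, assuming Ubuntu/apt and the default ICD manifest location (both are assumptions for this setup):

# Vulkan loader plus utilities (package names are Ubuntu's; 18.04 uses vulkan-utils).
sudo apt install libvulkan1 vulkan-tools

# The NVIDIA driver normally installs an ICD manifest here; if nothing is listed, reinstall the driver.
ls /usr/share/vulkan/icd.d/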

lingfengchencn commented 1 month ago
==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.283

I am hitting the same issue:

2024-05-16T11:06:22.752Z INFO  [ai00_server::middleware] type: SafeTensors
error: XDG_RUNTIME_DIR not set in the environment.
error: XDG_RUNTIME_DIR not set in the environment.
2024-05-16T11:06:22.802Z ERROR [ai00_server::middleware] reload model failed: failed to request device
2024-05-16T11:06:22.807Z INFO  [ai00_server] server started at 0.0.0.0:65530 with tls
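The XDG_RUNTIME_DIR errors typically show up when the server is launched from a headless or SSH session where no login session has set that variable. Exporting it before starting the server is a common workaround (an assumption on my part; it may not be what is behind the failed to request device error):

# XDG_RUNTIME_DIR is normally set by the login session; headless/SSH shells often lack it.
export XDG_RUNTIME_DIR=/run/user/$(id -u)

# Fall back to a private temporary directory if that path does not exist.
[ -d "$XDG_RUNTIME_DIR" ] || export XDG_RUNTIME_DIR=$(mktemp -d)

cargo run --release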