[Open] amirsohail208 opened 1 week ago
Log:

```
[2024-06-28 15:00:36.897] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:132: server_version: 0.7.0
[2024-06-28 15:00:36.897] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:140: model_name: Meta-Llama-3-8B-Instruct-Q5_K_M,nomic-embed-text-v1.5.f16
[2024-06-28 15:00:36.897] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:148: model_alias: default,embedding
[2024-06-28 15:00:36.897] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:162: ctx_size: 4096,512
[2024-06-28 15:00:36.897] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:176: batch_size: 4096,512
[2024-06-28 15:00:36.897] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:190: prompt_template: llama-3-chat,embedding
[2024-06-28 15:00:36.897] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:199: rag_prompt: You are a tour guide in London, UK. Use information in the following context to directly answer the question from a London visitor.\n----------------\n
[2024-06-28 15:00:36.897] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:220: qdrant_url: http://127.0.0.1:6333
[2024-06-28 15:00:36.897] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:223: qdrant_collection_name: default
[2024-06-28 15:00:36.897] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:226: qdrant_limit: 1
[2024-06-28 15:00:36.898] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:229: qdrant_score_threshold: 0.5
[2024-06-28 15:00:36.898] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:240: chunk_capacity: 100
[2024-06-28 15:00:36.898] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:243: rag_policy: system-message
[2024-06-28 15:00:36.898] [wasi_logging_stdout] [info] llama-core: llama_core in /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/llama-core-0.11.3/src/lib.rs:493: Initializing the core context for RAG scenarios
[2024-06-28 15:00:36.900] [info] [WASI-NN] GGML backend: LLAMA_COMMIT 3d7ebf63
[2024-06-28 15:00:36.900] [info] [WASI-NN] GGML backend: LLAMA_BUILD_NUMBER 3075
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: loaded meta data with 21 key-value pairs and 291 tensors from Meta-Llama-3-8B-Instruct-Q5_K_M.gguf (version GGUF V3 (latest))
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 0: general.architecture str = llama
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 1: general.name str = Meta-Llama-3-8B-Instruct
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 2: llama.block_count u32 = 32
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 3: llama.context_length u32 = 8192
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 10: general.file_type u32 = 17
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
[2024-06-28 15:00:36.978] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
[2024-06-28 15:00:36.996] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 14: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
[2024-06-28 15:00:37.005] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 15: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
[2024-06-28 15:00:37.040] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 16: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
[2024-06-28 15:00:37.040] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
[2024-06-28 15:00:37.040] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 128001
[2024-06-28 15:00:37.040] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 19: tokenizer.chat_template str = {% set loop_messages = messages %}{% ...
[2024-06-28 15:00:37.040] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 20: general.quantization_version u32 = 2
[2024-06-28 15:00:37.040] [info] [WASI-NN] llama.cpp: llama_model_loader: - type f32: 65 tensors
[2024-06-28 15:00:37.040] [info] [WASI-NN] llama.cpp: llama_model_loader: - type q5_K: 193 tensors
[2024-06-28 15:00:37.040] [info] [WASI-NN] llama.cpp: llama_model_loader: - type q6_K: 33 tensors
[2024-06-28 15:00:37.321] [warning] [WASI-NN] llama.cpp: llm_load_vocab: missing pre-tokenizer type, using: 'default'
[2024-06-28 15:00:37.321] [warning] [WASI-NN] llama.cpp: llm_load_vocab:
[2024-06-28 15:00:37.321] [warning] [WASI-NN] llama.cpp: llm_load_vocab: ****
[2024-06-28 15:00:37.321] [warning] [WASI-NN] llama.cpp: llm_load_vocab: GENERATION QUALITY WILL BE DEGRADED!
[2024-06-28 15:00:37.321] [warning] [WASI-NN] llama.cpp: llm_load_vocab: CONSIDER REGENERATING THE MODEL
[2024-06-28 15:00:37.321] [warning] [WASI-NN] llama.cpp: llm_load_vocab: ****
[2024-06-28 15:00:37.321] [warning] [WASI-NN] llama.cpp: llm_load_vocab:
[2024-06-28 15:00:37.379] [info] [WASI-NN] llama.cpp: llm_load_vocab: special tokens cache size = 256
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_vocab: token to piece cache size = 0.8000 MB
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: format = GGUF V3 (latest)
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: arch = llama
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: vocab type = BPE
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_vocab = 128256
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_merges = 280147
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_ctx_train = 8192
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_embd = 4096
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_head = 32
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_head_kv = 8
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_layer = 32
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_rot = 128
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_embd_head_k = 128
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_embd_head_v = 128
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_gqa = 4
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_embd_k_gqa = 1024
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_embd_v_gqa = 1024
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: f_norm_eps = 0.0e+00
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: f_norm_rms_eps = 1.0e-05
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: f_clamp_kqv = 0.0e+00
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: f_max_alibi_bias = 0.0e+00
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: f_logit_scale = 0.0e+00
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_ff = 14336
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_expert = 0
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_expert_used = 0
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: causal attn = 1
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: pooling type = 0
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: rope type = 0
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: rope scaling = linear
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: freq_base_train = 500000.0
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: freq_scale_train = 1
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_yarn_orig_ctx = 8192
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: rope_finetuned = unknown
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: ssm_d_conv = 0
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: ssm_d_inner = 0
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: ssm_d_state = 0
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: ssm_dt_rank = 0
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: model type = 8B
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: model ftype = Q5_K - Medium
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: model params = 8.03 B
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: model size = 5.33 GiB (5.70 BPW)
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: general.name = Meta-Llama-3-8B-Instruct
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: BOS token = 128000 '<|begin_of_text|>'
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: EOS token = 128001 '<|end_of_text|>'
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: LF token = 128 'Ä'
[2024-06-28 15:00:37.446] [info] [WASI-NN] llama.cpp: llm_load_print_meta: EOT token = 128009 '<|eot_id|>'
[2024-06-28 15:00:37.452] [info] [WASI-NN] llama.cpp: llm_load_tensors: ggml ctx size = 0.15 MiB
[2024-06-28 15:05:04.833] [info] [WASI-NN] llama.cpp: llm_load_tensors: CPU buffer size = 5459.93 MiB
[2024-06-28 15:05:04.839] [info] [WASI-NN] llama.cpp:
[2024-06-28 15:05:05.303] [info] [WASI-NN] GGML backend: llama_system_info: AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
[2024-06-28 15:05:05.317] [info] [WASI-NN] GGML backend: LLAMA_COMMIT 3d7ebf63
[2024-06-28 15:05:05.317] [info] [WASI-NN] GGML backend: LLAMA_BUILD_NUMBER 3075
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: loaded meta data with 22 key-value pairs and 112 tensors from nomic-embed-text-v1.5.f16.gguf (version GGUF V3 (latest))
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 0: general.architecture str = nomic-bert
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 1: general.name str = nomic-embed-text-v1.5
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 2: nomic-bert.block_count u32 = 12
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 3: nomic-bert.context_length u32 = 2048
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 4: nomic-bert.embedding_length u32 = 768
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 5: nomic-bert.feed_forward_length u32 = 3072
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 6: nomic-bert.attention.head_count u32 = 12
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 7: nomic-bert.attention.layer_norm_epsilon f32 = 0.000000
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 8: general.file_type u32 = 1
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 9: nomic-bert.attention.causal bool = false
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 10: nomic-bert.pooling_type u32 = 1
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 11: nomic-bert.rope.freq_base f32 = 1000.000000
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 12: tokenizer.ggml.token_type_count u32 = 2
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 13: tokenizer.ggml.bos_token_id u32 = 101
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 14: tokenizer.ggml.eos_token_id u32 = 102
[2024-06-28 15:05:05.352] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 15: tokenizer.ggml.model str = bert
[2024-06-28 15:05:05.356] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 16: tokenizer.ggml.tokens arr[str,30522] = ["[PAD]", "[unused0]", "[unused1]", "...
[2024-06-28 15:05:05.364] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 17: tokenizer.ggml.scores arr[f32,30522] = [-1000.000000, -1000.000000, -1000.00...
[2024-06-28 15:05:05.366] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 18: tokenizer.ggml.token_type arr[i32,30522] = [3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
[2024-06-28 15:05:05.366] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 19: tokenizer.ggml.unknown_token_id u32 = 100
[2024-06-28 15:05:05.366] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 20: tokenizer.ggml.seperator_token_id u32 = 102
[2024-06-28 15:05:05.366] [info] [WASI-NN] llama.cpp: llama_model_loader: - kv 21: tokenizer.ggml.padding_token_id u32 = 0
[2024-06-28 15:05:05.367] [info] [WASI-NN] llama.cpp: llama_model_loader: - type f32: 51 tensors
[2024-06-28 15:05:05.367] [info] [WASI-NN] llama.cpp: llama_model_loader: - type f16: 61 tensors
[2024-06-28 15:05:05.382] [info] [WASI-NN] llama.cpp: llm_load_vocab: special tokens cache size = 5
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_vocab: token to piece cache size = 0.2032 MB
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: format = GGUF V3 (latest)
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: arch = nomic-bert
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: vocab type = WPM
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_vocab = 30522
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_merges = 0
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_ctx_train = 2048
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_embd = 768
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_head = 12
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_head_kv = 12
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_layer = 12
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_rot = 64
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_embd_head_k = 64
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_embd_head_v = 64
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_gqa = 1
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_embd_k_gqa = 768
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_embd_v_gqa = 768
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: f_norm_eps = 1.0e-12
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: f_norm_rms_eps = 0.0e+00
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: f_clamp_kqv = 0.0e+00
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: f_max_alibi_bias = 0.0e+00
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: f_logit_scale = 0.0e+00
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_ff = 3072
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_expert = 0
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_expert_used = 0
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: causal attn = 0
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: pooling type = 1
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: rope type = 2
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: rope scaling = linear
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: freq_base_train = 1000.0
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: freq_scale_train = 1
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: n_yarn_orig_ctx = 2048
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: rope_finetuned = unknown
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: ssm_d_conv = 0
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: ssm_d_inner = 0
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: ssm_d_state = 0
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: ssm_dt_rank = 0
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: model type = 137M
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: model ftype = F16
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: model params = 136.73 M
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: model size = 260.86 MiB (16.00 BPW)
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: general.name = nomic-embed-text-v1.5
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: BOS token = 101 '[CLS]'
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: EOS token = 102 '[SEP]'
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: UNK token = 100 '[UNK]'
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: SEP token = 102 '[SEP]'
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: PAD token = 0 '[PAD]'
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: CLS token = 101 '[CLS]'
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: MASK token = 103 '[MASK]'
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_print_meta: LF token = 0 '[PAD]'
[2024-06-28 15:05:05.392] [info] [WASI-NN] llama.cpp: llm_load_tensors: ggml ctx size = 0.06 MiB
[2024-06-28 15:05:21.104] [info] [WASI-NN] llama.cpp: llm_load_tensors: CPU buffer size = 260.86 MiB
[2024-06-28 15:05:21.104] [info] [WASI-NN] llama.cpp:
[2024-06-28 15:05:21.105] [info] [WASI-NN] GGML backend: llama_system_info: AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
[2024-06-28 15:05:21.118] [wasi_logging_stdout] [info] llama-core: llama_core in /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/llama-core-0.11.3/src/lib.rs:548: running mode: rag
[2024-06-28 15:05:21.119] [wasi_logging_stdout] [info] llama-core: llama_core in /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/llama-core-0.11.3/src/lib.rs:561: The core context for RAG scenarios has been initialized
[2024-06-28 15:05:21.119] [wasi_logging_stdout] [info] llama-core: llama_core in /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/llama-core-0.11.3/src/lib.rs:571: Getting the plugin info
[2024-06-28 15:05:21.119] [wasi_logging_stdout] [info] llama-core: llama_core in /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/llama-core-0.11.3/src/lib.rs:757: Get the running mode.
[2024-06-28 15:05:21.119] [wasi_logging_stdout] [info] llama-core: llama_core in /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/llama-core-0.11.3/src/lib.rs:782: running mode: rag
[2024-06-28 15:05:21.119] [wasi_logging_stdout] [info] llama-core: llama_core in /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/llama-core-0.11.3/src/lib.rs:651: Getting the plugin info by the graph named Meta-Llama-3-8B-Instruct-Q5_K_M
[2024-06-28 15:05:21.119] [wasi_logging_stdout] [info] llama-core: llama_core::utils in /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/llama-core-0.11.3/src/utils.rs:158: Get the output buffer generated by the model named Meta-Llama-3-8B-Instruct-Q5_K_M in the non-stream mode.
[2024-06-28 15:05:21.138] [wasi_logging_stdout] [info] llama-core: llama_core::utils in /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/llama-core-0.11.3/src/utils.rs:176: Output buffer size: 95
[2024-06-28 15:05:21.139] [wasi_logging_stdout] [info] llama-core: llama_core in /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/llama-core-0.11.3/src/lib.rs:711: Plugin info: b3075(commit 3d7ebf63)
[2024-06-28 15:05:21.140] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:343: plugin_ggml_version: b3075 (commit 3d7ebf63)
[2024-06-28 15:05:21.141] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:353: socket_address: 0.0.0.0:8080
[2024-06-28 15:05:21.142] [wasi_logging_stdout] [info] server_config: rag_api_server in src/main.rs:360: gaianet_node_version: 0.1.2
```
@apepkuss Could you help check it out?
Hey, I faced the same issue a while back, but re-running `curl -sSfL 'https://github.com/GaiaNet-AI/gaianet-node/releases/latest/download/install.sh' | bash` seemed to solve it for me.
If the issue still persists, I recommend reinstalling and upgrading gaianet by running `curl -sSfL 'https://github.com/GaiaNet-AI/gaianet-node/releases/latest/download/install.sh' | bash -s -- --reinstall`.
Note: when you run the reinstall command, the gaianet folder is reset, so back up the models and any other important files in that folder first.
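The backup step could look like the sketch below. It assumes the default install location `$HOME/gaianet` (adjust `SRC` if you installed elsewhere); the `GAIANET_DIR` override variable is my own convention, not part of the installer.

```shell
#!/usr/bin/env sh
# Back up the gaianet folder before running the installer with --reinstall.
# SRC: default install location is $HOME/gaianet; override via GAIANET_DIR if needed.
SRC="${GAIANET_DIR:-$HOME/gaianet}"
BACKUP="$HOME/gaianet-backup-$(date +%Y%m%d-%H%M%S).tar.gz"

if [ -d "$SRC" ]; then
  # Archive the whole folder (models, config, keystore) into one tarball.
  tar -czf "$BACKUP" -C "$(dirname "$SRC")" "$(basename "$SRC")"
  echo "Backed up $SRC to $BACKUP"
else
  echo "No $SRC directory found; nothing to back up."
fi
```

After the reinstall finishes, you can restore with `tar -xzf "$BACKUP" -C "$HOME"` and copy back whatever you need.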
After starting, it stops automatically with no error, and I'm still not getting the node URL.
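One generic way to narrow down the "no URL" symptom: the log above shows the API server binding to `0.0.0.0:8080`, so you can probe that port locally right after starting. This is a plain `curl` check, not a GaiaNet-specific command, and the log directory path is an assumption based on the default install layout.

```shell
#!/usr/bin/env sh
# Probe the port the rag-api-server bound to (8080, per the log above).
# If nothing answers, the server process most likely exited after startup.
if curl -s -o /dev/null --max-time 5 "http://127.0.0.1:8080/"; then
  echo "server is reachable on :8080"
else
  # ASSUMPTION: logs live under $HOME/gaianet/log in a default install.
  echo "nothing listening on :8080; check the files under \$HOME/gaianet/log for the exit reason"
fi
```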