[X] I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
I expect the webui to display output even when generation takes longer than 100 seconds.
Current Behavior
I am submitting a prompt of 2300+ tokens to test models with this service, in order to determine the best text generation webui to run on my homelab. The error appears exactly 100 seconds after submitting the prompt. Given the size of this prompt, I expect generation to take longer than 100 seconds, so I am trying to stop the webui from erroring out at that point. The terminal on the server side shows the prompt and the response with no problem; only the webui errors out and fails to display the response. Context size: 4096. Model: GGML SuperHOT Pygmalion.
Physical (or virtual) hardware you are using, e.g. for Linux:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 25
On-line CPU(s) list: 0-24
Thread(s) per core: 1
Core(s) per socket: 25
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
Stepping: 2
CPU MHz: 2299.998
BogoMIPS: 4599.99
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 800 KiB
L1i cache: 800 KiB
L2 cache: 100 MiB
L3 cache: 16 MiB
NUMA node0 CPU(s): 0-24
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Mitigation; PTE Inversion; VMX flush not necessary, SMT disabled
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat umip md_clear arch_capabilities
Operating System, e.g. for Linux:
Linux debian 5.10.0-23-amd64 #1 SMP Debian 5.10.179-1 (2023-05-12) x86_64 GNU/Linux
Browsers used to test the error
Brave: 1.52.126
Chrome: 114.0.5735.199
Firefox: 113.0.1 (the error message differs in this browser: Error while submitting prompt: SyntaxError: JSON.parse: unexpected character at line 1 column 1 of the JSON data)
SDK version, e.g. for Linux:
Python 3.9.2
GNU Make 4.3
Built for x86_64-pc-linux-gnu
g++ (Debian 10.2.1-6) 10.2.1 20210110
Failure Information (for bugs)
Server: Generate: The response could not be sent, maybe connection was terminated?
Client: Error while submitting prompt: SyntaxError: Unexpected token '<', "DOCTYPE"... is not a valid JSON
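The client-side message is what one would expect when the webui tries to parse an HTML page as JSON, for example an error page served by a proxy in place of the API response. A minimal sketch of that failure mode follows; `parse_api_body` and the two sample bodies are hypothetical names and made-up data for illustration, not part of KoboldCpp or Kobold Lite.

```python
import json


def parse_api_body(body: str):
    """Parse an API response body, flagging HTML pages explicitly.

    Hypothetical helper for illustration only.
    """
    stripped = body.lstrip()
    if stripped.startswith("<"):
        # An HTML document (e.g. a proxy timeout page) arrived instead of JSON.
        raise ValueError("received HTML instead of JSON; possibly a proxy error page")
    return json.loads(stripped)


# Sample bodies, made up for this sketch.
ok_body = '{"results": [{"text": "hello"}]}'
html_body = "<!DOCTYPE html><html><body>Gateway timeout</body></html>"

print(parse_api_body(ok_body)["results"][0]["text"])

try:
    parse_api_body(html_body)
except ValueError as err:
    print(f"error: {err}")
```

Checking the first non-whitespace character before parsing turns the opaque SyntaxError into a message that points at the proxy layer rather than at the generation backend.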
Steps to Reproduce
Build and deploy the service using Nginx and Cloudflare
Load pygmalion-13b-superhot-8k.ggmlv3.q4_K_M.bin
Submit a prompt over 2048 tokens in length and let generation run past 100 seconds.
Failure Logs
username@debian:/media/username/Storage/koboldcpp-1.33$ python3 koboldcpp.py pygmalion-13b-superhot-8k.ggmlv3.q4_K_M.bin 6969 --contextsize 4096
Welcome to KoboldCpp - Version 1.33
Attempting to use OpenBLAS library for faster prompt ingestion. A compatible libopenblas will be required.
Initializing dynamic library: koboldcpp_openblas.so
Sounds like it is a timeout issue with your browser or proxy. Sometimes if a connection is open for too long, it can get terminated. Can you try with a different browser and/or proxy provider?
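One way to narrow this down is to send the same request once directly to the KoboldCpp port and once through the proxied hostname, timing each. Cloudflare documents a default limit of 100 seconds for an origin to respond before it returns a 524 error, which would match the cutoff reported above, though that is a hypothesis to verify rather than a confirmed cause. The sketch below assumes the KoboldAI-style `/api/v1/generate` endpoint accepting a JSON body with `prompt` and `max_length`; `build_request`, `timed_generate`, and the example URLs are hypothetical.

```python
import json
import time
import urllib.error
import urllib.request


def build_request(base_url: str, prompt: str, max_length: int = 80) -> urllib.request.Request:
    """Build a POST request for the (assumed) /api/v1/generate endpoint."""
    payload = json.dumps({"prompt": prompt, "max_length": max_length}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def timed_generate(base_url: str, prompt: str, timeout: float = 600.0) -> dict:
    """Send one generate request and report how long it took and how it ended."""
    request = build_request(base_url, prompt)
    start = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status, body = response.status, response.read()
    except urllib.error.HTTPError as err:
        # Proxies usually answer with an HTTP error status and an HTML page.
        status, body = err.code, err.read()
    except OSError as err:
        # Covers URLError, socket timeouts, and dropped connections.
        status, body = None, str(err).encode("utf-8")
    elapsed = time.monotonic() - start
    return {
        "elapsed_s": round(elapsed, 1),
        "status": status,
        "looks_like_html": body.lstrip().startswith(b"<"),
    }


# Example usage (placeholder URLs; run from a machine that can reach both):
# print(timed_generate("http://localhost:6969", long_prompt))
# print(timed_generate("https://kobold.example.com", long_prompt))
```

If the direct request succeeds after more than 100 seconds while the proxied one fails at roughly 100 seconds with an HTML body, the limit is being imposed by the proxy layer rather than by KoboldCpp.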
Environment and Context
OS: Debian 11
CPU: Xeon E5-2670 v3 (25 cores, virtualized)
RAM: 80 GB
GPU: None
Failure Logs
username@debian:/media/username/Storage/koboldcpp-1.33$ python3 koboldcpp.py pygmalion-13b-superhot-8k.ggmlv3.q4_K_M.bin 6969 --contextsize 4096
Welcome to KoboldCpp - Version 1.33
Attempting to use OpenBLAS library for faster prompt ingestion. A compatible libopenblas will be required.
Initializing dynamic library: koboldcpp_openblas.so
Loading model: /media/username/Storage/koboldcpp-1.33/pygmalion-13b-superhot-8k.ggmlv3.q4_K_M.bin [Threads: 11, BlasThreads: 11, SmartContext: False]
Identified as LLAMA model: (ver 5)
Attempting to Load...
System Info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
llama.cpp: loading model from /media/username/Storage/koboldcpp-1.33/pygmalion-13b-superhot-8k.ggmlv3.q4_K_M.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 4096
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 15 (mostly Q4_K - Medium)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 0.09 MB
llama_model_load_internal: mem required = 10572.94 MB (+ 1608.00 MB per state)
llama_new_context_with_model: kv self size = 3200.00 MB
Load Model OK: True
Embedded Kobold Lite loaded.
Starting Kobold HTTP Server on port 6969
Please connect to custom endpoint at http://localhost:6969
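As a sanity check on the figures in this log, the reported `kv self size` can be reproduced from the model dimensions. The sketch below assumes the key/value cache stores one key vector and one value vector of `n_embd` 16-bit floats per layer per context position; the 2-byte element size is an assumption, not something the log states, and `kv_cache_mib` is a hypothetical helper name.

```python
def kv_cache_mib(n_layer: int, n_ctx: int, n_embd: int, bytes_per_element: int = 2) -> float:
    """Estimate key/value cache size in MiB (assumes 2-byte f16 elements)."""
    # Two tensors (keys and values) per layer, each n_ctx x n_embd elements.
    total_bytes = 2 * n_layer * n_ctx * n_embd * bytes_per_element
    return total_bytes / (1024 * 1024)


# Values taken from the load log: n_layer = 40, n_ctx = 4096, n_embd = 5120.
print(kv_cache_mib(40, 4096, 5120))  # 3200.0, matching "kv self size = 3200.00 MB"
```

The match suggests the model loaded with the requested 4096-token context, so the memory figures in the log are consistent with the command line and do not point to a load-time problem.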