kherud / java-llama.cpp

Java Bindings for llama.cpp - A Port of Facebook's LLaMA model in C/C++
MIT License

Exception Access Violation in Windows x86-64 when running in Windows 11 and default libraries (in release) #83

Open kyselat opened 1 month ago

kyselat commented 1 month ago

When running newer versions (3.3.0 and higher) with any model, the JVM crashes:

Extracted 'ggml.dll' to 'C:\Users\user\AppData\Local\Temp\ggml.dll'
Extracted 'llama.dll' to 'C:\Users\user\AppData\Local\Temp\llama.dll'
Extracted 'jllama.dll' to 'C:\Users\user\AppData\Local\Temp\jllama.dll'

You are an AI assistant who gives a quality response to whatever users ask of you. Problem: Write a short poem about deep blue sea.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ffdbd4a2f58, pid=25224, tid=18764
#
# JRE version: Java(TM) SE Runtime Environment (23.0+37) (build 23+37-2369)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23+37-2369, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
# Problematic frame:
# C  [msvcp140.dll+0x12f58]
#
# No core dump will be written. Minidumps are not enabled by default on client versions of Windows
#
# An error report file with more information is saved as:
# C:\eclipse\workspace\llama\hs_err_pid25224.log
[21.786s][warning][os] Loading hsdis library failed

Using: the latest Visual Studio 2022 redistributable for this platform.

The jar was downloaded via Maven (https://jar-download.com/artifacts/de.kherud/llama). Previous versions up to llama-3.2.1.jar worked without error from the same source.
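
Since the problematic frame is msvcp140.dll, I suspect a mismatch between the DLLs bundled in the jar and the MSVC runtime on this machine. A possible (untested) workaround, assuming the de.kherud.llama.lib.path system property described in the README still applies to 3.4.x, would be to build the natives locally and point the loader at those DLLs instead of the ones extracted to %TEMP%. The path below is made up; the directory would need to contain ggml.dll, llama.dll and jllama.dll built against the local MSVC runtime:

package org.example;

import de.kherud.llama.InferenceParameters;
import de.kherud.llama.LlamaModel;
import de.kherud.llama.LlamaOutput;
import de.kherud.llama.ModelParameters;

public class LocalLibsExample {

    public static void main(String... args) {
        // Untested sketch: load locally built natives instead of the DLLs
        // extracted from the jar. Assumes the de.kherud.llama.lib.path system
        // property from the README is read before LlamaModel loads its natives,
        // and that C:\llama\native (a placeholder path) holds ggml.dll,
        // llama.dll and jllama.dll compiled with the local MSVC toolchain.
        System.setProperty("de.kherud.llama.lib.path", "C:\\llama\\native");

        ModelParameters modelParams = new ModelParameters()
                .setModelFilePath("models/mistral-7b-instruct-v0.2.Q2_K.gguf");

        try (LlamaModel model = new LlamaModel(modelParams)) {
            InferenceParameters inferParams =
                    new InferenceParameters("Write a short poem about the deep blue sea.\n");
            for (LlamaOutput output : model.generate(inferParams)) {
                System.out.print(output);
            }
        }
    }
}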

boaoqian commented 2 weeks ago

same issue

JakobPogacnikSouvent commented 1 week ago

I'm also having the same issue. My project is set up in the following way.

File structure

java-llamacpp-examples/
├─ models/
│  ├─ mistral-7b-instruct-v0.2.Q2_K.gguf
├─ src/
│  ├─ main/
│  │  ├─ java/
│  │  │  ├─ org.example/
│  │  │  │  ├─ Example.java
├─ pom.xml

Maven dependency

I have added the dependency to my pom.xml file.

<dependency>
    <groupId>de.kherud</groupId>
    <artifactId>llama</artifactId>
    <version>3.4.1</version>
</dependency>

Code

Example.java contains the following code.

package org.example;

import de.kherud.llama.InferenceParameters;
import de.kherud.llama.LlamaModel;
import de.kherud.llama.LlamaOutput;
import de.kherud.llama.ModelParameters;
import de.kherud.llama.args.MiroStat;

public class Example {

    public static void main(String... args) {
        ModelParameters modelParams = new ModelParameters()
                .setModelFilePath("models/mistral-7b-instruct-v0.2.Q2_K.gguf")
                .setNGpuLayers(43);

        String system = "This is a conversation between User and Llama, a friendly chatbot.\n" +
                "Llama is helpful, kind, honest, good at writing, and never fails to answer any " +
                "requests immediately and with precision.\n";

        try (LlamaModel model = new LlamaModel(modelParams)) {
            System.out.print(system);
            String prompt = system;

            prompt += "\nUser: Why is the sky blue?";
            prompt += "\nLlama: ";

            InferenceParameters inferParams = new InferenceParameters(prompt)
                    .setTemperature(0.7f)
                    .setPenalizeNl(true)
                    .setMiroStat(MiroStat.V2)
                    .setStopStrings("\n");

            for (LlamaOutput output : model.generate(inferParams)) {
                System.out.print(output);
            }
        }
    }
}

Error

When I try to run Example.java, I get the following output:

Extracted 'ggml.dll' to 'C:\Users\<username>\AppData\Local\Temp\ggml.dll'
Extracted 'llama.dll' to 'C:\Users\<username>\AppData\Local\Temp\llama.dll'
Extracted 'jllama.dll' to 'C:\Users\<username>\AppData\Local\Temp\jllama.dll'
[WARN] Not compiled with GPU offload support, --n-gpu-layers option will be ignored. See main README.md for information on enabling GPU BLAS support n_gpu_layers=-1
[INFO] build info build=3534 commit="641f5dd2"
[INFO] system info n_threads=4 n_threads_batch=-1 total_threads=8 system_info="AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | "
llama_model_loader: loaded meta data with 24 key-value pairs and 291 tensors from models/mistral-7b-instruct-v0.2.Q2_K.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
.
.
<more model data>
.
.
llama_new_context_with_model: graph splits = 1
[INFO] initializing slots n_slots=1
[INFO] new slot id_slot=0 n_ctx_slot=32768
[INFO] model loaded
[INFO] chat template chat_example="[INST] You are a helpful assistant\nHello [/INST]Hi there</s>[INST] How are you? [/INST]" built_in=true
This is a conversation between User and Llama, a friendly chatbot.
Llama is helpful, kind, honest, good at writing, and never fails to answer any requests immediately and with precision.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ff9caba2f58, pid=10252, tid=12776
#
# JRE version: Java(TM) SE Runtime Environment (23.0+37) (build 23+37-2369)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23+37-2369, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
# Problematic frame:
# C  [msvcp140.dll+0x12f58]
#
# No core dump will be written. Minidumps are not enabled by default on client versions of Windows
#
# An error report file with more information is saved as:
# C:\Users\<username>\Documents\<path-to-project>\java-llamacpp-examples\hs_err_pid10252.log
[1.981s][warning][os] Loading hsdis library failed
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
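
The model itself loads fine ([INFO] model loaded) and the system prompt is printed, so the access violation seems to happen inside the generate() call rather than during model loading. A stripped-down variant I plan to try next, to see whether the non-streaming path crashes too (assuming complete() in 3.4.1 works like the README's completion example; the prompt is just a placeholder):

package org.example;

import de.kherud.llama.InferenceParameters;
import de.kherud.llama.LlamaModel;
import de.kherud.llama.ModelParameters;

public class MinimalExample {

    public static void main(String... args) {
        ModelParameters modelParams = new ModelParameters()
                .setModelFilePath("models/mistral-7b-instruct-v0.2.Q2_K.gguf");

        try (LlamaModel model = new LlamaModel(modelParams)) {
            // Reaching this point means construction succeeded, so a crash
            // below would point at the generation path, not model loading.
            System.out.println("model constructed");

            InferenceParameters inferParams =
                    new InferenceParameters("User: Why is the sky blue?\nLlama: ");

            // Non-streaming completion instead of iterating generate();
            // should return the whole answer as a single String.
            System.out.println(model.complete(inferParams));
        }
    }
}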