ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Use-after-free in llama_build_graph #10363

Closed JohannesGaessler closed 2 hours ago

JohannesGaessler commented 2 hours ago

In llama_build_graph:

  1. An instance of llm_build_context is created.
  2. The function calls llm_build_context.init(), which in turn calls ggml_init and stores the resulting ggml_context pointer in llm_build_context.ctx0.
  3. A new graph is constructed using llm_build_context.ctx0.
  4. The function calls llm_build_context.free() which in turn calls ggml_free(llm_build_context.ctx0).
  5. The function returns the created graph.

Unless I'm missing something, this is a use-after-free bug, since the ggml_context in which the graph (and its tensors) were allocated is always freed before the graph is returned.

slaren commented 2 hours ago

The graph and tensors do not hold any references to the ggml_context; they are allocated from buf_compute_meta, which is stored in llama_context.

JohannesGaessler commented 2 hours ago

Thanks, I was missing that the buffer for the graph + tensors is allocated externally.