iboB closed 1 month ago
Windows-only build error in ggml-cuda: since the fix was small and had already been made in llama, instead of finding a revision that works, I made a temporary branch of our fork and attached the submodule to it: https://github.com/alpaca-core/ggml/compare/master...alpaca-core:ggml:tmp-fix-cuda
llama_reset_timings is gone. Comment out our uses and split this off into a separate issue.
whisper.cpp is missing the newly added argument to ggml_graph_plan. Passing nullptr seems safe. Instead of redirecting all submodules, make a temporary branch in our fork and attach the submodule to it:
The sampling logic in llama has completely changed. Our sampler doesn't compile. Working on this.
min_keep is documented and optional
We really need some resolution on #33 ... until then adding issues manually.
We should do this soon and again in the second half of October