V-Sekai / godot-llama


Goal: functional godot_llama #2

Open · fire opened this issue 10 months ago

fire commented 10 months ago

@aiaimimi0920 You may be interested in this; I am not able to work on it much.

aiaimimi0920 commented 10 months ago

@fire Indeed, for my AI robot I have attempted to implement a plugin using llama.cpp (https://github.com/ggerganov/llama.cpp).

But the problems with running llama.cpp locally are:

  1. If you use a small model, such as a 7B, the generated responses are mostly unsatisfactory.
  2. If you use a large model, inference is slow on most machines. My machine is a 2080 Ti with 22 GB of VRAM, and it still cannot reach an inference speed I am satisfied with.

So I ultimately turned to these two options:

  1. Deploy llama.cpp on a server and call its API (see the sketch below)
  2. Call the interface of a hosted network platform, similar to https://github.com/xtekky/gpt4free
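
For option 1, a minimal GDScript sketch might look like the following. It assumes llama.cpp's example server is running locally and exposing its /completion endpoint on the default port 8080; the prompt and field values here are placeholders.

```gdscript
extends Node

# Minimal sketch (Godot 4): query a locally running llama.cpp example server.
# Start it first with something like: ./server -m model.gguf --port 8080
func _ready() -> void:
    var http := HTTPRequest.new()
    add_child(http)
    http.request_completed.connect(_on_request_completed)

    var body := JSON.stringify({
        "prompt": "You are a shopkeeper NPC. Greet the player.",  # placeholder prompt
        "n_predict": 64
    })
    http.request(
        "http://127.0.0.1:8080/completion",
        PackedStringArray(["Content-Type: application/json"]),
        HTTPClient.METHOD_POST,
        body
    )

func _on_request_completed(_result: int, _code: int, _headers: PackedStringArray, body: PackedByteArray) -> void:
    var reply = JSON.parse_string(body.get_string_from_utf8())
    if reply is Dictionary:
        print(reply.get("content", ""))  # the generated text
```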

If you are still interested in deploying llama in Godot, I may be able to provide some assistance.

But I am currently trying to replace Godot's TTS with SummerTTS (https://github.com/huakunyang/SummerTTS) compiled into a GDExtension; I hope my mimi AI can speak with emotional voices.

After finishing that, I will come back to help you.

fire commented 10 months ago

My plan before I decided to focus on V-Sekai was to evaluate performance on https://huggingface.co/TheBloke/Nous-Hermes-2-SOLAR-10.7B-GGUF

aiaimimi0920 commented 10 months ago

I haven't tested this model (https://huggingface.co/TheBloke/Nous-Hermes-2-SOLAR-10.7B-GGUF) yet; it seems to be a new one.

You can try running this model with llama.cpp; llama.cpp is easy to deploy and run.
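
For reference, a quick local test with llama.cpp's example binaries could look like this. The quantized filename is only an example; use whichever quantization from that repo fits your VRAM, and -ngl controls how many layers are offloaded to the GPU.

```sh
# one-shot prompt with the `main` example
./main -m nous-hermes-2-solar-10.7b.Q4_K_M.gguf -p "Hello, who are you?" -n 128 -ngl 99

# or serve it over HTTP with the `server` example
./server -m nous-hermes-2-solar-10.7b.Q4_K_M.gguf -c 4096 -ngl 99 --port 8080
```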

I believe the key criteria for evaluating this model are:

  1. Hardware configurations that most players are likely to have
  2. Reasonable response time
  3. Reasonable AI responses

I am happy to help you complete this project, as it is also something I need.

fire commented 10 months ago

GBNF grammars (https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md) can be used to constrain output to a JSON grammar, so you could, for example, generate NPC travel commands that drive an animation tree.
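
As an illustration, a hypothetical GBNF grammar for such a command might look like the sketch below; the action names and field layout are made up, but the grammar forces the model to emit one well-formed JSON object.

```
# hypothetical grammar: force output to a single JSON "travel command" object
root   ::= "{" ws "\"action\"" ws ":" ws action ws "," ws "\"target\"" ws ":" ws string ws "}"
action ::= "\"walk_to\"" | "\"run_to\"" | "\"look_at\"" | "\"wait\""
string ::= "\"" [a-zA-Z0-9_ ]* "\""
ws     ::= [ \t\n]*
```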

fire commented 10 months ago

With a sufficiently long context, as with RWKV, one could also use https://github.com/lucidrains/meshgpt-pytorch to generate Godot Engine resources, such as packed scenes.

By combining mesh generation with GBNF, the output might even be guaranteed to always be valid.

aiaimimi0920 commented 10 months ago

How did it go? Do you have performance test reports or anything like that?

I have roughly completed the compilation steps for the TTS (https://github.com/huakunyang/SummerTTS/issues/35), and I should be able to help you soon.

fire commented 10 months ago

I tried the SOLAR model in LM Studio and it was impressive until the context size was exceeded and it stopped working; I believe that is normal. There are evaluation metric charts on Hugging Face, but that's not the same as actually using it. With GPU offloading, the performance felt instant in LM Studio, which also uses llama.cpp. I believe RWKV is a llama.cpp fork for unlimited context.

aiaimimi0920 commented 10 months ago

Amazing, I will also test it. If possible, let's bring it to Godot

fire commented 10 months ago

I think the most interesting use case is generating Godot Engine resources (".tres" files) directly.

The .tscn example doesn't work well, but the JSON one works; a small hand-written .tres example follows after the link below.

  1. https://github.com/V-Sekai-fire/meshgpt-dataset-01
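
For context, a small hand-written .tres (Godot 4 text resource format) looks like this; a Gradient is used here only as an example of the kind of text the model would have to emit, and the properties depend entirely on the resource type.

```
[gd_resource type="Gradient" format=3]

[resource]
offsets = PackedFloat32Array(0, 1)
colors = PackedColorArray(1, 0, 0, 1, 0, 0, 1, 1)
```
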
fire commented 10 months ago

Here's a photo.

(image attachment)

MichaelrMentele commented 9 months ago

So, is the current approach still to run this against a remote/hosted model? I was really interested in running models locally; sure, they may be big and slowish at the moment, but they will get smaller over time! Also curious -- does this not overlap with V-Sekai/iree.gd?

fire commented 9 months ago

It does overlap. iree.gd is functional; this isn't.

fire commented 9 months ago

I'll probably archive godot-llama if there's no work done on it in the next few weeks.

fire commented 4 months ago

I revived godot-llama because of the exceptional performance of Phi-3. https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF

fire commented 4 months ago

Updated the llama-cpp library but ran out of time to update godot-llama. Looking for help.