V-Sekai / godot-llama

MIT License

Goal: functional godot_llama #2

Open · fire opened 8 months ago

fire commented 8 months ago

@aiaimimi0920 You may be interested in this; I am not able to work on it much.

aiaimimi0920 commented 8 months ago

@fire Indeed. For my AI robot, I have attempted to implement a plugin using llama.cpp (https://github.com/ggerganov/llama.cpp).

But the problems with running llama.cpp locally are:

  1. If you use a small model, such as a 7B one, the generated responses are basically unsatisfactory.
  2. If you use a large model, inference is slow on most machines. My machine has a 2080 Ti with 22 GB of VRAM, but it still cannot reach a satisfactory inference speed.

So I ultimately turned to these two options:

  1. Server deployment of llama.cpp plus API calls (a GDScript sketch follows below)
  2. Calling the API of an online platform, similar to https://github.com/xtekky/gpt4free
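
For option 1, here is a minimal GDScript sketch of the Godot side, assuming a llama.cpp server example running locally on port 8080 and exposing its `/completion` endpoint. The port, endpoint, and JSON field names are assumptions; check the server build you actually deploy.

```gdscript
extends Node

# Minimal sketch: send a prompt to a locally running llama.cpp server
# and print the generated text. Field names follow the llama.cpp server
# example's JSON API; adjust them to match the server version you run.

var _http: HTTPRequest

func _ready() -> void:
    _http = HTTPRequest.new()
    add_child(_http)
    _http.request_completed.connect(_on_request_completed)

    var body := JSON.stringify({
        "prompt": "You are an NPC guide in a small village. Greet the player.",
        "n_predict": 128
    })
    _http.request(
        "http://127.0.0.1:8080/completion",
        PackedStringArray(["Content-Type: application/json"]),
        HTTPClient.METHOD_POST,
        body
    )

func _on_request_completed(_result: int, code: int, _headers: PackedStringArray, data: PackedByteArray) -> void:
    if code != 200:
        push_error("llama.cpp server returned HTTP %d" % code)
        return
    var parsed: Variant = JSON.parse_string(data.get_string_from_utf8())
    if parsed is Dictionary and parsed.has("content"):
        print(parsed["content"])  # the generated completion text
```

The same pattern should also work against an OpenAI-compatible hosted endpoint by swapping the URL and JSON fields.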

If you are still interested in deploying llama.cpp inside Godot, I may be able to provide some assistance.

But I am currently trying to replace Godot's TTS by compiling SummerTTS (https://github.com/huakunyang/SummerTTS) into a GDExtension; I hope my Mimi AI can speak with emotional voices.

After finishing this, I will come help you.

fire commented 8 months ago

My plan, before I decided to focus on V-Sekai, was to evaluate performance on https://huggingface.co/TheBloke/Nous-Hermes-2-SOLAR-10.7B-GGUF

aiaimimi0920 commented 8 months ago

I haven't tested this model yet (https://huggingface.co/TheBloke/Nous-Hermes-2-SOLAR-10.7B-GGUF); it seems to be a new one.

You can try running this model with llama.cpp; llama.cpp is easy to deploy and run.
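
For example, after building llama.cpp, something along these lines should run a quantized GGUF from the command line. The exact binary and flag spelling have changed between llama.cpp versions (newer builds call the binary `llama-cli`), and the model filename below is just a placeholder for whichever quantization you download:

```sh
# Grab one of the quantized GGUF files from the Hugging Face page first.
# -c sets the context size, -ngl offloads layers to the GPU (if built with
# GPU support), and -n limits the number of generated tokens.
./main -m ./models/nous-hermes-2-solar-10.7b.Q4_K_M.gguf \
  -c 4096 -ngl 35 -n 256 \
  -p "You are an NPC in a fantasy village. Introduce yourself."
```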

I believe the key criteria for evaluating this model are:

  1. Machine configurations that most players are likely to have
  2. Reasonable response time
  3. Reasonable AI responses

I am happy to help you complete this project, as it is also what I need

fire commented 8 months ago

https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md can be used to constrain output to a JSON grammar, so one could, for example, generate animation tree setups or NPC travel commands.
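
As an illustration of that idea (my own sketch, not a grammar from this repo), a small GBNF grammar forcing the model to emit a JSON travel command for an NPC could look roughly like this; the command names are made up:

```gbnf
# Force output of the form: {"command": "goto", "target": "blacksmith"}
root    ::= "{" ws "\"command\"" ws ":" ws command ws "," ws "\"target\"" ws ":" ws string ws "}"
command ::= "\"goto\"" | "\"follow\"" | "\"wait\""
string  ::= "\"" [a-zA-Z0-9_ ]* "\""
ws      ::= [ \t\n]*
```

A grammar file like this can be passed to llama.cpp's `main` example with `--grammar-file`, and the repository also ships a generic `grammars/json.gbnf`.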

fire commented 8 months ago

With a sufficiently long context, as with RWKV, one could also use https://github.com/lucidrains/meshgpt-pytorch to generate Godot Engine resources such as packed scenes.

Combining mesh generation with GBNF, the output might even always be valid.

aiaimimi0920 commented 8 months ago

How did it go? Do you have performance test reports or anything like that?

I have roughly completed the compilation steps for TTS (https://github.com/huakunyang/SummerTTS/issues/35),

and I should be able to help you soon

fire commented 8 months ago

I tried using the SOLAR model in LM Studio, and it was impressive until the context size was exceeded and it stopped working; I believe that is normal. There are evaluation metric charts on Hugging Face, but that's not the same as actually using it. When offloaded to the GPU, responses felt instant in LM Studio, which also uses llama.cpp. I believe RWKV is a llama.cpp fork for unlimited context.

aiaimimi0920 commented 8 months ago

Amazing, I will also test it. If possible, let's bring it to Godot

fire commented 8 months ago

I think the most interesting use case is generating Godot Engine Resource ".tres" files directly.

The tscn example doesn't work well, but the JSON one works; a rough sketch of writing model output to a .tres follows after the link below.

  1. https://github.com/V-Sekai-fire/meshgpt-dataset-01
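
As a rough sketch of the ".tres" idea (my own illustration, not code from this repo, with a made-up JSON schema): once the model's JSON output has been parsed, it can be packed into a `Resource` and written out with `ResourceSaver`:

```gdscript
extends Node

# Hypothetical example: turn JSON emitted by the model into a .tres file.
# The schema (name / move_speed) is invented for illustration.

func save_generated_npc(json_text: String, path: String = "user://npc_profile.tres") -> void:
    var data: Variant = JSON.parse_string(json_text)
    if not (data is Dictionary):
        push_error("Model output was not valid JSON")
        return
    var res := Resource.new()
    res.set_meta("npc_name", str(data.get("name", "unknown")))
    res.set_meta("move_speed", float(data.get("move_speed", 1.0)))
    var err := ResourceSaver.save(res, path)
    if err != OK:
        push_error("Failed to save resource: %s" % error_string(err))
```

Going through JSON (optionally grammar-constrained) and serializing on the Godot side sidesteps having the model emit the .tres text format itself.
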
fire commented 8 months ago

Here's a photo.

[image]

MichaelrMentele commented 7 months ago

So, is the current approach still to run this against a remote/hosted model? I was really interested in running models locally; sure, they may be big and slowish at the moment, but they will get smaller over time! Also curious: does this not overlap with V-Sekai/iree.gd?

fire commented 7 months ago

It does overlap. Iree.gd is functional, this isn’t.

fire commented 7 months ago

I'll probably archive godot-llama if there's no work done on it in the next few weeks.

fire commented 2 months ago

I revived godot-llama because of the exceptional performance of Phi-3. https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF

fire commented 2 months ago

Updated the llama-cpp library but ran out of time to update godot-llama. Looking for help.