SciSharp / LLamaSharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
https://scisharp.github.io/LLamaSharp
MIT License

Godot game engine example #608

Open adammikulis opened 6 months ago

adammikulis commented 6 months ago

I have put together a plugin for the Godot game engine that uses LLamaSharp to load a model and interact directly within the editor. I'd like to submit it as a documentation example, but it could probably use some polishing first.

Currently I'm having an issue figuring out how to automatically add the NuGet packages like the Unity example does (it looks like Unity has NuGetForUnity but there is no Godot equivalent). If anyone knows how I can do this so that the user doesn't have to open Visual Studio and manage the solution that would be great. Please let me know what other changes/improvements I should make before it could be listed in the README, thanks.

AsakusaRinne commented 6 months ago

@eublefar @Xsanf Maybe you'll be interested in this issue. I hope this mention didn't disturb you. :)

eublefar commented 6 months ago

I am not familiar with Godot, but maybe you could just download all the NuGet dependencies and add them to the plugin manually? You can click Download package on the NuGet page, rename its extension to zip, and extract the DLLs from there. I am not sure if it will cause assembly conflicts, though.
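The manual extraction described above can also be scripted, since a .nupkg file is an ordinary zip archive. The following is a minimal sketch using only the .NET standard library; the class and method names (`NupkgExtractor`, `ExtractDlls`) are hypothetical and not part of LLamaSharp or Godot.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class NupkgExtractor
{
    // Copies every .dll found inside a .nupkg (a zip archive) into destDir,
    // keeping the package's internal folder layout (e.g. lib/..., runtimes/...).
    // Returns the number of files extracted.
    public static int ExtractDlls(string nupkgPath, string destDir)
    {
        int count = 0;
        string destRoot = Path.TrimEndingDirectorySeparator(Path.GetFullPath(destDir));

        using ZipArchive archive = ZipFile.OpenRead(nupkgPath);
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            if (!entry.FullName.EndsWith(".dll", StringComparison.OrdinalIgnoreCase))
                continue;

            string target = Path.GetFullPath(Path.Combine(destRoot, entry.FullName));

            // Skip any entry whose path would land outside destDir.
            if (!target.StartsWith(destRoot + Path.DirectorySeparatorChar, StringComparison.Ordinal))
                continue;

            Directory.CreateDirectory(Path.GetDirectoryName(target)!);
            entry.ExtractToFile(target, overwrite: true);
            count++;
        }
        return count;
    }
}
```

A call such as `NupkgExtractor.ExtractDlls("package.nupkg", "addons/mind_game/libs")` would then place the managed and native DLLs under the add-on folder; the file name here is a placeholder.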

adammikulis commented 6 months ago

Thanks @eublefar, I was wondering the best way to extract those files (aside from going through an existing build). I've added them to the plugin folder and will try to manually reference them in the project. I'm wondering if PR #603 (SetDllImportResolver based loading) will help with this, as the end-user of the plugin may not always have an Nvidia gpu. It seems like either way the user will need a .csproj file, making this more suitable as a template but unable to be added to the Godot AssetLib as a typical plugin.

eublefar commented 6 months ago

> end-user of the plugin may not always have an Nvidia gpu

You can add the CPU backend by default and add a comment on where to get the CUDA backend.

AsakusaRinne commented 6 months ago

> end-user of the plugin may not always have an Nvidia gpu

@adammikulis Maybe you could include both the CPU and CUDA native libraries and use LLama.Native.NativeLibraryConfig to select the backend dynamically at the start of the program. Note that installing both the CPU and CUDA backends is supported on the master branch but not in v0.10.0. If you want to use v0.10.0, please keep those files manually.

adammikulis commented 4 months ago

Thank you for the help! I've made some significant updates to my Godot add-on that bring it closer to being a useful example project. At this point I'm just trying to resolve the issue of properly referring to my backends in my .csproj.

I was able to successfully refer to a locally downloaded LLamaSharp.dll by adding a path reference to my .csproj (within an ItemGroup) but I am unable to get the Assemblies to recognize the Cuda backend I've manually added. Am I using the correct name for it? I also tried LLama and LLama.Native.

`addons\mind_game\libs\runtimes\win-x64\native\cuda12\llama.dll True`

image

I am calling NativeLibraryConfig.Instance.WithCuda(); as soon as the project scene loads, so I don't think that's the problem.

AsakusaRinne commented 4 months ago

runtimes\win-x64\native\cuda12\llama.dll is the correct path, but I'm not sure whether its prefix is correct. Since you already have a specific path to the native library, you can also use NativeLibraryConfig.Instance.WithLibrary.
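Before handing an explicit path to the loader, it may help to build the full path and confirm the file is really there. This is only a sketch: `BackendPath` and its methods are hypothetical helpers, and only the `runtimes\win-x64\native\cuda12\llama.dll` layout comes from the thread. It deliberately does not call LLamaSharp.

```csharp
using System;
using System.IO;

public static class BackendPath
{
    // Builds "<libsRoot>/runtimes/win-x64/native/<backendFolder>/llama.dll".
    // "cuda12" is the folder named in the thread; any other folder name
    // passed in is an assumption about the package layout.
    public static string Resolve(string libsRoot, string backendFolder)
    {
        return Path.GetFullPath(
            Path.Combine(libsRoot, "runtimes", "win-x64", "native", backendFolder, "llama.dll"));
    }

    // Same as Resolve, but throws if the file is missing so that a wrong
    // prefix is reported before any native loading is attempted.
    public static string ResolveExisting(string libsRoot, string backendFolder)
    {
        string path = Resolve(libsRoot, backendFolder);
        if (!File.Exists(path))
            throw new FileNotFoundException("Native backend not found", path);
        return path;
    }
}
```

The resulting string could then be passed to `NativeLibraryConfig.Instance.WithLibrary`, the method named above; its exact signature varies between LLamaSharp versions and is not verified here.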

BTW, it might be a stupid question, but may I ask whether a .NET runtime newer than .NET Core 3.1 can be used in Godot?

adammikulis commented 4 months ago

For some reason I cannot get Godot to use my GPU if the CPU backend is installed, so I'll omit that backend for now and am following #670. As I am distributing Mind Game as an engine add-on, many people who use it will be well-versed with Godot but possibly not so much with NuGet. The ability to dynamically download the correct backend will go a long way towards its usability both as a technical tool and as part of a finished game.

Godot 4.3 dev6 is .NET8-compatible, but they're still working on getting mobile exports to work. Web export will take additional time, but I should be able to get LLamaSharp working on basically every device via this addon.

I have recently updated the addon to save and autoload model/inference user configurations, so with some review it may be ready to be an example project.

AsakusaRinne commented 4 months ago

@adammikulis Hi, are you using .NET Standard 2.0?

adammikulis commented 4 months ago

Hi @AsakusaRinne, I am using .NET 8.0 for this project, should I use a lower version? I'm looking to have AOT compile so this runs well on mobile devices, and I don't think .NET standard 2.0 supports that. I think I'll just try out the experimental backend downloading when it's available, as I'm also looking to integrate downloading of the models themselves into the add-on (it seems like that's another feature you all are working on). This honestly isn't an issue yet since I don't think anyone has used this add-on, or if they did they must have had an Nvidia GPU since nobody has reported an issue.

AsakusaRinne commented 4 months ago

> should I use a lower version?

Nope, .NET 8 is better. I don't know much about Godot, so I assumed it was the same as Unity, which only supports .NET Standard 2.0. :) Currently all releases of LLamaSharp only support dynamic native library loading with .NET 6 or higher. Support for .NET Standard 2.0 will be added in #738.


I'll merge #688 in 2 days (with some documentation added), and then move on in #692.

I'd like to hear from you about the security concern of the backend-downloading feature. As mentioned in this comment, in some cases using this feature can lead to a security problem. After having #688 merged, it's possible to put the DLLs in your own remote resource and define the downloading process yourself. However I'm not sure whether this reduces the risk to the point where it can be used for game release.

adammikulis commented 4 months ago

Unity is known for its C# support, but ever since their runtime fee debacle there has been a lot of development with Godot to support C#/.NET better. Mobile export support should be available in the next few months, but HTML5 doesn't have a timeline.

I can't really argue with the security concern mentioned, despite the convenience that the auto-download would be. I think with LLamaSharp geared towards enterprise environments (or at least moreso than Python libraries), @dluc's point about the possible backdoor makes me wonder about the pushback such a feature could receive.

For my project, either I'm doing something wrong when trying to use NativeLibraryConfig.Instance.WithCuda(true); or there is some sort of backend loading that happens in Godot before the config is run (I have it as the first line executed in the first script that is autoloaded when the engine starts, and I'm not using any LLamaSharp methods/vars before calling that line). Let me know if you're able to test the current release (v0.2), or even the current dev branch if you're interested in playing around with Godot.

AsakusaRinne commented 4 months ago

I think it's a LLamaSharp issue rather than your fault. For example, if the Godot output folder has a different directory structure from typical .NET apps, it may lead to such an error.

Could you provide an out-of-the-box example that reproduces this error? Since I know nothing about Godot, it will save me a lot of time if I have such an example and don't need to configure the project on my own. I'll install Godot tonight to take a look at it.

adammikulis commented 4 months ago

This is from my most recent commit; you open it by starting Godot and using the Import menu to find the folder and its project.godot file. MindGame-0.3-dev.zip

If you click the play icon in the upper right of the engine, it should start you with a 3D chat scene. There you can configure and load your model, as well as access the menu for inference configuration. The UI isn't very good and I need to separate model/inference config from the loading, but it should work for testing.

I'm not sure what OS you're using but Godot 4.3 dev6 Mono will work with it. If you want to try exporting this project as a game, you can download the export templates within the engine (but this isn't needed for just running the project).

It's just a menu click away from configuring Godot to work with Visual Studio, here is a quick overview of the basics of C# with Godot.

To select an external editor in Godot, click on Editor → Editor Settings and scroll down to Dotnet. Under Dotnet, click on Editor, and select your external editor of choice.

Let me know if you have any trouble running the project or with Godot in general. The main script is MindManager.cs, which is autoloaded by the engine. Through that, it gets added as a node to every scene, allowing all scripts to access the model for inference (via MindAgents).

AsakusaRinne commented 4 months ago

I downloaded the zip file but got an error below. I'm using Godot 4.2, does that matter?

image

adammikulis commented 4 months ago

It's very likely that there has been a change in the AnimationPlayer between 4.2 and 4.3 (all of my development on Mind Game has been with 4.3 dev versions, because 4.2 does not support mobile C# export). Do you get the same error with Godot 4.3 dev6 Mono?

AsakusaRinne commented 4 months ago

Yeah, it's a Godot version problem. I now know why this happens and will try to fix it.

AsakusaRinne commented 4 months ago

Sorry, my previous reply was wrong. Actually I cannot reproduce this error. The game runs and uses the GPU during inference.

Could you please take the following steps to check?

  1. Run nvidia-smi in PowerShell.
  2. Run nvcc -V in PowerShell.
  3. Run dumpbin /dependents llama.dll in the "Developer Command Prompt for Visual Studio", where llama.dll is the CUDA-compiled one.

I think the likely reason is that llama.dll cannot find a CUDA/cuBLAS library it depends on. If you do not have CUDA installed, you can try downloading the cudart files into the same folder as llama.dll.

image

p.s. When you're using the auto-download feature, did this problem go away?

adammikulis commented 4 months ago

I'm glad that it uses the GPU even when both backends are installed! The UI needs some work but it shouldn't be too hard to fix that. Thanks for downloading Godot and trying this project out :)

I'm wondering if my Cuda version is too new?

  1. nvidia-smi: image
  2. nvcc -V: image
  3. I ran dumpbin /dependents llama.dll in my project repo, but I haven't actually been using a direct reference to llama.dll (I switched back to NuGet packages), so maybe that's why I get this error: image

I haven't tried the auto-download yet since I figured that the issue may be on my end. I'm going to downgrade to Cuda 12.2 and see if that resolves the issue of GPU usage with both backends installed.

AsakusaRinne commented 4 months ago

Would your project use the GPU if you use the cuda11 backend? The CUDA version displayed in your nvidia-smi is 12.4, but it's 11.6 in nvcc -V. Generally the output of nvcc is accurate.
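The mismatch described here can be checked programmatically by parsing the two command outputs. nvidia-smi reports the highest CUDA version the driver supports, while nvcc reports the installed toolkit, so the two can legitimately differ. The sketch below only parses text that has already been captured; `CudaVersionCheck` is a hypothetical helper, and the regular expressions assume the usual "CUDA Version: X.Y" and "release X.Y" fields.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CudaVersionCheck
{
    // Reads the "CUDA Version: 12.4" field from the nvidia-smi header.
    // Returns null if the field is not present.
    public static Version FromNvidiaSmi(string output)
    {
        return Parse(output, @"CUDA Version:\s*(\d+\.\d+)");
    }

    // Reads the "release 11.6" field from the nvcc -V output.
    // Returns null if the field is not present.
    public static Version FromNvcc(string output)
    {
        return Parse(output, @"release\s+(\d+\.\d+)");
    }

    // True only when both versions were found and share a major version,
    // e.g. 12.4 and 12.2 agree, while 12.4 and 11.6 do not.
    public static bool MajorsAgree(string smiOutput, string nvccOutput)
    {
        Version smi = FromNvidiaSmi(smiOutput);
        Version nvcc = FromNvcc(nvccOutput);
        return smi != null && nvcc != null && smi.Major == nvcc.Major;
    }

    private static Version Parse(string text, string pattern)
    {
        Match m = Regex.Match(text, pattern);
        return m.Success ? Version.Parse(m.Groups[1].Value) : null;
    }
}
```

With the values from this thread (12.4 from nvidia-smi, 11.6 from nvcc), `MajorsAgree` returns false, which is the situation where a cuda12 backend would be expected to fail to load its dependencies.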

adammikulis commented 4 months ago

I'm a little surprised that my versions were different between the two commands (I would have assumed that when I installed Cuda 12.4 it would have updated build tools from 11.6). Upon installing Cuda 12.2 they are now both reporting as such and the program functions normally with both backends (CPU and Cuda12) installed. This was my mistake somehow when I installed Cuda 12.x and didn't entirely overwrite an old install of 11.6. Not a LLamaSharp or Mind Game bug, just some past user error :)

Since it appears to be functioning correctly, what changes do I need to make before it could be a proper example? I think I need to add more comments/documentation, separate config UI from loading UI, and finalize the config autoloading that I've programmed in.

AsakusaRinne commented 4 months ago

If you installed CUDA 12.4 after having CUDA 11.6 installed, you need to change the PATH environment variable (and any other variables related to CUDA).
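To see which CUDA install wins on PATH, the entries can be listed in search order. A minimal sketch, assuming that CUDA directories contain "cuda" somewhere in their path; `CudaPathInspector` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CudaPathInspector
{
    // Returns the PATH entries that mention "cuda", in their original order.
    // The first entry is the one the OS searches first, so an old 11.x
    // directory listed ahead of a 12.x one would take precedence.
    public static List<string> CudaEntries(string pathValue)
    {
        return pathValue
            .Split(Path.PathSeparator, StringSplitOptions.RemoveEmptyEntries)
            .Where(p => p.IndexOf("cuda", StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }
}
```

Typical usage would be `CudaPathInspector.CudaEntries(Environment.GetEnvironmentVariable("PATH") ?? "")`. On Windows the CUDA installer also sets a `CUDA_PATH` variable, which is worth checking alongside PATH.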

> Since it appears to be functioning correctly, what changes do I need to make before it could be a proper example?

Maybe state saving and loading, since they are common in games? :)

adammikulis commented 4 months ago

I was thinking of transitioning to the BatchedExecutor before adding in conversation save/loading but wanted your thoughts. As I want this to be able to be used for NPCs in games, being able to process a lot of conversations at once would be very useful for a town simulation. Additionally, the conversation forking allows for a Detroit: Become Human scenario with many branching save points for the user. But from what I can tell, LLaVa is only possible with the InteractiveExecutor, so giving NPCs "sight" means a trade-off on parallelism.

If I were to want to have many independent agents running around and talking with each other (each with their own backstory accessed via a GraphRAG system I'm building), would the BatchedExecutor be the correct way of doing this?

martindevans commented 4 months ago

The BatchedExecutor is something I've been developing for a few months, it's intended as a new "low-ish level" foundation which future higher level executors can be built on. It's the most powerful option available, the only way to do batch processing and shared KV cache (i.e. forking), but it's not very user friendly!

You feed in tokens and it gives you back logits, that's all! Therefore you have to handle:

There are things available in LLamaSharp to do all of these things except for scheduling, but for now if you want to use the BatchedExecutor it'll be up to you to assemble those puzzle pieces together into a useful system.
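Since the executor hands back raw logits, something has to turn them into a token. As an illustration of that step only, here is a sketch of greedy and temperature sampling in plain C#. It assumes sampling is one of the pieces meant above; `LogitSampler` is a hypothetical class and not LLamaSharp's own sampling API.

```csharp
using System;

public static class LogitSampler
{
    // Greedy choice: index of the largest logit.
    public static int ArgMax(ReadOnlySpan<float> logits)
    {
        if (logits.Length == 0)
            throw new ArgumentException("logits must not be empty");

        int best = 0;
        for (int i = 1; i < logits.Length; i++)
        {
            if (logits[i] > logits[best]) best = i;
        }
        return best;
    }

    // Temperature sampling: draws an index with probability proportional to
    // exp(logit / temperature). A temperature of zero or less falls back to
    // the greedy choice.
    public static int Sample(ReadOnlySpan<float> logits, float temperature, Random rng)
    {
        int best = ArgMax(logits);
        if (temperature <= 0f) return best;

        // Subtract the maximum logit before exponentiating to avoid overflow.
        float max = logits[best];
        double[] weights = new double[logits.Length];
        double sum = 0.0;
        for (int i = 0; i < logits.Length; i++)
        {
            weights[i] = Math.Exp((logits[i] - max) / temperature);
            sum += weights[i];
        }

        // Walk the cumulative weights until the random threshold is passed.
        double threshold = rng.NextDouble() * sum;
        double running = 0.0;
        for (int i = 0; i < weights.Length; i++)
        {
            running += weights[i];
            if (threshold < running) return i;
        }
        return weights.Length - 1;
    }
}
```

The returned index would be the token id to feed back into the next inference step; mapping it to text is a separate detokenization step.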

You're right that LLava isn't supported with the BatchedExecutor right now, but it is something I'd like to add support for. I think it's the last major thing that can't be done with it (although I'm not sure how useful llava would be for a game, it's very slow).

adammikulis commented 4 months ago

> There are things available in LLamaSharp to do all of these things except for scheduling, but for now if you want to use the BatchedExecutor it'll be up to you to assemble those puzzle pieces together into a useful system.

Thanks for the input! Scheduling would be pretty straightforward, I can have a Timer node in Godot trigger inference (or have it collision-based if I give the agents a conversation radius). The plan is to turn this into a complete game (in addition to making the base functionality available as an add-on), so I am taking a long-term view to match it with LLamaSharp's development.
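One way to picture the scheduling idea: timers or collisions only request a turn, and a single tick collects pending requests into a batch. This is a sketch in plain C# with no Godot or LLamaSharp calls; `ConversationScheduler` and the use of string conversation ids are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public sealed class ConversationScheduler
{
    private readonly Queue<string> _pending = new Queue<string>();
    private readonly HashSet<string> _queued = new HashSet<string>();
    private readonly int _maxBatch;

    public ConversationScheduler(int maxBatch)
    {
        if (maxBatch <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxBatch));
        _maxBatch = maxBatch;
    }

    // Called when a timer fires or two agents come within conversation range.
    // Returns false if that conversation is already waiting for a turn.
    public bool Request(string conversationId)
    {
        if (!_queued.Add(conversationId)) return false;
        _pending.Enqueue(conversationId);
        return true;
    }

    // Called once per tick. Returns up to maxBatch conversation ids, oldest
    // first, that should be processed together in the next inference pass.
    public List<string> NextBatch()
    {
        var batch = new List<string>();
        while (batch.Count < _maxBatch && _pending.Count > 0)
        {
            string id = _pending.Dequeue();
            _queued.Remove(id);
            batch.Add(id);
        }
        return batch;
    }
}
```

In a Godot project, a Timer node's timeout handler or a collision callback would call `Request`, and whichever node owns the executor would call `NextBatch` before running inference. Capping the batch size keeps a crowded scene from stalling a single frame.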

> You're right that LLava isn't supported with the BatchedExecutor right now, but it is something I'd like to add support for. I think it's the last major thing that can't be done with it (although I'm not sure how useful llava would be for a game, it's very slow).

I can't think of anything I personally would need LLaVa for in the game I'm making, so I'll get comfortable with the BatchedExecutor (especially if that could be added as a feature later). With the game I'm making, the majority of LLM interaction will be NPC-to-NPC rather than User-to-NPC, so having parallel/batched conversations will make things much easier.