mudler / LocalAI

The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
https://localai.io
MIT License

Proposal: Migrate gRPC / protobuf definition files to their own repo #4012

Open dave-gray101 opened 3 weeks ago

dave-gray101 commented 3 weeks ago

Part of the CI Speedup Project https://github.com/users/dave-gray101/projects/2/views/1

In order to really accelerate CI, I want to build backends only when a relevant change is made to them. The first step in achieving this is to make it as simple as possible to build backends outside of the main source tree.
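
A minimal sketch of the "build only what changed" idea, in Go. Everything here is illustrative: the `backend/<name>/` layout, the `proto/` shared prefix, and the backend names are assumptions made for the example, not the actual repository layout or CI configuration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sharedPrefixes lists paths whose changes affect every backend.
// Hypothetical: the real location of the .proto files may differ.
var sharedPrefixes = []string{"proto/"}

// backendsToBuild returns the sorted list of backends touched by changedFiles.
// A file under "backend/<name>/" selects <name>; a file under a shared
// prefix selects every known backend.
func backendsToBuild(changedFiles, allBackends []string) []string {
	known := map[string]bool{}
	for _, b := range allBackends {
		known[b] = true
	}
	selected := map[string]bool{}
	for _, f := range changedFiles {
		for _, p := range sharedPrefixes {
			if strings.HasPrefix(f, p) {
				// A shared definition changed: rebuild everything.
				out := append([]string(nil), allBackends...)
				sort.Strings(out)
				return out
			}
		}
		parts := strings.Split(f, "/")
		if len(parts) >= 3 && parts[0] == "backend" && known[parts[1]] {
			selected[parts[1]] = true
		}
	}
	out := make([]string, 0, len(selected))
	for b := range selected {
		out = append(out, b)
	}
	sort.Strings(out)
	return out
}

func main() {
	all := []string{"llama-cpp", "whisper", "diffusers"} // hypothetical names
	fmt.Println(backendsToBuild([]string{"backend/whisper/main.go", "README.md"}, all))
	fmt.Println(backendsToBuild([]string{"proto/backend.proto"}, all))
}
```

In a real pipeline the list of changed files would come from the CI system (for example, a diff against the base branch), and the result would decide which build jobs to run.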

I propose that the .proto file(s) be moved to a separate repo so that:

  1. It's faster and easier for subprojects to get just the files they need
  2. We can investigate building and publishing additional generated packages to remove the grpc / protobuf compiler dependencies

While this happens, I'd also like to consider making breaking changes to the specification: renaming the really generic names and providing a better service name than "backend". That way, once the proto project is "ready", we should not need to make any breaking changes for a long time, as it is relatively rare that we break an existing field rather than add a new one.

dave-gray101 commented 3 weeks ago

I also want to investigate https://buf.build/docs/ at this time - might help

mudler commented 3 weeks ago

I have somewhat mixed feelings about this; my thoughts inline:

> Part of the CI Speedup Project https://github.com/users/dave-gray101/projects/2/views/1
>
> In order to really accelerate CI, I want to build backends only when a relevant change is made to them. The first step in achieving this is to make it as simple as possible to build backends outside of the main source tree.

I think to some extent this is related to #3953.

> I propose that the .proto file(s) be moved to a separate repo so that:
>
> 1. It's faster and easier for subprojects to get just the files they need

I can't really grasp why this would be more complex today: subprojects could pull the proto file from the LocalAI repository, no? What's the benefit of having another repository containing only the proto file?

I guess it also goes back to the question of monorepo vs. multiple repos: over time I've come to prefer a monorepo, with only the necessary reusable code split out. Adding a multitude of split repos adds a layer of complexity in operation, maintenance (multiple tags to take care of, CI, etc.), and understanding, while also raising the barrier to entry for contributors.

> 2. We can investigate building and publishing additional generated packages to remove the grpc / protobuf compiler dependencies

That'd be really nice. Just keep in mind that we can't get rid of them entirely, as we still have the grpc client to take care of.

dave-gray101 commented 3 weeks ago

My goal is similar to #3953 - but I also want to experiment with federated swarms with different types of workers. While I'm messing around, I was thinking it might be easier if backends could be more easily built and versioned separately from the core service.

Regarding point 2: I believe that buf can be configured to generate the grpc client / server code as well.

I'm going to create a localai-proto repo to do some experimentation around this and see if it will work for us. My theory around a separate repo is primarily conceptual: "a backend shouldn't need to depend on the core, just the protobuf" -- and it will make it easier for me to mess around. Once everything is figured out, it may make sense to backport things to the main repo if the separate repo doesn't turn out to be useful. But in my theoretical world, localai and every backend wouldn't need to rebuild every time llama.cpp releases, as the next phase of the experiment would be building the "heavier" backends like llamacpp in separate repos entirely.

I agree that there would need to be some CI work done to keep all the versions "in sync" - but in general my theory is that the backends shouldn't "need" to be in sync unless we push a breaking change.
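
One way the "only in sync on a breaking change" rule could be expressed, as a rough Go sketch. It assumes the proto definitions would carry a `vMAJOR.MINOR` version where only breaking changes bump the major number; that versioning scheme and the function names are hypothetical and have not been decided in this thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major component from a "vMAJOR.MINOR" version string.
// The leading "v" is optional.
func majorOf(version string) (int, error) {
	major, _, _ := strings.Cut(strings.TrimPrefix(version, "v"), ".")
	return strconv.Atoi(major)
}

// compatible reports whether a backend built against backendSpec can be used
// with a core built against coreSpec. Assumption: additive changes only bump
// the minor number, so matching major versions are enough.
func compatible(coreSpec, backendSpec string) (bool, error) {
	c, err := majorOf(coreSpec)
	if err != nil {
		return false, fmt.Errorf("core spec version %q: %w", coreSpec, err)
	}
	b, err := majorOf(backendSpec)
	if err != nil {
		return false, fmt.Errorf("backend spec version %q: %w", backendSpec, err)
	}
	return c == b, nil
}

func main() {
	// Hypothetical versions: the backend lags behind by several additive changes.
	ok, err := compatible("v1.4", "v1.0")
	fmt.Println(ok, err)

	// A breaking change on the core side: the backend must be rebuilt.
	ok, err = compatible("v2.0", "v1.9")
	fmt.Println(ok, err)
}
```

Under this scheme a backend would only have to be rebuilt when the major version of the spec changes, which matches the idea that adding a field is far more common than breaking one.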

dave-gray101 commented 3 weeks ago

So, I'll need to do some more experiments, but buf is a pretty neat tool. I've intentionally put this under an alternative namespace (localai-v1) while I do some preliminary testing, but take a look at https://buf.build/dave-gray101/localai-v1 when you get a chance. In addition to the tool being able to manage the grpc / protobuf tooling in CI actions automatically, we can also use their registry to explore "protobuf documentation" more or less for free. If we're willing to depend on their infra (idk), we can even use their sdk generator directly.

I haven't tried to actually use the generated code yet, and I likely haven't worked out all the kinks; this is more a preliminary statement that "buf is cool".