stateful / runme

DevOps Workflows Built with Markdown
https://runme.dev
Apache License 2.0

Support using Foyle's AI capabilities as a plugin #574

Closed: jlewi closed this issue 1 month ago

jlewi commented 1 month ago

Per the discussion in Discord, we'd like to experiment with integrating RunMe with Foyle's AI capabilities. Foyle aims to be a DevOps copilot. One of the key problems it's tackling is training an AI copilot to be an expert in your infrastructure. Foyle relies on a notebook-like experience to collect implicit human feedback, which is used to train the AI (blog post). Like RunMe, Foyle is also built using VSCode Notebooks. RunMe is far more mature and has more features than Foyle, so it likely makes sense to reuse RunMe as Foyle's notebook rather than rebuilding a separate stack.

Toward that end, this issue tracks the minimal implementation needed for RunMe to support calling out to Foyle and surfacing the results. The current thinking is that they would start as two separate extensions, with RunMe using gRPC to communicate with Foyle. A more detailed design is provided in Foyle Tech Note 004.

The first step is designing a shared gRPC service to allow the services to communicate; #573

jlewi commented 1 month ago

Here's a more detailed look at what we need to do for logging

Gaps In Current RunMe Logging

Here are the current limitations of logging in RunMe for the purposes of supporting Foyle retraining

Proposed Changes to RunMe Code Base

Server Changes

vscode-runme Changes

I think the main change we'd need to make to RunMe's vscode extension is to plumb through the server changes.

@sourishkrout Could you PTAL and let me know if this looks good or if you have any suggestions before I get started on implementing the changes?

sourishkrout commented 1 month ago

@sourishkrout Could you PTAL and let me know if this looks good or if you have any suggestions before I get started on implementing the changes?

Overall this looks good to me. One thing I'm wondering 🤔 is how do we make it clear that logging is no longer exclusively for "human consumption" (troubleshooting, debugging, etc)? As far as I understand your proposal Foyle's training will be based on the logs being stable and machine-readable, no @jlewi?

jlewi commented 1 month ago

As far as I understand your proposal Foyle's training will be based on the logs being stable and machine-readable, no @jlewi?

That's correct, although arguably only a subset of the logs is intended to be machine-readable. I'd argue that one of the main benefits of adopting structured logging is that it produces logs that can be consumed by humans or by machines. Nonetheless, there's some subtlety here.

The first is configuring the logger. In particular, it isn't necessarily good UX if the console logger and the machine logger both use the same format; for the console you probably want human-readable output, while for the machine logger you likely always want JSON. Furthermore, for the machine-readable logs you may not want the user to be able to adjust the log level, because they could then exclude critical messages.

We can solve this part by configuring separate loggers for the console and the machine logs and then routing entries to both. This is what we are already doing in Foyle.
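For illustration, here's a minimal sketch of that split using zap's tee, not a description of Foyle's actual implementation; the function name, file handling, and level choices are assumptions:

```go
package main

import (
	"os"

	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
)

// buildLogger (hypothetical) combines a human-readable console core at a
// user-configurable level with a JSON "machine" core pinned at Debug, so
// users can't filter out the entries the training pipeline depends on.
func buildLogger(consoleLevel zapcore.Level, machineLogPath string) (*zap.Logger, error) {
	consoleCore := zapcore.NewCore(
		zapcore.NewConsoleEncoder(zap.NewDevelopmentEncoderConfig()),
		zapcore.Lock(os.Stderr),
		consoleLevel,
	)

	f, err := os.OpenFile(machineLogPath, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0o644)
	if err != nil {
		return nil, err
	}
	machineCore := zapcore.NewCore(
		zapcore.NewJSONEncoder(zap.NewProductionEncoderConfig()),
		zapcore.AddSync(f),
		zapcore.DebugLevel, // fixed level: the user cannot drop critical messages
	)

	// Every log entry is routed to both cores.
	return zap.New(zapcore.NewTee(consoleCore, machineCore)), nil
}

func main() {
	logger, err := buildLogger(zapcore.InfoLevel, "/tmp/runme-machine.jsonl")
	if err != nil {
		panic(err)
	}
	defer logger.Sync()
	logger.Debug("received initial request", zap.String("knownID", "example"))
}
```

With this setup the console core only shows Info and above, while the Debug entry still lands in the JSON file that the machine pipeline reads.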

The second part is logging and processing data so that we avoid brittleness in our pipeline. We have the following log line

logger.Debug("received initial request", zap.Any("req", req))

This line will be critical to Foyle's learning process, which means that if someone removes or edits it they could break certain functionality; that's rather unexpected for most developers. I don't have a good solution for that. In principle, we could use a combination of linters and unit tests to catch those breakages.
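For illustration, here's a rough sketch of the kind of guard test that could catch such a breakage, using zap's observer; `handleInitialRequest` is a made-up stand-in for whatever code path actually emits the entry:

```go
package runner

import (
	"testing"

	"go.uber.org/zap"
	"go.uber.org/zap/zaptest/observer"
)

// handleInitialRequest is a stand-in for the real code path that is expected
// to emit the log entry Foyle trains on.
func handleInitialRequest(logger *zap.Logger, req any) {
	logger.Debug("received initial request", zap.Any("req", req))
}

// TestInitialRequestIsLogged fails if the "received initial request" entry is
// removed or renamed, turning the implicit logging contract into an explicit one.
func TestInitialRequestIsLogged(t *testing.T) {
	core, logs := observer.New(zap.DebugLevel)
	logger := zap.New(core)

	handleInitialRequest(logger, map[string]string{"knownID": "abc"})

	if logs.FilterMessage("received initial request").Len() == 0 {
		t.Fatal(`expected a "received initial request" log entry; the learning pipeline depends on it`)
	}
}
```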

sourishkrout commented 1 month ago

👍 @jlewi. Don't see a problem in changing logging in RunMe to suit the needs here.

This means if someone removes or edits the above log line they could break certain functionality; this is rather unexpected for most developers. I don't have good solutions for that. In principle, we could use a combination of linters and unittests to catch those breakages.

This is primarily what I'd like to guard against. I think a combination of unit tests, and perhaps even naming the logger call something like learningLogs.(...), would make developers think twice. In any case, unit tests are usually the best guard in this scenario for maintaining a "contract".
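To illustrate the naming idea, a tiny hypothetical wrapper (all names here are made up) could make the contract visible at the call site:

```go
package logging

import "go.uber.org/zap"

// LearningLogger wraps the zap logger used for entries that Foyle's training
// pipeline consumes. The distinct type and name signal that these calls are
// part of a contract, not ordinary debug logging.
type LearningLogger struct {
	l *zap.Logger
}

func NewLearningLogger(l *zap.Logger) *LearningLogger {
	return &LearningLogger{l: l.Named("learning")}
}

// Debug emits a machine-consumed entry; think twice before changing the
// message or fields, and keep the corresponding unit test in sync.
func (ll *LearningLogger) Debug(msg string, fields ...zap.Field) {
	ll.l.Debug(msg, fields...)
}
```

A call site would then read `learningLogs.Debug("received initial request", zap.Any("req", req))`, which is harder to remove by accident than a plain `logger.Debug`.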

sourishkrout commented 1 month ago
  • An issue here is how to log it in proto JSON format
  • The simplest thing to do is to just add code to serialize the request to JSON and then log the result
  • The alternative is to use the proto plugin go-proto-zap-marshaler to auto-generate MarshalLogObject methods for all protos

    • That only makes sense if there's going to be widespread logging of protos in JSON format. Since we only need to update a single log message right now, I don't think it's worth introducing it as part of this change

I suggest going the simple route first and see how far that'll get us.
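A minimal sketch of that simple route, assuming the request is a proto.Message (the helper name and field key are placeholders, and timestamppb stands in for the real request type):

```go
package main

import (
	"encoding/json"

	"go.uber.org/zap"
	"google.golang.org/protobuf/encoding/protojson"
	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/types/known/timestamppb"
)

// logProtoAsJSON serializes a proto message with protojson and attaches it to
// the entry as raw JSON, so the machine log keeps the proto JSON format.
func logProtoAsJSON(logger *zap.Logger, msg, key string, m proto.Message) {
	b, err := protojson.Marshal(m)
	if err != nil {
		logger.Error("failed to marshal proto for logging", zap.Error(err))
		return
	}
	// json.RawMessage keeps the payload as nested JSON under zap's JSON encoder.
	logger.Debug(msg, zap.Any(key, json.RawMessage(b)))
}

func main() {
	cfg := zap.NewProductionConfig() // JSON encoder
	cfg.Level = zap.NewAtomicLevelAt(zap.DebugLevel)
	logger, err := cfg.Build()
	if err != nil {
		panic(err)
	}
	defer logger.Sync()

	// timestamppb.Now() is just a stand-in for the real request proto.
	logProtoAsJSON(logger, "received initial request", "req", timestamppb.Now())
}
```

If embedding the payload as nested JSON turns out to be awkward, logging `string(b)` as a plain string field is an even simpler fallback.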

jlewi commented 1 month ago

After #585, the only remaining change on the RunMe side is to add a vscode flag to enable Foyle, which would launch the RunMe server with the flags that enable logging. That's only a blocker for training off of RunMe data; using Foyle should still work even without it.

I believe Foyle should be available in RunMe vscode as soon as https://github.com/stateful/vscode-runme/pull/1356 lands in a release. It looks like it just missed the 3.5.5 release: https://github.com/stateful/vscode-runme/releases/tag/3.5.5

jlewi commented 1 month ago

It looks like there was a 3.5.6 release (https://github.com/stateful/vscode-runme/releases/tag/3.5.6), but that didn't include the ailogger experiment. That should be in the next release.

jlewi commented 1 month ago

RunMe 3.5.7 https://github.com/stateful/vscode-runme/releases/tag/3.5.7 was released earlier today

sourishkrout commented 1 month ago

v3.5.8 was just released and includes the fix for vscode-runme#1389

jlewi commented 1 month ago

Woo Hoo! Thanks @sourishkrout