ubiquity-os / ubiquity-os-kernel

1 stars 19 forks source link

Configuration validation and annotations on errors #80

Closed gentlementlegen closed 3 weeks ago

gentlementlegen commented 3 months ago

In the bot v1, when the configuration is changed, annotations are added to the configuration file if any error is encountered within the configuration. It would be really nice to have for v2 as it is a common scenario to have an invalid configuration and have no feedback about it because the error would only be visible within the worker logs.

The main challenge is that only plugins are aware of their configuration shape, so probably the kernel should call every plugin to check their validity, which would require an endpoint or some access to the configuration validators. At the same time, the manifest.json validity could be checked, relating to https://github.com/ubiquity/ubiquibot-kernel/issues/78.

Since this might imply quite a few network requests, we can also consider having this functionality as a separate Worker using service binding.

0x4007 commented 3 months ago

Since this might imply quite a few network requests

I think its fine given that we can make 5000 requests per hour, per organization, before we get rate limited as an app. Lets keep it simple and not do the separate Worker. Config changes do not happen regularly.

gentlementlegen commented 3 months ago

@0x4007 What you say is valid for the GitHub API. But Workers are limited to 50 requests per run / instance.

0x4007 commented 3 months ago

We can do this in a plugin instead?

gentlementlegen commented 3 months ago

I thought about it, but it means that the plugin should have access to the private repo containing the configuration and should be able to read / write which seems dangerous. Or maybe we can run that plugin within the configuration repo itself. But if we do so we cannot handle per repo based configurations I think

0x4007 commented 3 months ago

Or maybe we can run that plugin within the configuration repo itself.

Cool idea

But if we do so we cannot handle per repo based configurations I think

Research will reveal the answer!

gentlementlegen commented 2 months ago

/start

ubiquity-os[bot] commented 2 months ago
Warning! This task was created over 33 days ago. Please confirm that this issue specification is accurate before starting.
DeadlineThu, Sep 5, 6:39 AM UTC
Registered Wallet 0x0fC1b909ba9265A846b82CF4CE352fc3e7EeB2ED
Tips:
<ul>
<li>Use <code>/wallet 0x0000...0000</code> if you want to update your registered payment wallet address.</li>
<li>Be sure to open a draft pull request as soon as possible to communicate updates on your progress.</li>
<li>Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.</li>
<ul>
gentlementlegen commented 2 months ago

What should be done beyond the kernel side changes: Each plugin should have an entry point for validation on Workers, and an Action script for Actions.

0x4007 commented 2 months ago

Each plugin should have an entry point for validation on Workers, and an Action script for Actions.

Can you elaborate this isn't clear to others

gentlementlegen commented 2 months ago

The kernel workflow should be the following:

  1. The kernel receives a push event
  2. The kernel figures out it is the configuration file
  3. The kernel will, for each plugin inside the configuration, poke and endpoint for Workers and an Action for Actions that will parse the configuration and return validation errors
  4. I think the kernel should also put a warning on endpoints it cannot reach and ignore the plugin without breaking the whole configuration
  5. The kernel will eventually post annotation if any issue is found, or a message saying the configuration is valid

The v1 had a similar workflow. I think this is important to have this functionality specially for newcomers that could be confused about the configuration, or avoid breaking it during development and updates.

gentlementlegen commented 2 months ago

@0x4007 I've seen you mentioning an issue about the base multiplier having changed for v2, and indeed it was reset to 1. Put it back to 3 as it was in v1.

gentlementlegen commented 1 month ago

@0x4007 Would something similar to the following format be satisfactory? It gives the error and link the corresponding line with a preview:

image
0x4007 commented 1 month ago

Not sure let's try. Make sure it comments on the commit AND tags the author to urgently notify them that there is something wrong.

gentlementlegen commented 1 month ago

Yes the comment above was a comment on the commit. Example test run: https://github.com/Meniole/ubiquibot-config/commit/f64b4023cbff5b45b4ae39ab0fb1fadb30db83d6#commitcomment-146939444

0x4007 commented 1 month ago

Also use the

[!CAUTION]

my message

syntax

The red one whatever the syntax is

gentlementlegen commented 1 month ago

Progress update: I finally got the Action plugins to validate themselves, was trickier because everything is asychronous. So I had a design question: since I have to listen for Actions to finish validating, currently what happens is that each plugin will add its own message, like so:

image

Maybe that would be too noisy since a user would get tag each time a plugin is done validating. So maybe what we can consider:

Downsize of reducing noise by editing the message is that every plugin will be async and maybe errors keep coming after the user thought the configuration was valid. What do you think? Another run example: https://github.com/Meniole/ubiquibot-config/commit/3e9152e3eadd98ef20b10bfba7529533ed392c46

0x4007 commented 1 month ago

Yes do both of your suggestions

0x4007 commented 1 month ago

Oh I thought the validator is one and done inside the kernel. This approach across every plugin seems wrong.

This can be dynamic by reading the plugin ajv validator files

gentlementlegen commented 1 month ago

@0x4007 I think it makes more sense to have it within the plugin itself, we could even have it within our SDK for simplicity. I do not see how the kernel can understand plugin configurations.

0x4007 commented 1 month ago

My idea was to import and run the ajv validation code. Could be risky but maybe there's a way to quarantine it in a way that's not too complex.

gentlementlegen commented 1 month ago

We do not use ajv anywhere, because ajv cannot run within workers. Beyond being very risky and prone to code injection, some plugins have very complex configurations spread across multiple files, like conversation-rewards how could we handle this? Some other plugins have runtime encode / decode, or types generated through import of third party libraries (for example GitHub roles that are generated from an enum). That seemed way too complex to just read the configuration file, it is much simpler to reuse the code already implemented within the plugin itself, since every plugin already validates its own payload.

Also, in the case of workers, the kernel does not know which repository they refer too, so we should serve the file as plain text on the endpoint which doesn't seem elegant.

Another advantage of using directly the plugin is that currently it also detects invalid GitHub / Worker environment variables which is very helpful.

0x4007 commented 1 month ago

I think we need to figure this out eventually because of the marketplace/plugin installer feature

gentlementlegen commented 1 month ago

I think it is quite straightforward for Workers, too slow for Actions. Maybe eventually we will need to have an endpoint for all the plugins. I don't see how we can have the kernel itself handle this because:

If we can solve all of these then the kernel should be able to just import the configuration type files and read them.

0x4007 commented 1 month ago

My vision is not too different from a Docker-like approach. We can spin up a virtual shell as a child process, which then runs its own node.js instance.

I think it is quite straightforward for Workers, too slow for Actions. Maybe eventually we will need to have an endpoint for all the plugins. I don't see how we can have the kernel itself handle this because:

  • the kernel should parse the included files and retrieve all of them

I think we just need to send everything, although I'm not fully understanding the context of this statement.

  • in case some external library is linked, the kernel should install it

As in, npm? Knip might be able to help compile these, or there may be some better tools.

  • that open the possibility to inject code, and extremely easy to break as well

Code injection might be acceptable within a virtual shell.

  • we cannot safely read the environment of plugins either, because we would leak secrets

The virtual shell should not have access to the environment secrets.

If we can solve all of these then the kernel should be able to just import the configuration type files and read them.

gentlementlegen commented 1 month ago

Workers do not run node.js but a custom cloudflare-like minimal implementation. Workers do not allow to run external code for security reasons, nor allow child processes, or shells, afaik.

I do not understand the benefit of heavily complexifying this on the kernel, when we already have running endpoints that support all the logic (and Actions are literally dockers themselves). Also I do not understand what Knip is needed for?

Practical example: I need to check if the configuration for conversation-rewards is valid. Here is the configuration file: https://github.com/ubiquibot/conversation-rewards/blob/development/src/configuration/incentives.ts

How can I check that the configuration provided is valid against this?

0x4007 commented 1 month ago

One expensive idea is to consume the type with o1-mini and then have it post a comment if it thinks that it won't work. I suppose it would unfortunately have to consume the entire plugin codebase before determining this though.

It would be a bit cheaper if we standardize the location of the payload type checker file.

This seems like a bad approach to scale. We could consider making this a standalone plugin which is expensive to run but we can allow partners to opt-out if its too expensive?


In conclusion I don't see a great cheap solution. I think we should handle it inside of the plugins as you are doing, and then in the future we automate plugin configuration using a GUI which can prevent misconfiguration? Perhaps we enforce a standard for plugin developers to follow for it to populate on the GUI and to be configurable?

gentlementlegen commented 1 month ago

I think running in their own plugin is the cheapest way we can use for now. For the GUI, it would be no problem with Worker ones as the response would be instantaneous on a bad configuration, but Actions would take like a minute to validate which would be a bummer for a UI, so that's the part we should figure out. One thing that @whilefoo pointed out is serve the configuration schema through the manifest, which could have been a solution if we find a way to compile the schema in such a way it contains every needed info that could be interpreted by the kernel correctly.

whilefoo commented 1 month ago

Yeah my first idea was that worker plugins could serve configuration schema through the manifest but that's too cumbersome to write in the manifest so instead it could just convert typebox schema to json schema and send it over a dedicated endpoint, but the problem is with action plugins because we can't get a fast and sync response

gentlementlegen commented 1 month ago

@whilefoo Based on what you said, maybe then there would be an approach that would allows both Actions and Workers to have a fast response. Typebox has a codegen tool which allows to transform from the Model to JSON. There is also a tool that allows to transform JSON to TypeBox. Or, codegen supports Model -> TS and TS -> Model but I feel less comfortable serving TypeScript files. However, the JSON tool is a CLI, not a package so maybe it cannot be run within Workers.

For workers, we could simply serve the JSON through an endpoint. For Actions, we could have a script automatically generating the JSON file on push events, so the kernel can simply download the file (which would always be at the same locations for plugins) which would be significantly faster than running an Action.

That can be a path I can explore as well.

0x4007 commented 1 month ago

because we can't get a fast and sync response

Admin can make a config change, the Action runs and posts a comment on the commit a few minutes later BUT it tags the author, so they are notified.

I think this is acceptable as a first version if the better solutions can't be figured out.

gentlementlegen commented 1 month ago

@0x4007 @whilefoo It usually takes ~30s and it does tag the user.

I did some research regarding the usage of a json to describe the configuration. Here is what I found:

So it is possible, but comes with drawbacks.

Functionality JSON Endpoint + Action
Instantaneous response
Environment validation
Decode validation
Detailed errors
Need of an extra build step
Need for an extra Action script

TL;DR JSON is faster, Endpoint + Action give much more accurate errors, so don't know what route we prefer.

0x4007 commented 1 month ago

@whilefoo you can make the decision

whilefoo commented 1 month ago

What do you mean by environment validation?

I think decode validation is not that important and also I think it's more secure if the kernel does validation and not the plugin which can access the configuration.

For Actions, we could have a script automatically generating the JSON file on push events, so the kernel can simply download the file (which would always be at the same locations for plugins) which would be significantly faster than running an Action.

How would the action know where the configuration schema is located in the codebase? Unless the developer sets a path to the file and name of the variable

gentlementlegen commented 1 month ago

Environment validation meaning validating env process.env values, it is possible when it happens on the plugin side.

The decode can be handy, practical example in the plugin name where "test" would be a valid string, but does not properly represent a plugin URL nor an Action path. This is validated during decode, so only can be picked up plugin side.

For the path, we should just rely on a standard location the same way we do for the manifest. Or even have it appended within the manifest itself.

0x4007 commented 1 month ago

Standard location (root, sibling of manifest) seems simplest

ubiquity-os[bot] commented 1 month ago

@gentlementlegen, this task has been idle for a while. Please provide an update.

gentlementlegen commented 1 month ago

@whilefoo @0x4007 So we all agree that it should be changed to JSON? I'd like to be sure before modifying all the linked pull-requests.

0x4007 commented 1 month ago

Endpoint + Action give much more accurate errors

Speed isn't as important as accuracy I think. I would rather one-shot get a list of all the problems to address in the next push.

But it is also not clear to me exactly how inaccurate the speedy approach is.

gentlementlegen commented 1 month ago

Inaccurate for two main reasons:

The JSON still seems better because cross platform (could be reused in a frontend for example) and the two above scenarios might account for < 15% of errors made in the configuration. From my experience, I make more mistakes by doing wrong tabulations, or missing some value in the configuration, or outdated configuration settings due to plugin changes. And the JSON approach makes the code flow simpler and nearly identical for Actions and Workers, easier to maintain.

0x4007 commented 1 month ago

Okay then lets do JSON and then maybe a future task can be a slower one to run async in the background: "advanced checker"

gentlementlegen commented 1 month ago

Will change and test the Cloudflare Worker JSON validator since ajv cannot run within workers. Also will be handy to have this one in since the issue has plenty of pull-requests linked: https://github.com/ubiquity-os-marketplace/conversation-rewards/pull/132

whilefoo commented 1 month ago

Another benefit is that doing validation with JSON avoids using KV

gentlementlegen commented 1 month ago

@0x4007 I was wondering why the generation was not running but it fails every time with

{
      "message": [
        "Error in event handler",
        "Error: No installation found for owner: ubiquity-os-marketplace"
      ],
      "level": "error",
      "timestamp": 1728029469013
    }

because ubiquity-os-kernel-beta does not exist on the marketplace, I guess.

0x4007 commented 1 month ago

I can add to marketplace then

ubiquity-os-beta[bot] commented 1 month ago
! Failed to run comment evaluation. HttpError: Resource not accessible by integration - https://docs.github.com/rest/repos/repos#create-a-repository-dispatch-event
ubiquity-os-beta[bot] commented 1 month ago
! Failed to run comment evaluation. Error: 429 Request too large for gpt-4o in organization org-pwbJ3kbX0h3aoWPTIZrBTgfF on tokens per min (TPM): Limit 30000, Requested 30646. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.
ubiquity-os[bot] commented 1 month ago
! Failed to run comment evaluation. Error: 429 Request too large for gpt-4o in organization org-pwbJ3kbX0h3aoWPTIZrBTgfF on tokens per min (TPM): Limit 30000, Requested 30646. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.
gentlementlegen commented 1 month ago

@0x4007 Seems that we hit gtp-4o token limit here. This might be because both development and production ran at the same time, doubling the volume of tokens sent.

ubiquity-os-beta[bot] commented 1 month ago
! Failed to run comment evaluation. Error: 429 Request too large for gpt-4o in organization org-pwbJ3kbX0h3aoWPTIZrBTgfF on tokens per min (TPM): Limit 30000, Requested 31871. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.
0x4007 commented 1 month ago

Given that the context window is 128 000 that doesn't make sense we're being limited to 30 000 this might be a setting I can change.

Just realized, Ubiquity DAO organization is tier 1 (lowest limits) my personal is tier 4. I suppose it would be ideal to boost Ubiquity DAO tier by actually starting to use it.

https://platform.openai.com/docs/guides/rate-limits/usage-tiers?context=tier-two

I just spent another $10 to bring us to tier 2 territory which should be better!