Since this might imply quite a few network requests
I think it's fine given that we can make 5,000 requests per hour, per organization, before we get rate limited as an app. Let's keep it simple and not do the separate Worker. Config changes do not happen regularly.
@0x4007 What you say is valid for the GitHub API. But Workers are limited to 50 requests per run / instance.
We can do this in a plugin instead?
I thought about it, but it means that the plugin would need access to the private repo containing the configuration and be able to read / write it, which seems dangerous. Or maybe we can run that plugin within the configuration repo itself. But if we do so, we cannot handle per-repo configurations, I think.
Or maybe we can run that plugin within the configuration repo itself.
Cool idea
But if we do so, we cannot handle per-repo configurations, I think.
Research will reveal the answer!
/start
| | |
| --- | --- |
| Warning! | This task was created over 33 days ago. Please confirm that this issue specification is accurate before starting. |
| Deadline | Thu, Sep 5, 6:39 AM UTC |
| Registered Wallet | 0x0fC1b909ba9265A846b82CF4CE352fc3e7EeB2ED |
- Use `/wallet 0x0000...0000` if you want to update your registered payment wallet address.
- Be sure to open a draft pull request as soon as possible to communicate updates on your progress.
- Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.
What should be done beyond the kernel side changes: Each plugin should have an entry point for validation on Workers, and an Action script for Actions.
Each plugin should have an entry point for validation on Workers, and an Action script for Actions.
Can you elaborate? This isn't clear to others.
The kernel workflow should be the following:
The v1 had a similar workflow. I think it is important to have this functionality, especially for newcomers who could be confused about the configuration, and to avoid breaking it during development and updates.
@0x4007 I've seen you mentioning an issue about the base multiplier having changed for v2, and indeed it was reset to `1`.

Put it back to `3` as it was in v1.
@0x4007 Would something similar to the following format be satisfactory? It gives the error and links to the corresponding line with a preview:
Not sure, let's try. Make sure it comments on the commit AND tags the author to urgently notify them that there is something wrong.
Yes the comment above was a comment on the commit. Example test run: https://github.com/Meniole/ubiquibot-config/commit/f64b4023cbff5b45b4ae39ab0fb1fadb30db83d6#commitcomment-146939444
Also use the

> [!CAUTION]
> my message

syntax. The red one, whatever the syntax is.
Progress update: I finally got the Action plugins to validate themselves, which was trickier because everything is asynchronous. This raises a design question: since I have to listen for Actions to finish validating, currently each plugin adds its own message, like so:

Maybe that would be too noisy, since a user would get tagged each time a plugin is done validating. So here is what we could consider:

The downside of reducing noise by editing the message is that every plugin runs asynchronously, so errors may keep coming in after the user thought the configuration was valid. What do you think? Another run example: https://github.com/Meniole/ubiquibot-config/commit/3e9152e3eadd98ef20b10bfba7529533ed392c46
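For what it's worth, a rough sketch of the "edit one comment instead of posting many" option, assuming an authenticated Octokit instance and a hypothetical marker string so the bot can find its own comment again:

```typescript
import { Octokit } from "@octokit/rest";

const MARKER = "<!-- configuration-validation -->"; // hypothetical marker identifying our own comment

// Posts the validation results once, then edits the same commit comment on later runs.
async function upsertValidationComment(octokit: Octokit, owner: string, repo: string, commitSha: string, body: string) {
  const { data: comments } = await octokit.rest.repos.listCommentsForCommit({
    owner,
    repo,
    commit_sha: commitSha,
  });
  const existing = comments.find((comment) => comment.body?.includes(MARKER));
  const fullBody = `${MARKER}\n${body}`;
  if (existing) {
    await octokit.rest.repos.updateCommitComment({ owner, repo, comment_id: existing.id, body: fullBody });
  } else {
    await octokit.rest.repos.createCommitComment({ owner, repo, commit_sha: commitSha, body: fullBody });
  }
}
```

This does not solve the ordering problem by itself: results from each plugin would still trickle into the single comment asynchronously.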
Yes do both of your suggestions
Oh I thought the validator is one and done inside the kernel. This approach across every plugin seems wrong.
This can be dynamic by reading the plugin ajv validator files
@0x4007 I think it makes more sense to have it within the plugin itself, we could even have it within our SDK for simplicity. I do not see how the kernel can understand plugin configurations.
My idea was to import and run the ajv validation code. Could be risky but maybe there's a way to quarantine it in a way that's not too complex.
We do not use `ajv` anywhere, because `ajv` cannot run within Workers. Beyond being very risky and prone to code injection, some plugins have very complex configurations spread across multiple files, like `conversation-rewards`: how could we handle this? Some other plugins have runtime encode / decode, or types generated through imports of third-party libraries (for example GitHub roles that are generated from an enum). That seemed way too complex just to read the configuration file; it is much simpler to reuse the code already implemented within the plugin itself, since every plugin already validates its own payload.
Also, in the case of Workers, the kernel does not know which repository they refer to, so we would have to serve the file as plain text on the endpoint, which doesn't seem elegant.

Another advantage of using the plugin directly is that it currently also detects invalid GitHub / Worker environment variables, which is very helpful.
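As an illustration, here is a minimal sketch of what a plugin-side validation entry point could look like on a Worker. The schemas and the `/validate` route are made up for the example; real plugins like `conversation-rewards` spread their configuration across several files and re-export a single schema:

```typescript
import { Type } from "@sinclair/typebox";
import { Value } from "@sinclair/typebox/value";

// Illustrative stand-ins for a plugin's real configuration and environment schemas.
const pluginSettingsSchema = Type.Object({
  multipliers: Type.Array(Type.Object({ role: Type.String(), value: Type.Number() })),
});
const envSchema = Type.Object({ OPENAI_API_KEY: Type.String({ minLength: 1 }) });

export default {
  // Hypothetical validation entry point exposed by the Worker plugin itself.
  async fetch(request: Request, env: Record<string, string>): Promise<Response> {
    const url = new URL(request.url);
    if (url.pathname === "/validate" && request.method === "POST") {
      const settings = await request.json();
      const errors = [
        ...Value.Errors(pluginSettingsSchema, settings), // configuration errors
        ...Value.Errors(envSchema, env), // environment errors, only checkable plugin side
      ].map((error) => `${error.path}: ${error.message}`);
      return Response.json({ valid: errors.length === 0, errors });
    }
    return new Response("Not found", { status: 404 });
  },
};
```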
I think we need to figure this out eventually because of the marketplace/plugin installer feature
I think it is quite straightforward for Workers, too slow for Actions. Maybe eventually we will need to have an endpoint for all the plugins. I don't see how we can have the kernel itself handle this because:
If we can solve all of these then the kernel should be able to just import the configuration type files and read them.
My vision is not too different from a Docker-like approach. We can spin up a virtual shell as a child process, which then runs its own node.js instance.
I think it is quite straightforward for Workers, too slow for Actions. Maybe eventually we will need to have an endpoint for all the plugins. I don't see how we can have the kernel itself handle this because:
- the kernel should parse the included files and retrieve all of them
I think we just need to send everything, although I'm not fully understanding the context of this statement.
- in case some external library is linked, the kernel should install it
As in, `npm`? `Knip` might be able to help compile these, or there may be some better tools.
- that opens the possibility to inject code, and is extremely easy to break as well
Code injection might be acceptable within a virtual shell.
- we cannot safely read the environment of plugins either, because we would leak secrets
The virtual shell should not have access to the environment secrets.
If we can solve all of these then the kernel should be able to just import the configuration type files and read them.
Workers do not run `node.js` but a custom, minimal Cloudflare implementation. Workers do not allow running external code for security reasons, nor do they allow child processes or shells, afaik.

I do not understand the benefit of heavily complicating this on the kernel side when we already have running endpoints that support all the logic (and Actions are literally Docker containers themselves). Also, I do not understand what `Knip` is needed for.
Practical example:
I need to check if the configuration for `conversation-rewards` is valid. Here is the configuration file:
https://github.com/ubiquibot/conversation-rewards/blob/development/src/configuration/incentives.ts
How can I check that the configuration provided is valid against this?
One expensive idea is to consume the type with `o1-mini` and then have it post a comment if it thinks that it won't work. I suppose it would unfortunately have to consume the entire plugin codebase before determining this though.
It would be a bit cheaper if we standardize the location of the payload type checker file.
This seems like a bad approach to scale. We could consider making this a standalone plugin which is expensive to run, but we can allow partners to opt out if it's too expensive?
In conclusion I don't see a great cheap solution. I think we should handle it inside of the plugins as you are doing, and then in the future we automate plugin configuration using a GUI which can prevent misconfiguration? Perhaps we enforce a standard for plugin developers to follow for it to populate on the GUI and to be configurable?
I think running validation in each plugin is the cheapest way for now. For the GUI, it would be no problem with Worker plugins, as the response to a bad configuration would be instantaneous, but Actions would take around a minute to validate, which would be a bummer for a UI, so that's the part we should figure out. One thing that @whilefoo pointed out is serving the configuration schema through the manifest, which could be a solution if we find a way to compile the schema so that it contains every piece of info needed for the kernel to interpret it correctly.
Yeah, my first idea was that Worker plugins could serve the configuration schema through the manifest, but that's too cumbersome to write in the manifest, so instead they could just convert the TypeBox schema to JSON schema and send it over a dedicated endpoint. The problem is with Action plugins, because we can't get a fast and sync response.
@whilefoo Based on what you said, maybe there is an approach that would allow both Actions and Workers to have a fast response. TypeBox has a codegen tool which allows transforming from the Model to JSON. There is also a tool that allows transforming JSON to TypeBox. Codegen also supports Model -> TS and TS -> Model, but I feel less comfortable serving TypeScript files. However, the JSON tool is a CLI, not a package, so maybe it cannot be run within Workers.
For Workers, we could simply serve the JSON through an endpoint. For Actions, we could have a script automatically generate the JSON file on push events, so the kernel can simply download the file (which would always be at the same location for every plugin), which would be significantly faster than running an Action.
That can be a path I can explore as well.
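A sketch of that path, assuming the plugin re-exports its TypeBox configuration schema from a single module (the path and export name are illustrative). TypeBox schemas are designed to be JSON Schema compliant, so plain serialization may already be enough, although anything expressed through transform / decode logic would not survive it:

```typescript
// generate-schema.ts — hypothetical script an Action plugin could run on push to
// publish its configuration schema as plain JSON at a path the kernel knows about.
import { writeFileSync } from "node:fs";
import { pluginSettingsSchema } from "./src/configuration"; // assumed re-export of the plugin's TypeBox schema

// TypeBox models conform to JSON Schema, so serializing the schema object is all a JSON consumer needs.
writeFileSync("configuration-schema.json", JSON.stringify(pluginSettingsSchema, null, 2));
```

A Worker plugin could return the same serialized object from a dedicated route instead of committing a file.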
because we can't get a fast and sync response
Admin can make a config change, the Action runs and posts a comment on the commit a few minutes later BUT it tags the author, so they are notified.
I think this is acceptable as a first version if the better solutions can't be figured out.
@0x4007 @whilefoo It usually takes ~30s and it does tag the user.
I did some research regarding the usage of a JSON schema to describe the configuration. Here is what I found:

- `typebox-codegen` cannot be used because it generates code files, and Cloudflare forbids the use of `eval`, `exec`, `Function` and so on, so Workers cannot run arbitrary code.
- `ajv` relies on `Function` to read the schemas, so it cannot run within Workers (https://github.com/ajv-validator/ajv/issues/2318). There seems to be an alternative that I didn't try out yet: https://github.com/cfworker/cfworker/tree/main/packages/json-schema

So it is possible, but comes with drawbacks.

Functionality | JSON | Endpoint + Action |
---|---|---|
Instantaneous response | ✅ | ❌ |
Environment validation | ❌ | ✅ |
Decode validation | ❌ | ✅ |
Detailed errors | ✅ | ✅ |
Need for an extra build step | ✅ | ❌ |
Need for an extra Action script | ❌ | ✅ |

TL;DR: JSON is faster, Endpoint + Action give much more accurate errors, so I don't know which route we prefer.
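For reference, a rough sketch of what the kernel-side check could look like with the `@cfworker/json-schema` alternative mentioned above (the schema URL and the error formatting are illustrative):

```typescript
import { Validator } from "@cfworker/json-schema";

// Kernel side: validate one plugin's configuration chunk against the JSON schema
// fetched from the plugin (endpoint for Workers, committed file for Actions).
async function validatePluginConfig(schemaUrl: string, config: unknown): Promise<string[]> {
  const schema = await (await fetch(schemaUrl)).json();
  const validator = new Validator(schema);
  const result = validator.validate(config);
  return result.valid ? [] : result.errors.map((e) => `${e.instanceLocation}: ${e.error}`);
}
```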
@whilefoo you can make the decision
What do you mean by environment validation?
I think decode validation is not that important and also I think it's more secure if the kernel does validation and not the plugin which can access the configuration.
For Actions, we could have a script automatically generate the JSON file on push events, so the kernel can simply download the file (which would always be at the same location for every plugin), which would be significantly faster than running an Action.
How would the Action know where the configuration schema is located in the codebase? Unless the developer sets a path to the file and the name of the variable.
Environment validation means validating `process.env` values, which is only possible when validation happens on the plugin side.
The decode can be handy; a practical example is the plugin name, where "test" would be a valid string but does not properly represent a plugin URL nor an Action path. This is validated during decode, so it can only be picked up plugin side.
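A small sketch of that kind of decode-time check, assuming a recent TypeBox version with `Type.Transform`; the URL / path rules are made up for the example:

```typescript
import { Type } from "@sinclair/typebox";
import { Value } from "@sinclair/typebox/value";

// "test" passes a plain string check, but only the decode step can reject it
// as neither a Worker URL nor an "owner/repo" Action path.
const pluginReference = Type.Transform(Type.String())
  .Decode((value) => {
    if (/^https?:\/\//.test(value) || /^[\w-]+\/[\w.-]+$/.test(value)) {
      return value;
    }
    throw new Error(`"${value}" is neither a plugin URL nor an Action path`);
  })
  .Encode((value) => value);

Value.Decode(pluginReference, "https://plugin.example.workers.dev"); // ok
Value.Decode(pluginReference, "test"); // throws: only caught at decode time, not by the JSON schema
```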
For the path, we should just rely on a standard location the same way we do for the manifest. Or even have it appended within the manifest itself.
Standard location (root, sibling of manifest) seems simplest
@gentlementlegen, this task has been idle for a while. Please provide an update.
@whilefoo @0x4007 So we all agree that it should be changed to `JSON`? I'd like to be sure before modifying all the linked pull requests.
Endpoint + Action give much more accurate errors
Speed isn't as important as accuracy I think. I would rather one-shot get a list of all the problems to address in the next push.
But it is also not clear to me exactly how inaccurate the speedy approach is.
Inaccurate for two main reasons:

- no decode validation (a value like "test" would pass even though it is not a valid plugin reference)
- no environment validation (if `OPENAI_API_KEY` is missing, for example)

The JSON approach still seems better because it is cross-platform (could be reused in a frontend for example), and the two scenarios above might account for < 15% of errors made in the configuration. From my experience, I make more mistakes by doing wrong tabulations, missing some value in the configuration, or using outdated configuration settings due to plugin changes. And the JSON approach makes the code flow simpler and nearly identical for Actions and Workers, which makes it easier to maintain.
Okay then let's do JSON, and then maybe a future task can be a slower one to run async in the background: "advanced checker"
Will change and test the Cloudflare Worker JSON validator, since `ajv` cannot run within Workers. It will also be handy to have this one in, since the issue has plenty of pull requests linked: https://github.com/ubiquity-os-marketplace/conversation-rewards/pull/132
Another benefit is that doing validation with JSON avoids using KV
@0x4007 I was wondering why the generation was not running, but it fails every time with:
{
"message": [
"Error in event handler",
"Error: No installation found for owner: ubiquity-os-marketplace"
],
"level": "error",
"timestamp": 1728029469013
}
because `ubiquity-os-kernel-beta` does not exist on the marketplace, I guess.
I can add it to the marketplace then
! Failed to run comment evaluation. HttpError: Resource not accessible by integration - https://docs.github.com/rest/repos/repos#create-a-repository-dispatch-event
! Failed to run comment evaluation. Error: 429 Request too large for gpt-4o in organization org-pwbJ3kbX0h3aoWPTIZrBTgfF on tokens per min (TPM): Limit 30000, Requested 30646. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.
! Failed to run comment evaluation. Error: 429 Request too large for gpt-4o in organization org-pwbJ3kbX0h3aoWPTIZrBTgfF on tokens per min (TPM): Limit 30000, Requested 30646. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.
@0x4007 Seems that we hit the `gpt-4o` token limit here. This might be because both `development` and `production` ran at the same time, doubling the volume of tokens sent.
! Failed to run comment evaluation. Error: 429 Request too large for gpt-4o in organization org-pwbJ3kbX0h3aoWPTIZrBTgfF on tokens per min (TPM): Limit 30000, Requested 31871. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.
Given that the context window is 128,000, it doesn't make sense that we're being limited to 30,000; this might be a setting I can change.
Just realized, the Ubiquity DAO organization is tier 1 (lowest limits) while my personal account is tier 4. I suppose it would be ideal to boost the Ubiquity DAO tier by actually starting to use it.
https://platform.openai.com/docs/guides/rate-limits/usage-tiers?context=tier-two
I just spent another $10 to bring us to tier 2 territory which should be better!
In the bot v1, when the configuration is changed, annotations are added to the configuration file if any error is encountered within the configuration. It would be really nice to have for v2 as it is a common scenario to have an invalid configuration and have no feedback about it because the error would only be visible within the worker logs.
The main challenge is that only plugins are aware of their configuration shape, so probably the kernel should call every plugin to check their validity, which would require an endpoint or some access to the configuration validators. At the same time, the `manifest.json` validity could be checked, relating to https://github.com/ubiquity/ubiquibot-kernel/issues/78. Since this might imply quite a few network requests, we can also consider having this functionality as a separate Worker using service binding.
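A hypothetical sketch of the kernel-side flow described above, assuming each plugin exposes a validation endpoint; the route, entry shape, and response shape are all illustrative:

```typescript
// For each plugin entry in the parsed configuration, ask the plugin to validate its
// own settings, then surface everything that came back as commit annotations / a
// comment tagging the author.
interface PluginEntry {
  endpoint: string; // Worker URL of the plugin (illustrative)
  settings: unknown; // the chunk of the organization configuration addressed to this plugin
}

async function validateConfiguration(plugins: PluginEntry[]): Promise<string[]> {
  const errors: string[] = [];
  for (const plugin of plugins) {
    const response = await fetch(`${plugin.endpoint}/validate`, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(plugin.settings),
    });
    const result = (await response.json()) as { valid: boolean; errors: string[] };
    if (!result.valid) errors.push(...result.errors.map((error) => `${plugin.endpoint}: ${error}`));
  }
  return errors; // the caller turns these into annotations or a commit comment
}
```

The request count grows with the number of configured plugins, which is what motivates moving this into a separate Worker reached through a service binding.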