codex-storage / nim-codex

Decentralized Durability Engine
https://codex.storage
Apache License 2.0

Explicit limitations #789

Open benbierens opened 6 months ago

benbierens commented 6 months ago

Many of our API values have implicit limits: nodes, tolerances, durations, expiries, max slot sizes, etc. It would be very nice if the application itself could report these limits directly through the API somehow, so that frontend builders can query accurate, up-to-date limits for the values accepted by their version of the Codex node.

AuHau commented 5 months ago

Honestly, the more I dig around the Marketplace spec and other documents and run into these "limits for things", the more I like your idea of exposing these limits through the API, because there are A LOT of these "magic constant limits"...

gmega commented 5 months ago

Agreed this is a pain and needs a solution. One way to do it would be encoding it into the OpenAPI spec, which has the advantage of being a standard, machine-readable format. For simple constraints you can just hardcode them in the yaml, for more complex (e.g. dynamic) ones you can enrich the yaml with current numbers as you serve it over the openapi endpoint.

AuHau commented 5 months ago

Well, the beauty of the API solution is that you already have these limits somewhere in the codebase as constants, which are actually the limits that are used, so exposing these constants guarantees you that the limits are actually the current ones.

Using OpenAPI for this would make it just another place that you would have to update when updating the constants, and most probably you would actually forget about that...

gmega commented 5 months ago

We shouldn't mix up the fact that we currently have a static OpenAPI spec with the fact that OpenAPI is a standard API metadata format, and can be highly dynamic. To make it into an actual API clients can consume, you just need to expose it over an endpoint. Modern web frameworks (e.g. FastAPI) can generate it completely from code. Absent that, you can still easily inject the required bits from code as part of the build (to keep constants in sync), or as you serve the requests (if you need really dynamic stuff).
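As a rough sketch of the "inject the required bits from code" idea (all names and values below are hypothetical, not actual Codex constants or schema names), the served spec could be patched with the constants the node actually enforces before it goes out over the endpoint:

```python
import json

# Hypothetical constant, standing in for wherever the real limit lives in code.
MAX_NODES = 1024

# Minimal fragment of a static spec, as it might look after parsing openapi.yaml.
spec = {
    "components": {"schemas": {"StorageRequest": {"properties": {
        "nodes": {"type": "number", "minimum": 1},
    }}}},
}

def inject_constraints(spec: dict) -> dict:
    """Enrich the static spec with limits taken from code, so the spec the
    metadata endpoint serves always matches what the node actually enforces."""
    nodes = spec["components"]["schemas"]["StorageRequest"]["properties"]["nodes"]
    nodes["maximum"] = MAX_NODES
    return spec

served = inject_constraints(spec)
print(json.dumps(served["components"]["schemas"]["StorageRequest"]["properties"]["nodes"]))
```

Because the injection reads the same constants the validation code uses, the spec can't drift out of sync the way a hand-maintained yaml can.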

benbierens commented 5 months ago

One thing that's not helping is that many of our numeric parameters are modeled as strings. I would guess this is because we need to support UInt256:

        minPrice:
          type: string
          description: Minimum price to be paid (in amount of tokens) as decimal string

In my view, the 'ultimate' solution would look something like this: OpenAPI.yaml:

        nodes:
          type: number
          description: Minimal number of nodes the content should be stored on
          default: 1
          minimum: 1
          maximum: 1024

minimum and maximum are available for numbers in the OpenAPI spec and are inclusive ranges by default. For strings, there are minLength, maxLength, various default formats, and the option to specify a regex to constrain them.
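Applied to the minPrice example above, a string-typed field could be constrained along these lines (a sketch, not the actual spec):

```yaml
        minPrice:
          type: string
          description: Minimum price to be paid (in amount of tokens) as decimal string
          pattern: '^[0-9]+$'   # decimal digits only
          maxLength: 78         # the largest UInt256 has 78 decimal digits
```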

The OpenAPI.yaml would be automatically generated from code by the HTTP server library we use (that's nim-presto, I think, and I'm pretty sure it can't currently do this). It would be stored in the repo exactly like it currently is, AND there would be an API call on Codex that returns the OpenAPI.yaml. If there's a different spec that's better suited to 'live reflecting' of the API, that sounds fine to me.

I cannot agree more that it's becoming increasingly important to have a single source of truth w.r.t. our API. Ideally that'd be the code, because compatibility would be guaranteed. But if this is outside our technical reach right now, perhaps the OpenAPI.yaml could be leading, and we could generate our nim server implementation of the spec instead. (Perhaps the integration tests could generate their client implementation from the yaml and help check it for errors that way, dist-tests already do this.)

What do you think we can and should do first? What's within reasonable tech-reach? What would give enough of a benefit?

AuHau commented 5 months ago

I haven't really considered OpenAPI in the way you've described it, i.e., being dynamically served over HTTP. It's an intriguing idea, but IMHO it does have a few caveats.

To me, the most significant downside is the complexity inherent in an implementation as Ben described. I expect that it will require a substantial amount of effort to implement, which IMHO shouldn't be our priority at the moment.

We could adopt a more flexible strategy. For instance, we might not need 'nim-presto' to generate OpenAPI based on route definitions (although that would be the ultimate goal). We could maintain the current OpenAPI.yaml that we're manually updating and insert templated placeholders for the constants (for example, {{STORAGE_REQUEST_DURATION_LIMIT}}), which the 'return OpenAPI' endpoint handler could replace.
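That substitution step could be quite small. A minimal sketch (the placeholder name follows the example above; the constant's value and the spec fragment are invented for illustration):

```python
# Constants as the 'return OpenAPI' handler would see them; the value here
# is invented, the real one would come from the node's own code.
CONSTANTS = {"STORAGE_REQUEST_DURATION_LIMIT": "2592000"}

SPEC_TEMPLATE = """\
duration:
  type: number
  description: Request duration in seconds
  maximum: {{STORAGE_REQUEST_DURATION_LIMIT}}
"""

def render_spec(template: str, constants: dict) -> str:
    # Replace each {{NAME}} marker with the constant actually used in code.
    for name, value in constants.items():
        template = template.replace("{{" + name + "}}", value)
    return template

rendered = render_spec(SPEC_TEMPLATE, CONSTANTS)
print(rendered)
```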

Another issue I see is that I don't want to communicate only "limits to parameters". I'd also like to convey specific network parameters so that, for instance, the UI could use them. One desired feature is exposing the configuration values of the Marketplace smart contracts. The UI could use this information to provide details about slashing, such as "You've missed two proofs and have only three left before slashing will occur" or "You can be slashed three more times before losing your slot." This kind of communication isn't entirely feasible without hardcoding these values into the UI from the deployed smart contract information (which can change), or having the node return this information through an API. This sort of thing IMHO is not really suited for OpenAPI...
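To make the slashing example concrete, the UI-side computation would be trivial once the node exposes the contract configuration; the field names below are made up for illustration, the real ones would come from the deployed Marketplace contract:

```python
# Hypothetical Marketplace config as a node endpoint might return it.
marketplace_config = {"maxNumberOfSlashes": 5}  # slot is lost after this many
slot_state = {"slashes": 2}                     # slashes incurred so far

# "You can be slashed N more times before losing your slot."
remaining = marketplace_config["maxNumberOfSlashes"] - slot_state["slashes"]
print(f"You can be slashed {remaining} more times before losing your slot.")
```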

gmega commented 5 months ago

Wait, let's not mix up API metadata with the API itself. :-) If you want to describe what's in a smart contract, you'll need an API for that. This API will have parameters and a response format. The schema, attributes, meanings, and some constraints for these parameters and response format can be described in OpenAPI. If you have extra constraints that need to be sent to the frontend that it needs to present to users (e.g. max number of slots), you could also use OpenAPI for that.

I'm also not suggesting we start by implementing this in presto; YAML is just a superset of JSON. You can parse the base spec you wrote manually, add the constraints from code, and serve it over the API metadata endpoint. Or you can use a template, as you said. It doesn't have to be a herculean effort.

Anyhow, I think we need to look at the actual use cases to see what makes sense. If you're describing a resource, then yes, that's an API endpoint. If you're describing constraints on the API that accesses that resource and on the data representation of that resource, then probably you want to do that using OpenAPI so you don't reinvent a non-interoperable subset of it.

emizzle commented 5 months ago

I really like the idea of serving out the OpenAPI yaml with templated placeholders that are dynamically replaced. If we ever get time, we can dynamically generate the entire OpenAPI spec, but at this point, I think the manual + template placeholders is a good compromise.

I'm on the fence about returning Marketplace config information, e.g. "you have two slashings left". On one hand, this info is freely available by directly querying the contract; on the other hand, the Codex node could proxy the query to the contract and compute some values to return, with the computed values being useful information for node operators or frontend sites like block explorers.

As this is not integral to the functioning of a Codex node, part of me thinks we could develop these computed values in a JS library that consumes Codex endpoint info and contract info and computes useful information. That library could then be consumed by any frontend wanting to display or use this info.