Open wildum opened 3 months ago
Can you post a snippet of some of the more complex components? windows_exporter has to have one of the largest configs.
This generated version does not contain the "unique" field for blocks. I don't know whether we should add it to the format. It's not useful for autocompletion, but it could be useful for validation and visual scripting.
I'm happy with this, and I'm very excited to see how we can use it for autogenerating docs. I wonder how we are going to treat pointer attributes and blocks. Atm, if a block is a pointer, then its defaults won't actually be set. They'll only be set if it's not a pointer. We don't document this very clearly, and this is a chance to iron out such issues in the docs.
It might be helpful if we make our schema look like a standard json schema, but it's not a must.
One of the main issues with the current proposal is that it doesn't include any sort of validation for config attributes. This could reduce the usefulness of the schema for validating configs. Should we include such a feature?
For example, JSON Schema has number validations such as "multipleOf": 0.01, "minimum": 0, etc.
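To make the two keywords above concrete, here is a minimal Python sketch of what checking a numeric attribute against "minimum" and "multipleOf" could look like. The function name check_number and the shape of the constraints dict are assumptions made for illustration; they are not part of the proposal or of any Alloy API, and a real tool would more likely rely on an existing JSON Schema library.

```python
from decimal import Decimal


def check_number(value, constraints):
    """Return a list of violations of JSON-Schema-style numeric keywords.

    `constraints` is a hypothetical dict such as
    {"multipleOf": 0.01, "minimum": 0}.
    """
    errors = []
    # Convert through str() so that 0.05 becomes Decimal("0.05") instead of
    # the long binary expansion of the float.
    number = Decimal(str(value))

    if "minimum" in constraints:
        minimum = Decimal(str(constraints["minimum"]))
        if number < minimum:
            errors.append(f"{value} is less than minimum {minimum}")

    if "multipleOf" in constraints:
        step = Decimal(str(constraints["multipleOf"]))
        if number % step != 0:
            errors.append(f"{value} is not a multiple of {step}")

    return errors
```

Decimal is used instead of float arithmetic because a check like 0.05 % 0.01 is unreliable with binary floating point.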
> One of the main issues with the current proposal is that it doesn't include any sort of validation for config attributes. This could reduce the usefulness of the schema for validating configs. Should we include such a feature?
That's a good point, but I think that for almost all numbers the constraints are obvious (e.g. timeout should not be negative). Where the constraints might help the most is for strings when only specific values are allowed (e.g. the role in discovery.dockerswarm must be either "services", "nodes" or "tasks").
Not sure if this should be part of the first version
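As a rough illustration of the string case, the sketch below shows how a metadata entry could carry the allowed values for role and how a config tool could check against them. The "allowed_values" key and the is_allowed helper are hypothetical; the proposal does not define such a field.

```python
# Hypothetical metadata entry for the "role" attribute of discovery.dockerswarm.
# The "allowed_values" key is an assumption, not part of the proposed format.
ROLE_ATTRIBUTE = {
    "name": "role",
    "type": "string",
    "allowed_values": ["services", "nodes", "tasks"],
}


def is_allowed(attribute, value):
    """Return True if `value` is permitted for `attribute`.

    Attributes without an "allowed_values" list accept any value.
    """
    allowed = attribute.get("allowed_values")
    return allowed is None or value in allowed
```

The same list could also feed autocompletion, so the field would be useful even without full validation.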
> It might be helpful if we make our schema look like a standard json schema, but it's not a must.
Would you then use it to check if the generated components file is correct according to the schema? It's an interesting idea, but I think I would only use it for testing. Or would you use it differently?
> I wonder how we are going to treat pointer attributes and blocks. Atm, if a block is a pointer, then its defaults won't actually be set. They'll only be set if it's not a pointer. We don't document this very clearly, and this is a chance to iron out such issues in the docs.
I think that this does not apply to this file. We should treat pointers the same way as values for attributes and blocks, because there is no difference between the two in the config. The generated file should be used for config tools. The underlying logic, such as how the defaults are applied for pointers, should be described in the documentation. If we generate the docs, there could be a warning for the pointer types, but I wouldn't include it in this file.
Couple of points:
make generate and have a test that will fail if the generated JSON is not up-to-date? This would be similar to the related components section generation in the docs right now.
@thampiotr thanks for the suggestions, added everything.
Actually, I think we should evaluate json schema in a bit more detail first. Last week I OK'd the proposal as it is, because I felt pessimistic that upstream's schematisation project would complete soon, due to technical difficulties with generating code that's not too dissimilar. But I've since been able to overcome a few hurdles with the code, and now I feel much more optimistic. Upstream's maintainers are also receptive to having the feature.
> Would you then use it to check if the generated components file is correct according to the schema? It's an interesting idea, but I think I would only use it for testing. Or would you use it differently?
Using json schema has a few advantages:
otelcol ones).
It would be nice if we could explore further whether json-schema is sufficient for our purposes and, if it is not, understand why.
Note that it's also possible to extend the schema with parameters which are not in the spec. It's a common problem, and it's likely that OTel will do this upstream to support "secret" strings:
password:
  type: string
  alias: configopaque.String
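A small Python sketch of how a tool might consume such an extension keyword: standard validators ignore unknown keywords like alias, while a config tool can read it to find secret fields. The username property, the SECRET_ALIASES set and the secret_properties helper are assumptions added for illustration; only the password entry comes from the snippet above.

```python
import json

# The "password" entry mirrors the snippet above; "username" is a made-up
# sibling property so that the filter has something to exclude.
SCHEMA_PROPERTIES = json.loads("""
{
  "password": {"type": "string", "alias": "configopaque.String"},
  "username": {"type": "string"}
}
""")

# Aliases that this hypothetical tool treats as secrets.
SECRET_ALIASES = {"configopaque.String"}


def secret_properties(properties):
    """Return the sorted names of properties whose alias marks them as secret."""
    return sorted(
        name
        for name, spec in properties.items()
        if spec.get("alias") in SECRET_ALIASES
    )
```

A UI could use the result to mask those fields, without the schema itself needing a dedicated "secret" type.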
I tried generating a JSON Schema from the component example that I put above. Is this what you have in mind?
I would be in favor of using the JSON schema. It's a bit wordy, but in general this should be consumed by tools and not directly by people.
> Is this what you have in mind?
@wildum More or less yes. An Alloy block would have to be an object. I'm not sure if the example in your comment is exactly how the schema needs to be, but it certainly looks like a good start.
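To illustrate the "block as object" idea, here is a hedged Python sketch that converts a simple block description into a JSON-Schema-style object. The input shape (name, attributes, blocks, required) is a hypothetical stand-in for whatever the generated file ends up containing, and block_to_schema is not an existing function anywhere.

```python
def block_to_schema(block):
    """Convert a hypothetical block description into a JSON-Schema-style object.

    Attributes become typed properties; nested blocks become nested objects.
    """
    properties = {}
    required = []

    for attr in block.get("attributes", []):
        properties[attr["name"]] = {"type": attr["type"]}
        if attr.get("required"):
            required.append(attr["name"])

    for child in block.get("blocks", []):
        # A nested block is itself an object, so recurse.
        properties[child["name"]] = block_to_schema(child)

    schema = {"type": "object", "properties": properties}
    if required:
        schema["required"] = required
    return schema


# Made-up block used only to exercise the function.
EXAMPLE_BLOCK = {
    "name": "example_block",
    "attributes": [
        {"name": "url", "type": "string", "required": True},
        {"name": "retries", "type": "number"},
    ],
    "blocks": [
        {"name": "auth", "attributes": [{"name": "username", "type": "string"}]},
    ],
}
```

Calling block_to_schema(EXAMPLE_BLOCK) yields an object schema whose auth property is a nested object, which is the nesting the comment above describes.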
I'm not fully sold on the JSON schema idea. Schemas are used to validate files, but this is not what we are looking for in this proposal: we want to provide component metadata to populate config tools.
Although it provides some structure to the JSON file, I feel like we are misusing the concept, making it confusing. As a config tools builder, I would prefer to have a simple interface description such as the one defined in the proposal. It's easy to implement in any language and does not require additional knowledge.
Even if the file should be consumed by tools and not people, it's still useful sometimes to check through the data for debugging purposes and the JSON schema style is harder to read than the other style.
I think that validation is interesting but it's a big topic with a lot of unknowns. I would prefer to keep it out of scope for this proposal.
I don't necessarily want to close the door on the JSON schema. If we go all the way with JSON schema in OTel and Alloy, then it might be worth considering here.
So I suggest that with this proposal we only commit to exposing, in the Git repo, a generated metadata JSON file containing the component data described in the PR. The structure of the data (whether it follows JSON schema semantics or not) will be defined when we start discussing how we want to generate the data.
I'd be happy if we proceed without using json schema. The solution proposed here seems easy enough to build and could unblock the development of various tooling. It'd be interesting to see what tools will be built for it, and what constraints they hit - then we can evaluate if we need to change our approach.
Background
A parsable file representing Alloy components would enable the creation of config tools such as autocompletion or visual scripting.
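As a sketch of the autocompletion use case, the Python snippet below loads a tiny stand-in for the generated file and returns component names matching a prefix. The top-level "components" key and the "name" field are assumptions, since the exact structure is left to a later discussion; the three component names are only sample data.

```python
import json

# Tiny stand-in for the generated file. The layout is hypothetical.
COMPONENTS_JSON = """
{
  "components": [
    {"name": "discovery.dockerswarm"},
    {"name": "discovery.docker"},
    {"name": "prometheus.scrape"}
  ]
}
"""


def complete(metadata, prefix):
    """Return the sorted component names that start with `prefix`."""
    return sorted(
        component["name"]
        for component in metadata["components"]
        if component["name"].startswith(prefix)
    )


metadata = json.loads(COMPONENTS_JSON)
suggestions = complete(metadata, "discovery.")
```

An editor plugin would run the same lookup on every keystroke, which is one reason a flat, easily parsed file is attractive.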
The concept is very similar to the one from @thampiotr that was abandoned: https://github.com/grafana/agent/pull/5863
This proposal focuses on the generated file. Another proposal will address how the file will be generated if this proposal is accepted.
Proposal
I propose to have a generated JSON file representing Alloy components that exposes the following data:
I tested with a generator based on Alloy documentation and the generated JSON file was 1.1MB (33k lines). I picked JSON for its parsing speed, memory efficiency, and native support in JS (I expect most tools like autocomplete and visual scripting to use JS).
The file will be generated via the make generate command. Tests will be added to check that the file is up to date, in a similar way to the related components section generation. The file will be available directly in the GitHub repo for users to download (they can download the different versions on the release branches).
The file could additionally be generated via the command line with the Alloy binary.
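The up-to-date check could work roughly as in the following Python sketch: regenerate the content in memory and compare it with the committed file. The actual test would presumably live in Alloy's own test suite; the function names, the generate callback and the serialization settings here are all assumptions.

```python
import json
from pathlib import Path


def render(generate):
    """Serialize the generator output deterministically (sorted keys, fixed indent)."""
    return json.dumps(generate(), indent=2, sort_keys=True) + "\n"


def write_generated(committed_path, generate):
    """Write the generated metadata, as a 'make generate' style step might."""
    Path(committed_path).write_text(render(generate), encoding="utf-8")


def is_up_to_date(committed_path, generate):
    """Return True if the committed file matches a fresh generation."""
    path = Path(committed_path)
    return path.exists() and path.read_text(encoding="utf-8") == render(generate)
```

Deterministic serialization matters here: without sorted keys and a fixed indent, the comparison could fail even when the underlying data has not changed.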
All components and config blocks should be described in the file.