lycheeverse / lychee

⚡ Fast, async, stream-based link checker written in Rust. Finds broken URLs and mail addresses inside Markdown, HTML, reStructuredText, websites and more!
https://lychee.cli.rs
Apache License 2.0

lychee.toml JSON schema #1382

Open o-az opened 8 months ago

o-az commented 8 months ago

Is there a JSON schema for the lychee.toml config file? I pass schemas for TOML config files to taplo to get autocomplete: https://taplo.tamasfe.dev/configuration/directives.html#the-schema-directive

mre commented 8 months ago

There is not, and I didn't know that was a thing, but I'd be thankful for a pull request to add one.

o-az commented 8 months ago

I'll happily contribute. Is there an example config file that uses all fields possible?

mre commented 8 months ago

Hm, the most complete one is probably https://github.com/lycheeverse/lychee/blob/master/fixtures/configs/smoketest.toml. Would that work? Alternatively, all the options are here: https://github.com/lycheeverse/lychee/blob/13f4339710d76831d9daf961584d796cee4847d2/lychee-bin/src/options.rs#L152

bollwyvl commented 1 month ago

Perhaps schemars could be used to get this more or less "for free," in that future additions to Config would automatically be added to the schema.

mre commented 1 month ago

Oh yeah, that's definitely nice. How would we build the schema? We could have a separate build target for that, which would store the schema in a file:

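// Generate the JSON schema from the Config type and pretty-print it.
let schema = schema_for!(Config);
let json_output = serde_json::to_string_pretty(&schema).unwrap();
println!("{}", json_output);

// Store the same JSON in a file.
let mut file = File::create("lychee-schema.json").expect("Unable to create file");
file.write_all(json_output.as_bytes()).expect("Unable to write data");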

Ideally, we would do that automatically during the build process, so that the schema can't get out of sync with the code. Not sure if we could do that with a build.rs; it gets hairy because we'd have to use include! to get access to the types from the build.rs. Any other ideas?

bollwyvl commented 1 month ago

Sadly I know less about Rust than about schemas!

A test-driven approach might be:

Another test could verify that everything has the amount of documentation that would be useful to a user, e.g. every property has a description, and every string has at least a format, enum, pattern or example.
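One way a test could guard against the schema drifting away from the code is to regenerate it and compare it with the checked-in copy. The following is only a sketch under assumptions: `generate_schema` is a hypothetical stand-in for the real generator (presumably schemars output serialized to JSON), and the checked-in text is inlined here rather than read from a file such as docs/schema.json.

```rust
/// Hypothetical stand-in for the real generator. In lychee this would
/// presumably serialize the schemars output for `Config` (assumption).
fn generate_schema() -> String {
    String::from("{\n  \"title\": \"Config\",\n  \"type\": \"object\"\n}\n")
}

/// Normalize line endings and trailing whitespace so the comparison
/// does not fail on platform differences alone.
fn normalize(text: &str) -> String {
    text.replace("\r\n", "\n").trim_end().to_string()
}

/// Returns Ok(()) if the checked-in schema matches the generated one,
/// otherwise an error message suitable for a test failure.
fn check_schema_in_sync(generated: &str, checked_in: &str) -> Result<(), String> {
    if normalize(generated) == normalize(checked_in) {
        Ok(())
    } else {
        Err(String::from("schema is out of date; regenerate and commit it"))
    }
}

#[test]
fn checked_in_schema_is_up_to_date() {
    // A real test would read the committed file instead, e.g. with
    // std::fs::read_to_string; the path is project-specific.
    let checked_in = "{\r\n  \"title\": \"Config\",\r\n  \"type\": \"object\"\r\n}\r\n";
    check_schema_in_sync(&generate_schema(), checked_in).unwrap();
}
```

Run under `cargo test`, the test fails as soon as a change to the config type alters the generated schema without the committed file being updated.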

Again from a user perspective, it's pretty important that the schema file ends up somewhere publicly hosted and versioned, but https://raw.githubusercontent.com/lycheeverse/lychee/v0.15.1/docs/schema.json isn't the worst.

roberth commented 1 month ago

Alternatively, you could add a hidden (or not) subcommand to the CLI to print out the schema, and then call that in your documentation site build or release process.
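A rough, standard-library-only sketch of that idea follows. The flag name `--dump-config-schema` is hypothetical (it is not an existing lychee option), `schema_json` is a placeholder for the real generator, and lychee's actual argument parsing would look different.

```rust
/// What the CLI should do for a given argument list.
#[derive(Debug, PartialEq)]
enum Action {
    /// Print the JSON schema and exit.
    DumpSchema,
    /// Proceed with normal link checking.
    CheckLinks,
}

/// Hypothetical flag name, not an existing lychee option.
const DUMP_SCHEMA_FLAG: &str = "--dump-config-schema";

/// Decide what to do from the raw arguments (program name excluded).
fn parse_action(args: &[String]) -> Action {
    if args.iter().any(|arg| arg.as_str() == DUMP_SCHEMA_FLAG) {
        Action::DumpSchema
    } else {
        Action::CheckLinks
    }
}

/// Placeholder for the real schema generator (assumption).
fn schema_json() -> String {
    String::from("{\"title\":\"Config\",\"type\":\"object\"}")
}

/// Returns the text to print for the chosen action, if any.
fn output_for(args: &[String]) -> Option<String> {
    match parse_action(args) {
        Action::DumpSchema => Some(schema_json()),
        Action::CheckLinks => None,
    }
}
```

A `main` function would collect `std::env::args().skip(1)` into a `Vec<String>`, print the result of `output_for` when it is `Some`, and otherwise continue with the normal run. The docs build or release process could then redirect that output into the published schema file.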

If you don't want to add such infrastructure, I think @bollwyvl's test approach is pretty good.

Oh, or you could do schema-first, generating Rust code for the schema. Or maybe that would be a lossy refactor; I'm not familiar with the code.

bollwyvl commented 1 month ago

call that in your documentation site build

As long as this resulted in a permanent URL for a given version, yes, that's better branding:

"$schema" = "https://lychee.cli.rs/schema/v0.15.1/schema.json"

is prettier than:

"$schema" = "https://raw.githubusercontent.com/lycheeverse/lychee/v0.15.1/docs/schema.json"

Either way, having it checked in is really important.
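To make the "permanent URL per version" idea concrete, here is a small sketch that derives the URL from a release version. The base URL is taken from the example above; whether the docs site would actually serve schemas at that path is an assumption, and `schema_url` is a hypothetical helper.

```rust
/// Base URL from the example in this thread; that the docs site serves
/// schemas at this path is an assumption.
const SCHEMA_BASE_URL: &str = "https://lychee.cli.rs/schema";

/// Build the per-release schema URL. Accepts "0.15.1" as well as "v0.15.1".
fn schema_url(version: &str) -> String {
    // Drop a single leading 'v' so both spellings give the same URL.
    let bare = version.strip_prefix('v').unwrap_or(version);
    format!("{}/v{}/schema.json", SCHEMA_BASE_URL, bare)
}
```

Inside a Cargo build, the version could come from `env!("CARGO_PKG_VERSION")`, so the URL embedded in docs or generated files always matches the release being built.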

schema-first

Again, I know very little Rust beyond cargo build, but typify claims to do this. I think that use case is geared more toward situations where you already have a schema to which you need to conform, e.g. an OpenAPI contract. The complexity of actually validating against a schema might be higher than needed for a config file.