rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.59k stars 2.39k forks source link

Expand manifest validation for all settings #4377

Open behnam opened 7 years ago

behnam commented 7 years ago

There are some practical difficulties arising from the fact that manifest files are not always validated. I want to collect such cases, so we can improve the user experience, specially for newbies, by showing warnings about bad manifest entries, as soon as possible.

One important fact here is that we almost never want to throw an error on invalid entries, because it will affect forward-compatibility: would make many future changes non-backward-compatible.

Some Examples

I think almost every config item can have some sort of validation. This list only presents possible validation methods for different types of values.

behnam commented 7 years ago

cc @alexcrichton, @carols10cents

alexcrichton commented 7 years ago

Seems plausible to me! All this validation happens on crates.io though which I'd prefer to not duplicate, although we can perhaps do simple things like validate homepage is a URL locally for sure. (also agreed with warnings-for-now)

behnam commented 7 years ago

Some more background and my rational for this proposal: The other day, for a new release for UNIC, I was updating a dozen of manifests, mostly their metadata. Since it's not vise to package/publish with dirty repository, I had to commit after each publish error, and try publishing again. This resulted in a handful of commits just catching all the small nits from the publish command, like "don't use space in the keywords list".

Agreed that we better have one source of truth. However, I also believe we can benefit a lot by moving this one source of truth to a library that can be used in client side (cargo CLI, clippy, etc), as well as server side (crates.io, etc). I call it the cargo-manifest crate, which I think should spin out of cargo proper (which hosts all the CLIs, etc). This would allow cargo, crates.io and third-party CLIs to share the same manifest read/validation logic, without the need to have cargo execution details imported as dependency.

What do you think?

Boddlnagg commented 7 years ago

https://github.com/rust-lang/crates.io/issues/7250 is also related to this (see point 2.)

Yatekii commented 1 year ago

This is a painpoint that happens often to me.

Would it be possible to implement a preflight like endpoint in crates.io to do the validation before the publish?

I would love to work on this.

weihanglo commented 1 year ago

Would it be possible to implement a preflight like endpoint in crates.io to do the validation before the publish?

The solution should come from a discussion with crates.io team. I don't have the answer to it, but Zulip t-crates.io is a better place to have such a chat with crates.io team I believe.

Rustin170506 commented 10 months ago

The solution should come from a discussion with crates.io team. I don't have the answer to it, but Zulip t-crates.io is a better place to have such a chat with crates.io team I believe.

Would it be possible to implement a preflight like endpoint in crates.io to do the validation before the publish?

In our previous crates.io team meeting, we expressed concerns regarding the potential performance issues that could arise from introducing this validation to the publish API. The API is resource-intensive as it requires unzipping the tarball to retrieve metadata, which could also lead to the consumption of your rate limit.

Thus, we're inclined first to unify the validation rules of Cargo and crates.io. This would be a good initial step, and I am currently working on this.

See: https://github.com/rust-lang/crates.io/pull/7379

Yatekii commented 10 months ago

Even if the rules are unified, a divergence between crates.io and cargo can occur if the cargo version is outdated.

I fail to see how uploading a bit of metadata via some simple API request is a problem. Admittedly I do not know how the API looks like at all.

epage commented 10 months ago

In going through cargo's triage (or talking to people about it), I remember seeing somewhere that some differences were intentional.

An alternative route to go, once we have #12235, is to have the crates-io specific validation rules in a crate and a cargo linter would expose lints for crates-io and "all registries" published packages that don't conform to those validation rules. During the times where crates-io is out of sync with someone's MSRV, they can allow certain lints.

epage commented 10 months ago

I fail to see how uploading a bit of metadata via some simple API request is a problem. Admittedly I do not know how the API looks like at all.

For the publish workflow, crates.io has been shifting away from "trust the metadata" to "parse Cargo.toml". That is what is expensive. Yes, there are a lot of design decisions that can be made, with various trade offs, that might get us to having server-side validation.

For example,

Yatekii commented 10 months ago

Yeah I see that sending just an extract might be error prone.

Also, upon further reflection, I think just syncing the two checkers or at least having a crates.io-oriented validation crate would be enough. For people that do not want to keep up with the release cadence of cargo it will fail, but for everyone else this is perfect :)

epage commented 8 months ago

One thought on how to solve this

One question is when should these lints fire. With publish = true (which is default though we've talked about changing that in #6153), all registries are matched. At least with #12786, we have truly private packages, more easily. So do we emit them on all commands, even if they won't apply? Do we reserve this for package and publish but then people won't find out until its too late? Do we do this in cargo's linter that we've talked about? imo no real solid "this is the way" answer.

In the lint, we should