Manishearth / namespacing-rfc

RFC for Packages as Optional Namespaces
47 stars 3 forks source link

Drawback: typosquatting #3

Open Manishearth opened 3 years ago

Manishearth commented 3 years ago

The current proposal allows for dash typosquatting

This proposal does not prevent anyone from taking foo-bar after you publish foo/bar. Given that the Rust crate import syntax for foo/bar is foo_bar, same as foo-bar, it's totally possible for a user to accidentally type foo-bar in Cargo.toml instead of foo/bar, and pull in the wrong, squatted, crate.

We currently prevent foo-bar and foo_bar from existing at the same time. We could do this here as well, but it would only go in one direction: if foo/bar exists foo-bar/foo_bar cannot be published, but not vice versa. This limits the "damage" to cases where someone pre-squats foo-bar before you publish foo/bar, and the damage can be mitigated by checking to see if such a clashing crate exists when publishing, if you actually care about this attack vector. There are some tradeoffs there that we would have to explore.

One thing that could mitigate foo/bar mapping to the potentially ambiguous foo_bar is using something like foo::crate::bar or ~foo::bar or foo::/bar in the import syntax.

carols10cents commented 3 years ago

Ok I haven't thought through the complete implications of this yet; will think some more later. But what if we introduce a new syntax in Cargo.toml that will map to "foo/bar" (and old Cargos can use "foo/bar" to depend on crates like this, I feel like supporting old cargos is probably where this idea is going to fall down but stick with me for a second)

either something like:

[dependencies]
serde = { sub_crate = "json", version = "1.0" }

or:

[dependencies.serde]
version = "1.0" # to depend on the parent crate, omit this if you aren't depending on the parent
json = "1.0" # to depend on the serde/json crate

to make it more obvious and explicit and different when your intention is to use a sub-crate?

Manishearth commented 3 years ago

@carols10cents I don't see how this proposal fixes typosquatting, in essence the crate will still be serde_json in rust code. serde/json vs serde = { subcrate = json } both seem to have the same amount of typosquatting issues, except in the latter you have to type a lot more and people need to

[dependencies.serde]
version = "1.0" # to depend on the parent crate, omit this if you aren't depending on the parent
json = "1.0" # to depend on the serde/json crate

We can't do this because this locks us out of ever adding more keys to the dependency spec format

Manishearth commented 3 years ago

My personal take on this is that typosquatting is not that big a deal and we mostly should decide on whether registering foo/bar blocks out the registration of foo-bar or not. But if we want to solve it, I'd prefer to not lose the UX of "it's just the name of the crate" in the process.

pksunkara commented 3 years ago

if foo/bar exists foo-bar/foo_bar cannot be published, but not vice versa

To be clear, this means the following:

If I have foo crate, and someone publishes foo-bar, I can still publish foo/bar. Now, there are two crates that resolve to foo_bar.

This is okay if we support the foo/bar on crates.io, docs.rs, cargo etc.. I can use package renaming in Cargo.toml if I want to use both at the same time.

The only issue is that the wording seemed quite confusing at first until I really sat down to think on this. We might want to expand the section and add examples of the behaviour.

I don't think this will be an issue but I am worried about other tools which might get affected because of two crates resolving to foo_bar.

Manishearth commented 3 years ago

It is already possible to make a crate resolve to an arbitrary library name, fwiw. It's only a problem if you are trying to use the two crates in cargo.toml. Crates.io, docs.rs, etc are unaffected. Tooling should be fine, the only issue is cargo.toml, and that's handled by renaming.

pksunkara commented 3 years ago

It is already possible to make a crate resolve to an arbitrary library name, fwiw. It's only a problem if you are trying to use the two crates in cargo.toml. Crates.io, docs.rs, etc are unaffected. Tooling should be fine, the only issue is cargo.toml, and that's handled by renaming.

I agree.

I don't think this will be an issue but I am worried about other tools which might get affected because of two crates resolving to foo_bar.

What I meant is the behaviour of other non-official tools that use crates.io.

joshtriplett commented 3 years ago

We were discussing aspects of this in the Cargo meeting today, and in general, I would expect a dependency on a/b in Cargo.toml to map to the name b in Rust, never a_b. If someone wants the latter, they can always use crate renaming.

That would reduce the ambiguity between foo-bar (or foo_bar) and foo/bar. The bigger concern would be if there's a top-level crate bar, and that's kinda the point of all this.

Manishearth commented 3 years ago

@joshtriplett idk, in some cases it makes sense for that to be the crate name, but in most cases for me i think that disambiguator is good -- looking at the lot of the foo_ "namespaced" crates out there already removing the "namespace" gives you a crate name that doesn't quite make sense. It does for serde/json, but not for regex/syntax or tokio/core or tokio/util

Also yeah this makes the typosquatting situation worse since it's no longer typosquatting it's just squatting. I think the fact that, by default, looking at a piece of rust code tells you what crates you need is good -- losing this property opens us up to confusion and potential malice when someone looks at rust code using foo/bar and concludes they need the bar crate in their own code.

ShadowJonathan commented 2 years ago

With talk in #1 pivoting towards using ::, is this a real concern still?

Manishearth commented 2 years ago

No, but most of these issues should be ignored at this stage of the RFC