golang / go

The Go programming language
https://go.dev

x/pkgsite: remove non-canonical spellings of github.com/BurntSushi/toml #68357

Open arp242 opened 2 months ago

arp242 commented 2 months ago

What is the path of the package that you would like to have removed?

github.com/burntsushi/toml, github.com/burntSushi/toml, github.com/Burntsushi/toml

Are you the owner of this package?

Yes. It's not under my username, but I've been maintaining this for the last 3 or 4 years.

See e.g. commit log, or releases page: https://github.com/BurntSushi/toml/releases

What is the reason that you could not retract this package instead?

These are outdated casing variants from before Go modules enforced this, and are always "stuck" on an old version. The correct package is https://pkg.go.dev/github.com/BurntSushi/toml.

Someone reported that these outdated versions rank highly in search engine results (https://github.com/BurntSushi/toml/issues/416), which seems to be correct, at least for some queries in some search engines:

[Screenshot: cap-2024-07-09T20:53:46]

Overall, retaining these doesn't strike me as helpful.

findleyr commented 2 months ago

Due to https://go.dev/cl/576595, our exclusions are case-insensitive: for normal exclusion requests, we want all spellings to be excluded.

So I don't think package removal is the solution here; it seems like we'd need to support some notion of canonicalization, or fix the search engine ranking some other way.

findleyr commented 2 months ago

Not sure what to do here. One solution may be to add a redirect to canonical spellings of module paths, but there could theoretically be a module for which the non-canonical spelling is the most popular spelling.

Unfortunately, whatever the solution, it will probably not be simple, which means this is unlikely to get prioritized soon.
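For illustration only, the redirect idea could be sketched roughly as below. This is a minimal sketch, not pkgsite's implementation: the `canonical` table, `canonicalPath`, and `handler` are hypothetical names, and the sketch assumes the canonical spelling of each module is already known from somewhere (which is exactly the hard part raised above).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// canonical maps a lower-cased module path to its canonical spelling.
// Hypothetical table for this sketch; how such data would be derived
// is left open.
var canonical = map[string]string{
	"github.com/burntsushi/toml": "github.com/BurntSushi/toml",
	"github.com/google/uuid":     "github.com/google/uuid",
}

// canonicalPath returns the canonical spelling of path and reports
// whether path differs from it. Unknown paths are returned unchanged.
func canonicalPath(path string) (string, bool) {
	c, ok := canonical[strings.ToLower(path)]
	if !ok || c == path {
		return path, false
	}
	return c, true
}

// handler redirects non-canonical spellings and serves everything else.
func handler(w http.ResponseWriter, r *http.Request) {
	path := strings.TrimPrefix(r.URL.Path, "/")
	if c, redirect := canonicalPath(path); redirect {
		http.Redirect(w, r, "/"+c, http.StatusMovedPermanently)
		return
	}
	fmt.Fprintf(w, "docs for %s\n", path)
}

func main() {
	// Exercise the handler in-process with a non-canonical spelling.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/github.com/burntsushi/toml", nil)
	handler(rec, req)
	fmt.Println(rec.Code, rec.Header().Get("Location"))
}
```

An explicit table sidesteps the "most popular spelling" question by making the canonical form a per-module decision rather than something inferred from usage.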

seankhliao commented 2 months ago

I suppose the project could publish v0.3.2 (the highest version without a go.mod, plus one) that adds a go.mod matching the bad casing along with a deprecation message, with a separate new version for each common bad casing.

arp242 commented 2 months ago

there could theoretically be a module for which the non-canonical spelling is the most popular spelling.

Perhaps, but fixing just pkgsite isn't going to break anything for anyone, and not using canonical casing with modules is a pretty broken legacy scenario.

Redirecting to the canonical version should probably be fine. This is also what godocs.io seems to do.

I'm surprised this hasn't come up before, because casing problems like this were fairly common before modules. You could enforce the casing with a // import "example.com/Foo" comment, but many repos didn't do that.

findleyr commented 2 months ago

The simplest solution may be to add a different, case-sensitive type of exclusion rule, and exclude just the popular alternate spellings of this specific module.
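As a rough sketch of what such a rule could look like (the `exclusion` type, its fields, and the helper functions are hypothetical and not pkgsite's actual schema), a per-rule flag could switch between the existing case-insensitive matching and exact matching:

```go
package main

import (
	"fmt"
	"strings"
)

// exclusion is a hypothetical rule shape for this sketch.
type exclusion struct {
	prefix        string
	caseSensitive bool
}

// hasPathPrefix reports whether path equals prefix or lies beneath it.
func hasPathPrefix(path, prefix string) bool {
	return path == prefix || strings.HasPrefix(path, prefix+"/")
}

// matches applies the rule, folding case unless the rule is case-sensitive.
func (e exclusion) matches(path string) bool {
	if e.caseSensitive {
		return hasPathPrefix(path, e.prefix)
	}
	return hasPathPrefix(strings.ToLower(path), strings.ToLower(e.prefix))
}

// excluded reports whether any rule matches path.
func excluded(rules []exclusion, path string) bool {
	for _, r := range rules {
		if r.matches(path) {
			return true
		}
	}
	return false
}

func main() {
	// Exclude only the alternate spellings named in this issue.
	rules := []exclusion{
		{prefix: "github.com/burntsushi/toml", caseSensitive: true},
		{prefix: "github.com/burntSushi/toml", caseSensitive: true},
		{prefix: "github.com/Burntsushi/toml", caseSensitive: true},
	}
	for _, p := range []string{
		"github.com/burntsushi/toml",
		"github.com/BurntSushi/toml",
	} {
		fmt.Printf("%s excluded=%v\n", p, excluded(rules, p))
	}
}
```

With case-sensitive rules the canonical github.com/BurntSushi/toml stays visible, whereas a single case-insensitive rule for any of these spellings would exclude it too.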

CC @jba

seankhliao commented 2 months ago

What about retracting all versions prior to a go.mod being added? Retractions don't require a match of module names.

arp242 commented 2 months ago

Also: it's not a huge issue for me. Someone reported it, and I thought it would be a simple fix, but if it's considered too much effort for too little benefit I'm okay with a WONTFIX.

That said, I wouldn't be surprised if this problem also exists for other modules but that no one noticed or reported it. It's taken years for someone to report it to me.

arp242 commented 2 months ago

Another solution might be to prevent indexing with a robots meta tag and/or add a link rel="canonical" tag pointing to the canonical casing. This way the existing page still works, but it won't show up in search engines (eventually).


Completely coincidentally, I found another example just now: https://duckduckgo.com/?t=ffab&q=go+uuid&ia=web lists https://pkg.go.dev/github.com/google/UUID rather than https://pkg.go.dev/github.com/google/uuid

[Screenshot: cap-2024-07-10T04:08:49]

Google seems a bit better at this though.