racket / racket-pkg-website

A frontend for the Racket Package Catalog.

Re-enable author display using hashed emails #88

Closed LiberalArtist closed 1 year ago

LiberalArtist commented 1 year ago

This patch restores package author information to the web UI without exposing email addresses to spammers. Email addresses are obfuscated by combining the local-part of the email address (i.e. the part to the left of the @, per RFC 5322 § 3.4.1) with the first seven hexadecimal digits of the SHA-256 hash of the full address. Thus, for example, the email address:

philip@philipmcgrath.com

produces the obfuscated display name:

philipλ9411372
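The scheme described above can be sketched in a few lines. This is only an illustration in Python, not the Racket code from the patch: the function name `obfuscated_display_name` is hypothetical, and it assumes the address is hashed as UTF-8 bytes exactly as written, with no case folding or other normalization. If the real implementation normalizes differently, the digits will differ.

```python
import hashlib


def obfuscated_display_name(email: str, digits: int = 7) -> str:
    """Hypothetical helper: local-part + 'λ' + leading hex digits of SHA-256.

    Assumption: the full address is hashed as UTF-8 bytes, unnormalized.
    """
    # Split on the last '@', since a quoted local-part may itself contain '@'.
    local_part, _, _domain = email.rpartition("@")
    digest = hashlib.sha256(email.encode("utf-8")).hexdigest()
    return f"{local_part}λ{digest[:digits]}"


name = obfuscated_display_name("philip@philipmcgrath.com")
print(name)  # local-part, then 'λ', then seven hexadecimal digits
```

Since `λ` cannot appear where the `@` would in an email address, a string of this shape cannot be mistaken for a real address.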

Compared to alternative approaches, this design seems to have some nice properties:

  1. We can continue to use email addresses as the “primary key” to identify users, rather than having to create and maintain a registry for usernames in a global namespace.

  2. In contrast to my comment on https://github.com/racket/racket-pkg-website/issues/77#issuecomment-1429252878 proposing an optional user-specified display name (potentially with a disambiguator for collisions), this patch uses the identifier all users already have chosen. All users get meaningful, yet obfuscated, display names by default, with no action needed. We have no new data fields to store or validate.

  3. There is no ambiguity between email addresses and obfuscated display names.

  4. Anyone who knows an email address can compute the corresponding obfuscated display name and thus can search for packages associated with it. In some contexts one would want to use a UUID or a salted hash to avoid making that information discoverable. Here, though, our goal is to protect package authors from being spammed, not to conceal the authorship of Racket packages.
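The trade-off in point 4 can be made concrete with a small Python sketch. Both function names are hypothetical, and HMAC-SHA-256 with a server-side secret stands in here for the "salted hash" alternative; neither function comes from the patch itself.

```python
import hashlib
import hmac


def unsalted_tag(email: str) -> str:
    # Anyone who knows the address can recompute this tag on their own.
    return hashlib.sha256(email.encode("utf-8")).hexdigest()[:7]


def keyed_tag(email: str, secret: bytes) -> str:
    # Only a holder of the (assumed) server-side secret can recompute this tag.
    return hmac.new(secret, email.encode("utf-8"), hashlib.sha256).hexdigest()[:7]


email = "philip@philipmcgrath.com"
print(unsalted_tag(email) == unsalted_tag(email))  # True: publicly reproducible
print(keyed_tag(email, b"example-secret"))         # depends on the secret
```

With the unsalted form, searching for an author's packages needs nothing but the address, which is the property the patch relies on.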

I took some inspiration from “Discord tags”/“discriminators” and from identicons.

A natural extension would be to add a simple boolean preference allowing users to choose whether to obfuscate their email addresses or not. I’ve designed with that possibility in mind, but I also hope this patch would be useful even without the further enhancement.

Related to https://github.com/racket/racket-pkg-website/issues/77
Related to https://github.com/racket/racket-pkg-website/pull/86
Related to 31aa7b0236a0597dc14cabd2b399a42723e93401
Related to https://github.com/racket/racket-pkg-website/issues/87

Screenshot of the package index page showing `jesseλ31f2f5b`, `philipλ9411372`, and `lexi.lambdaλ81999a1`

Screenshot of the `basedir` package page showing `williamλ74e41ee` and `willghatchλ553d284`

cloudrac3r commented 1 year ago

This looks like a good change! The lambda separator character is cute.

> A natural extension would be to add a simple boolean preference allowing users to choose whether to obfuscate their email addresses or not.

I think it's better not to bring the email addresses back. I published my socks5 package under an account with a brand new email address, and that address has already started receiving spam. It's quite odd that spambots picked it up when it was only displayed on the packages site. Other email addresses that I've posted on the internet haven't met this fate.

A good long term solution would be to only display email addresses to other logged in users, like how GitHub does it.

It's also worth noting that people can be reached through the @author field in the Scribble documentation of their package.

LiberalArtist commented 1 year ago

I've added some notes on the search portion below.

I've figured out how to put the pieces together for search, and it seems to work well!

The display name implementation has moved to https://github.com/racket/infrastructure-userdb/pull/1, so that will need to be merged before this. Ideally https://github.com/racket/pkg-index/pull/47, which adds the tags during indexing, would also land before or concurrently with this PR.

LiberalArtist commented 1 year ago

I've pushed a slight tweak moving display-name->xexpr back to this repo, as I explained in https://github.com/racket/infrastructure-userdb/pull/1#discussion_r1240899257.

jryans commented 1 year ago

I have verified this is working well locally. I'll begin deploying this momentarily.

jryans commented 1 year ago

Deployment complete, looks like everything's working well! 🎉

Here's an example package that's already updated: https://pkgs.racket-lang.org/package/linear-regression

All of the packages will gradually update over the next few hours or so.

Thanks again @LiberalArtist for your work on this! 😄