Open ThisIsMissEm opened 1 year ago
I'm not sure I understand the issue enough to know what to do about it. Could you please elaborate a little?
Is the risk that someone might think they're blocking a domain, but aren't? Or maybe block something else that looks similar but isn't the same?
And what behaviour should be expected? Should we add punycode normalisation so, no matter what gets imported, fediblockhole always operates on punycode normalised domains for its comparisons and upserts into instances?
Sorry to be dense. Just want to make sure I appreciate the issue properly.
(Reporting this publicly is fine. I'll have another look at setting up GitHub's security thing.)
Yeah, I think normalisation using punycode would probably be a good idea; that way you're always comparing correctly. The risk is mostly in potential mismatches between the blocklist and the instance, so yeah, someone thinks they're blocking a bad instance but they're actually not.
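As a rough illustration of what "normalise before comparing" could look like, here is a minimal sketch using Python's built-in `idna` codec (which implements IDNA 2003). The helper name `normalise_domain` is hypothetical and not part of fediblockhole; the lowercasing and trailing-dot handling are assumptions about what a sensible normal form would be.

```python
def normalise_domain(domain: str) -> str:
    """Return a lowercase, punycode (ASCII) form of a domain name.

    Hypothetical helper, not existing fediblockhole code.
    Raises UnicodeError if a label cannot be encoded.
    """
    # The idna codec returns pure-ASCII names unchanged, so lowercase explicitly.
    cleaned = domain.strip().rstrip(".").lower()
    return cleaned.encode("idna").decode("ascii")


# "\u043e" is CYRILLIC SMALL LETTER O, a lookalike for the Latin "o".
imported = ["mast\u043edon.social", "Example.COM"]
normalised = {normalise_domain(d) for d in imported}

# Comparisons against the instance's (already punycode) blocks now line up.
already_blocked = "xn--mastdon-djg.social" in normalised
```

Applying the same function to both the imported lists and the domains fetched from the instance means every comparison happens on one representation.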
If I understand this issue correctly, the risk is:
The remedy would be to normalise with punycode somehow. That will make it easier to detect the attempt at misleading people.
Where should this normalisation occur?
Options include:
I invite comment on which approach we should take, and encourage example implementations and PRs.
I'd be inclined to inspect the block list and, if any domain in it doesn't match its own punycode encoding, fail the import, i.e., force all domains in blocklists to be punycode encoded.
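A sketch of that stricter "fail the import" option, again assuming Python's built-in `idna` codec. The function names `find_unnormalised_domains` and `check_blocklist` are hypothetical, and treating an unencodable domain as a failure is an assumption rather than something decided in this thread.

```python
def find_unnormalised_domains(domains):
    """Return the entries that differ from their own punycode encoding."""
    offending = []
    for domain in domains:
        try:
            encoded = domain.encode("idna").decode("ascii")
        except UnicodeError:
            # Assumption: a domain that cannot be encoded at all is also rejected.
            offending.append(domain)
            continue
        if encoded != domain:
            offending.append(domain)
    return offending


def check_blocklist(domains):
    """Raise ValueError if any entry is not already punycode encoded."""
    offending = find_unnormalised_domains(domains)
    if offending:
        raise ValueError(f"Blocklist contains non-punycode domains: {offending!r}")


# "\u043e" is CYRILLIC SMALL LETTER O.
sample = ["example.com", "mast\u043edon.social", "xn--mastdon-djg.social"]
rejected = find_unnormalised_domains(sample)
```

Note that this check only catches non-ASCII entries: the codec leaves pure-ASCII names untouched, so differences in letter case would need a separate rule.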
For example, mastоdon.social isn't mastodon.social (the official instance): the first domain uses a lookalike character for the first "o" in mastodon.social, so in punycode it would be xn--mastdon-djg.social, which is clearly different.
When Mastodon returns domain blocks from the API, they are normalised to punycode, so the API, despite accepting lookalike characters, will result in them appearing as punycode in the response.
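The lookalike example can be checked with a few lines of Python. This sketch assumes the substituted character is U+043E (Cyrillic small letter o), which is consistent with the punycode form quoted above; the variable names are purely illustrative.

```python
import unicodedata

lookalike = "mast\u043edon.social"   # fifth character is U+043E
official = "mastodon.social"         # fifth character is the Latin "o"

# The two strings render almost identically but are different strings.
same = lookalike == official
print(same)  # False

# Naming the fifth character of each makes the substitution visible.
print(unicodedata.name(lookalike[4]))  # CYRILLIC SMALL LETTER O
print(unicodedata.name(official[4]))   # LATIN SMALL LETTER O

# Encoding with the built-in idna codec exposes the difference in ASCII.
encoded = lookalike.encode("idna").decode("ascii")
print(encoded)  # xn--mastdon-djg.social
```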
I had a look through the code, and from what I can tell there is no code for handling domain punycode normalisation, which may cause unexpected results with this tool if a source blocklist does not do punycode normalisation.
Note: As this project has neither a SECURITY.md file, nor the GitHub Security features enabled, I was not able to disclose this potential issue in a more responsible disclosure manner, without seeking out contributor email addresses (typically a privacy violation).