shadow-maint / shadow

Upstream shadow tree

should protect against UTF-8 homographs #1138

Open Zugschlus opened 21 hours ago

Zugschlus commented 21 hours ago

Hi,

it looks like most Linux distributions today "blindly" allow creating user names that contain non-ASCII Unicode code points. In this case, useradd writes the name to /etc/passwd as correctly encoded UTF-8.

I am not sure this is a good idea without at least some checks for Unicode normalization. For example, useradd will happily create two users, étienne and étienne: one spelled with the precomposed é (U+00E9), the other with a plain e followed by a combining acute accent (U+0301). In my opinion, the second account should be rejected.
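To make the étienne case concrete, here is a minimal Python sketch. It is not shadow code, and the helper name `same_after_nfc` is made up for illustration. It shows that the two spellings are different strings and different byte sequences, yet compare equal once both are normalized to NFC as described in TR 15:

```python
import unicodedata

# Two spellings that render identically:
# precomposed U+00E9 versus "e" followed by U+0301 (combining acute accent).
composed = "\u00e9tienne"
decomposed = "e\u0301tienne"


def same_after_nfc(a: str, b: str) -> bool:
    """Return True if both names are equal after NFC normalization.

    Hypothetical helper for illustration; not part of shadow.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)                # False: different code points
print(composed.encode("utf-8"))              # UTF-8 bytes of the precomposed form
print(decomposed.encode("utf-8"))            # UTF-8 bytes of the decomposed form
print(same_after_nfc(composed, decomposed))  # True: canonically equivalent
```

A tool that wanted to reject the second account could compare the NFC form of the new name against the NFC forms of all existing names, or simply refuse any name that is not already in NFC.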

Recommended reading on how to handle this: the PRECIS RFCs 8264 and 8265 (8266 is less important here) and Unicode TR 15 (https://unicode.org/reports/tr15/), which covers the normalization forms of Unicode strings.

I fully understand if you don't want to open this can of worms (I had the same issue with Debian's adduser), but in that case please consider restricting the character set to disallow this, or at least documenting the matter.
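As a rough idea of what "restricting the character set" could look like, here is a Python sketch of a conservative, ASCII-only check. The pattern and the 32-character limit are assumptions modeled on the rule some useradd(8) man pages describe; the function name is hypothetical, and this is not how shadow implements its check:

```python
import re

# Assumption: conservative ASCII-only pattern (lower-case letter or underscore
# first, then lower-case letters, digits, underscores or dashes, with an
# optional trailing dollar sign).
_NAME_RE = re.compile(r"[a-z_][a-z0-9_-]*\$?")

# Assumption: maximum name length used for this sketch.
MAX_LEN = 32


def is_conservative_name(name: str) -> bool:
    """Return True only for names made of the restricted ASCII set."""
    return len(name) <= MAX_LEN and _NAME_RE.fullmatch(name) is not None


for candidate in ["marc", "m\u00e4rc", "\u00c9tienne", "m\u03a9rc"]:
    print(candidate, is_conservative_name(candidate))
```

With a rule like this, normalization never comes into play, because no name containing a non-ASCII code point is accepted in the first place.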

Greetings, Marc

rbalint commented 20 hours ago

Hi,

Debian-devel thread starts here: https://lists.debian.org/debian-devel/2024/11/msg00250.html

Debian carries a patch that relaxes checks: https://lists.debian.org/debian-devel/2024/12/msg00045.html

I probably should have dropped the patch while I was maintaining shadow in Debian, but I could not find the time to follow up on all the potential breakages before doing so. :-(

Thanks @Zugschlus for raising the topic and working on that.

I think the best way out would be dropping the carried useradd patch (https://salsa.debian.org/debian/shadow/-/commit/08e5e0a148b548a3eb2f5ba7acfd6ab406533268) in Debian and making the required changes in adduser, too.

Zugschlus commented 22 minutes ago

Before filing this upstream issue, I tried creating märc, Étienne and mΩrc on CentOS Stream 9, AlmaLinux 9 and Fedora 41 (VMs spun up on Hetzner Cloud) using their adduser variant (which needed --badnames), and the creation went through just as it does on Debian. Hence I filed this upstream, since this is an issue that other (all?) Linux distributions suffer from.

Their adduser man page is a symlink to the useradd one; does the same apply to the binary?