Open Nurmagoz opened 2 years ago
The original attack research: https://trojansource.codes/
https://semgrep.dev/r?q=generic.unicode.security.bidi.contains-bidirectional-characters
$ semgrep --config="r/generic.unicode.security.bidi.contains-bidirectional-characters"
Fetching rules from https://semgrep.dev/registry.
Scanning 1075 files.
100%|██████████████████████████████████████████████████████████████████████████████████████████|1075/1075 tasks
Some files were skipped or only partially analyzed.
Scan was limited to files tracked by git.
Scan skipped: 2 files larger than 1.0 MB, 50 files matching .semgrepignore patterns
For a full list of skipped files, run semgrep with the --verbose flag.
(need more rules? `semgrep login` for additional free Semgrep Registry rules)
Ran 1 rule on 1075 files: 0 findings.
Also if you look into trojan source most editors and even github web ui has been updated to mitigate against this
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
https://github.com/nickboucher/trojan-source/blob/main/Rust/commenting-out.rs
we could run CI checking this but we have much bigger threats IMO.
e.g. AFAIK we have quite a lot of people with permissions to at-least merge their own PRs, create branches on actual nixos/nixpkgs, and I don't think we have branch protection on master considering there's this issue #118661 (most of the recent pings are false positives)
It'd be interesting if someone could replicate a trojan source style attack with nix since the language might just die when using unescaped bidirectional characters, or even using them properly :shrug:
actually semgrep doesn't support nix lang yet but https://github.com/haveyoudebuggedit/trojansourcedetector is more general
.trojansourcedetector.json
{
"exclude": [
".git/**"
],
"detect_unicode": false,
"detect_bidi": true
}
{"name":"bidirectional control character","file":"pkgs/applications/emulators/yuzu/yuzu-free-icons.patch","line":482,"column":113,"details":"Arabic Letter Mark control character"}
{"name":"bidirectional control character","file":"pkgs/development/libraries/glibc/2.34-master.patch.gz","line":25,"column":351,"details":"Arabic Letter Mark control character"}
{"name":"bidirectional control character","file":"pkgs/development/libraries/glibc/2.34-master.patch.gz","line":176,"column":381,"details":"Arabic Letter Mark control character"}
{"name":"bidirectional control character","file":"pkgs/development/libraries/glibc/2.34-master.patch.gz","line":549,"column":14,"details":"Arabic Letter Mark control character"}
{"name":"bidirectional control character","file":"pkgs/development/libraries/glibc/2.34-master.patch.gz","line":771,"column":199,"details":"Arabic Letter Mark control character"}
{"name":"bidirectional control character","file":"pkgs/development/libraries/glibc/2.34-master.patch.gz","line":855,"column":1337,"details":"Arabic Letter Mark control character"}
the yuzu patch has a bunch of png data so that's likely fine. glibc patch.gz is probably also fine
we could run CI checking this but we have much bigger threats IMO.
Yes like people throwing their secrets world readable into the nix store or source code downloads which are only hosted on some peoples server without version control, hashes or https where an attack could just release a new version with lots of code changes and a backdoor. Also we would need to maintain an allow list because we have at least 6 false positives already.
It would also lack a big part of our code because we fetch most patches from remote with TOFU hashes.
It'd be interesting if someone could replicate a trojan source style attack with nix since the language might just die when using unescaped bidirectional characters, or even using them properly 🤷
https://github.com/NixOS/nix/issues/5511
I am inclined to just close it as dupe of the linked issue.
Running find_unicode_control2.py from RHSB-2021-007 on a checkout of 4a1f5a1a7da8bcf205f4f691408f8cc0bd906fee found a couple of control characters that seem harmless.
./pkgs/build-support/dotnet/dotnetenv/Wrapper/Wrapper/Properties/AssemblyInfo.cs: unicode control characters: {'\ufeff'}
./pkgs/build-support/dotnet/dotnetenv/Wrapper/Wrapper/Wrapper.csproj.in: unicode control characters: {'\ufeff'}
./pkgs/build-support/dotnet/dotnetenv/Wrapper/Wrapper.sln: unicode control characters: {'\ufeff'}
./pkgs/tools/graphics/diagrams-builder/default.nix: unicode control characters: {'\xad'}
./pkgs/development/libraries/libcaca/default.nix: unicode control characters: {'\u200b'}
./pkgs/desktops/gnome/extensions/extensions.json: unicode control characters: {'\u200c'}
I think it would be hard to exploit this in nixpkgs since PRs and diffs are reviewed on GitHub, that warns the reviewer: https://github.blog/changelog/2021-10-31-warning-about-bidirectional-unicode-text/
Does NixOS scan their code/patches for this vulnerability?
Details:
Source: https://tech.michaelaltfield.net/2021/11/22/bidi-unicode-github-defense/
Archive: https://web.archive.org/web/20220101033034/https://tech.michaelaltfield.net/2021/11/22/bidi-unicode-github-defense/
Implemented by e.g:
https://access.redhat.com/security/vulnerabilities/RHSB-2021-007
https://www.kicksecure.com/wiki/Unicode
https://forums.whonix.org/t/detecting-malicious-unicode-in-source-code-and-pull-requests/13754
ThX!