NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License
18.06k stars 14.1k forks source link

Detecting Malicious Unicode in Source Code and Pull Requests #182890

Open Nurmagoz opened 2 years ago

Nurmagoz commented 2 years ago

Does NixOS scan their code/patches for this vulnerability?

Details:

Source: https://tech.michaelaltfield.net/2021/11/22/bidi-unicode-github-defense/

Archive: https://web.archive.org/web/20220101033034/https://tech.michaelaltfield.net/2021/11/22/bidi-unicode-github-defense/

Implemented by e.g:

https://access.redhat.com/security/vulnerabilities/RHSB-2021-007

https://www.kicksecure.com/wiki/Unicode

https://forums.whonix.org/t/detecting-malicious-unicode-in-source-code-and-pull-requests/13754

ThX!

adrelanos commented 2 years ago

The original attack research: https://trojansource.codes/

06kellyjac commented 2 years ago

https://semgrep.dev/r?q=generic.unicode.security.bidi.contains-bidirectional-characters

$ semgrep --config="r/generic.unicode.security.bidi.contains-bidirectional-characters"
Fetching rules from https://semgrep.dev/registry.
Scanning 1075 files.
  100%|██████████████████████████████████████████████████████████████████████████████████████████|1075/1075 tasks

Some files were skipped or only partially analyzed.
  Scan was limited to files tracked by git.
  Scan skipped: 2 files larger than 1.0 MB, 50 files matching .semgrepignore patterns
  For a full list of skipped files, run semgrep with the --verbose flag.

(need more rules? `semgrep login` for additional free Semgrep Registry rules)

Ran 1 rule on 1075 files: 0 findings.

Also if you look into trojan source most editors and even github web ui has been updated to mitigate against this

This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters

https://github.com/nickboucher/trojan-source/blob/main/Rust/commenting-out.rs

we could run CI checking this but we have much bigger threats IMO.

e.g. AFAIK we have quite a lot of people with permissions to at-least merge their own PRs, create branches on actual nixos/nixpkgs, and I don't think we have branch protection on master considering there's this issue #118661 (most of the recent pings are false positives)

It'd be interesting if someone could replicate a trojan source style attack with nix since the language might just die when using unescaped bidirectional characters, or even using them properly :shrug:

06kellyjac commented 2 years ago

actually semgrep doesn't support nix lang yet but https://github.com/haveyoudebuggedit/trojansourcedetector is more general

.trojansourcedetector.json

{
  "exclude": [
    ".git/**"
  ],
  "detect_unicode": false,
  "detect_bidi": true
}
{"name":"bidirectional control character","file":"pkgs/applications/emulators/yuzu/yuzu-free-icons.patch","line":482,"column":113,"details":"Arabic Letter Mark control character"}
{"name":"bidirectional control character","file":"pkgs/development/libraries/glibc/2.34-master.patch.gz","line":25,"column":351,"details":"Arabic Letter Mark control character"}
{"name":"bidirectional control character","file":"pkgs/development/libraries/glibc/2.34-master.patch.gz","line":176,"column":381,"details":"Arabic Letter Mark control character"}
{"name":"bidirectional control character","file":"pkgs/development/libraries/glibc/2.34-master.patch.gz","line":549,"column":14,"details":"Arabic Letter Mark control character"}
{"name":"bidirectional control character","file":"pkgs/development/libraries/glibc/2.34-master.patch.gz","line":771,"column":199,"details":"Arabic Letter Mark control character"}
{"name":"bidirectional control character","file":"pkgs/development/libraries/glibc/2.34-master.patch.gz","line":855,"column":1337,"details":"Arabic Letter Mark control character"}

the yuzu patch has a bunch of png data so that's likely fine. glibc patch.gz is probably also fine

SuperSandro2000 commented 2 years ago

we could run CI checking this but we have much bigger threats IMO.

Yes like people throwing their secrets world readable into the nix store or source code downloads which are only hosted on some peoples server without version control, hashes or https where an attack could just release a new version with lots of code changes and a backdoor. Also we would need to maintain an allow list because we have at least 6 false positives already.

It would also lack a big part of our code because we fetch most patches from remote with TOFU hashes.

It'd be interesting if someone could replicate a trojan source style attack with nix since the language might just die when using unescaped bidirectional characters, or even using them properly 🤷

https://github.com/NixOS/nix/issues/5511

I am inclined to just close it as dupe of the linked issue.

stigtsp commented 2 years ago

Running find_unicode_control2.py from RHSB-2021-007 on a checkout of 4a1f5a1a7da8bcf205f4f691408f8cc0bd906fee found a couple of control characters that seem harmless.

./pkgs/build-support/dotnet/dotnetenv/Wrapper/Wrapper/Properties/AssemblyInfo.cs: unicode control characters: {'\ufeff'}
./pkgs/build-support/dotnet/dotnetenv/Wrapper/Wrapper/Wrapper.csproj.in: unicode control characters: {'\ufeff'}
./pkgs/build-support/dotnet/dotnetenv/Wrapper/Wrapper.sln: unicode control characters: {'\ufeff'}
./pkgs/tools/graphics/diagrams-builder/default.nix: unicode control characters: {'\xad'}
./pkgs/development/libraries/libcaca/default.nix: unicode control characters: {'\u200b'}
./pkgs/desktops/gnome/extensions/extensions.json: unicode control characters: {'\u200c'}

I think it would be hard to exploit this in nixpkgs since PRs and diffs are reviewed on GitHub, that warns the reviewer: https://github.blog/changelog/2021-10-31-warning-about-bidirectional-unicode-text/