PyCQA / bandit

Bandit is a tool designed to find common security issues in Python code.
https://bandit.readthedocs.io
Apache License 2.0

Non-utf8 character causes crash when scanning #882

Closed EstevamArantes closed 10 months ago

EstevamArantes commented 2 years ago

Describe the bug

Bandit fails and crashes (skipping the file) when trying to decode/parse a character that isn't valid UTF-8.

xxd dump of the file that causes the bug:

➜  xxd poc.py
00000000: 2320 5255 4e3a 2074 7275 650a 0a23 2048  # RUN: true..# H
00000010: 6572 6520 6973 2061 2073 7472 696e 6720  ere is a string
00000020: 7468 6174 2063 616e 6e6f 7420 6265 2064  that cannot be d
00000030: 6563 6f64 6564 2069 6e20 6c69 6e65 206d  ecoded in line m
00000040: 6f64 653a 20c2 2e0a                      ode: ...

Execute bandit --debug pocFile.py

[main]  DEBUG   logging initialized
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[test_set]      DEBUG   added function any_other_function_with_shell_equals_true (B604) targeting Call
[test_set]      DEBUG   added function assert_used (B101) targeting Assert
[test_set]      DEBUG   added function django_extra_used (B610) targeting Call
[test_set]      DEBUG   added function django_mark_safe (B703) targeting Call
[test_set]      DEBUG   added function django_rawsql_used (B611) targeting Call
[test_set]      DEBUG   added function exec_used (B102) targeting Call
[test_set]      DEBUG   added function flask_debug_true (B201) targeting Call
[test_set]      DEBUG   added function hardcoded_bind_all_interfaces (B104) targeting Str
[test_set]      DEBUG   added function hardcoded_password_default (B107) targeting FunctionDef
[test_set]      DEBUG   added function hardcoded_password_funcarg (B106) targeting Call
[test_set]      DEBUG   added function hardcoded_password_string (B105) targeting Str
[test_set]      DEBUG   added function hardcoded_sql_expressions (B608) targeting Str
[test_set]      DEBUG   added function hardcoded_tmp_directory (B108) targeting Str
[test_set]      DEBUG   added function hashlib_insecure_functions (B324) targeting Call
[test_set]      DEBUG   added function jinja2_autoescape_false (B701) targeting Call
[test_set]      DEBUG   added function linux_commands_wildcard_injection (B609) targeting Call
[test_set]      DEBUG   added function logging_config_insecure_listen (B612) targeting Call
[test_set]      DEBUG   added function paramiko_calls (B601) targeting Call
[test_set]      DEBUG   added function request_with_no_cert_validation (B501) targeting Call
[test_set]      DEBUG   added function request_without_timeout (B113) targeting Call
[test_set]      DEBUG   added function set_bad_file_permissions (B103) targeting Call
[test_set]      DEBUG   added function snmp_insecure_version (B508) targeting Call
[test_set]      DEBUG   added function snmp_weak_cryptography (B509) targeting Call
[test_set]      DEBUG   added function ssh_no_host_key_verification (B507) targeting Call
[test_set]      DEBUG   added function ssl_with_bad_defaults (B503) targeting FunctionDef
[test_set]      DEBUG   added function ssl_with_bad_version (B502) targeting Call
[test_set]      DEBUG   added function ssl_with_no_version (B504) targeting Call
[test_set]      DEBUG   added function start_process_with_a_shell (B605) targeting Call
[test_set]      DEBUG   added function start_process_with_no_shell (B606) targeting Call
[test_set]      DEBUG   added function start_process_with_partial_path (B607) targeting Call
[test_set]      DEBUG   added function subprocess_popen_with_shell_equals_true (B602) targeting Call
[test_set]      DEBUG   added function subprocess_without_shell_equals_true (B603) targeting Call
[test_set]      DEBUG   added function try_except_continue (B112) targeting ExceptHandler
[test_set]      DEBUG   added function try_except_pass (B110) targeting ExceptHandler
[test_set]      DEBUG   added function use_of_mako_templates (B702) targeting Call
[test_set]      DEBUG   added function weak_cryptographic_key (B505) targeting Call
[test_set]      DEBUG   added function yaml_load (B506) targeting Call
[test_set]      DEBUG   added function blacklist (B001) targeting Call
[test_set]      DEBUG   added function blacklist (B001) targeting Import
[test_set]      DEBUG   added function blacklist (B001) targeting ImportFrom
[main]  INFO    running on Python 3.10.2
[manager]       DEBUG   working on file : poc.py
[manager]       ERROR   Exception occurred when executing tests against poc.py. Run "bandit --debug poc.py" to see the full traceback.
[manager]       DEBUG     Exception string: 'utf-8' codec can't decode byte 0xc2 in position 56: invalid continuation byte
[manager]       DEBUG     Exception traceback: Traceback (most recent call last):
[main]  DEBUG   Length: 0

[main]  DEBUG   <bandit.core.metrics.Metrics object at 0x7f40a1ff26e0>
Run started:2022-04-12 17:55:26.123213

Test results:
        No issues identified.

Code scanned:
        Total lines of code: 0
        Total lines skipped (#nosec): 0

Run metrics:
        Total issues (by severity):
                Undefined: 0
                Low: 0
                Medium: 0
                High: 0
        Total issues (by confidence):
                Undefined: 0
                Low: 0
                Medium: 0
                High: 0
Files skipped (1):
        poc.py (exception while scanning file)

Reproduction steps

1. Copy the xxd dump of the file below and use xxd -r to convert it back into a .py file

poc.txt

00000000: 2320 5255 4e3a 2074 7275 650a 0a23 2048  # RUN: true..# H
00000010: 6572 6520 6973 2061 2073 7472 696e 6720  ere is a string
00000020: 7468 6174 2063 616e 6e6f 7420 6265 2064  that cannot be d
00000030: 6563 6f64 6564 2069 6e20 6c69 6e65 206d  ecoded in line m
00000040: 6f64 653a 20c2 2e0a                      ode: ...
2. xxd -r poc.txt > pocFile.py
3. Execute bandit --debug pocFile.py
4. Crash ...
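
If xxd is not available, the same file can be produced from Python. This is only a sketch: the byte string is transcribed from the dump above, and the names POC_BYTES and write_poc are made up for illustration.

```python
from pathlib import Path

# The 72 bytes from the xxd dump above. The byte 0xc2 followed by "."
# (0x2e) is not a valid UTF-8 sequence.
POC_BYTES = (
    b"# RUN: true\n"
    b"\n"
    b"# Here is a string that cannot be decoded in line mode: \xc2.\n"
)


def write_poc(path):
    """Write the proof-of-concept bytes to `path` and return it as a Path.

    Hypothetical helper, standing in for `xxd -r poc.txt > pocFile.py`.
    """
    path = Path(path)
    path.write_bytes(POC_BYTES)
    return path
```

Calling write_poc("pocFile.py") and then running bandit --debug pocFile.py should be equivalent to the xxd-based steps.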

Expected behavior

Bandit executes as usual and doesn't crash.

Bandit version

1.7.4 (Default)

Python version

3.10 (Default)

Additional context

Bandit 1.7.5, just cloned from main today.

ericwb commented 2 years ago

So you will get the same result if you run: python pocFile.py

However, if a Python file contains UTF-8 characters, then the encoding must be specified in the header: # -*- coding: utf-8 -*-

That will fix the case using python, but unfortunately Bandit still fails.

mportesdev commented 2 years ago

@ericwb As of Python 3, utf-8 is the default encoding of source code, and doesn't have to be declared even if the source code contains non-ascii characters. However, the example above involves a character that is not utf-8 encoded.

@EstevamArantes What you have there is the Â character encoded in latin_1 (aka iso-8859-1). This encoding must be declared at the beginning of the file.

https://docs.python.org/3/reference/lexical_analysis.html#encoding-declarations
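
As a rough illustration of the point (this is not Bandit's actual code path), the sketch below uses the standard library's tokenize.detect_encoding to read the declaration and then decodes the source with whatever it finds. The names LINE, UNDECLARED, DECLARED, utf8_error_offset and decode_source are invented for the example. It also shows that 0xc2 sits at offset 56 within its own line, which matches the position in the exception string from the debug log, and at offset 69 within the whole file.

```python
import io
import tokenize

LINE = b"# Here is a string that cannot be decoded in line mode: \xc2.\n"
UNDECLARED = b"# RUN: true\n\n" + LINE
DECLARED = b"# -*- coding: latin-1 -*-\n" + UNDECLARED


def utf8_error_offset(data):
    """Return the offset of the first invalid UTF-8 byte, or None if valid."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def decode_source(data):
    """Decode source bytes with the declared encoding (UTF-8 if none is declared)."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
    return encoding, data.decode(encoding)


print(utf8_error_offset(LINE))        # offset within the third line: 56
print(utf8_error_offset(UNDECLARED))  # offset within the whole file: 69

# With the declaration, the source decodes and 0xc2 becomes a single character.
encoding, text = decode_source(DECLARED)
print(encoding, repr(text[-3:]))

# Without it, UTF-8 is assumed and decoding raises UnicodeDecodeError.
try:
    decode_source(UNDECLARED)
except UnicodeDecodeError as exc:
    print("undeclared:", exc.reason)
```

Note that detect_encoding only inspects the first two lines, so the declaration has to appear there, as the linked reference describes.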

That said, I think this is not a bandit issue and can be closed.

ericwb commented 10 months ago

Agree with @mportesdev here. Encoding should be declared in header if not utf-8.