Python: False positive caused by impossible `isinstance` check

DefinetlyNotAI commented 1 month ago

Description of the false positive

Sometimes when a variable either stores a tuple containing a password, or a error message in a string with a if statement to check what to do, CodeQL then ignores the variable type and its checks and goes in a impossible route... where somehow the password that can only be stored as a tuple, is then written to a log as a string... thus a false positive

In the following example, the path shows how temp is returned, but the if statements show that it only occurs when its a string, and when temp is a string it contains a error message not a password, as the password is saved in temp as a tuple only if there was no errors, my code is spaghetti so i do apologize


Clear-text storage of sensitive information
Step 1 ControlFlowNode for Subscript
Source
DataBase.py:756

        api = config["api"]
        username = config["username"]
        password = config["password"]
        exclusion_titles = config["exclusion_titles"]
        return api, username, password, exclusion_titles
    except Exception as e:
Step 2 ControlFlowNode for password
[DataBase.py:756](https://github.com/DefinetlyNotAI/Test-generator/blob/45ff5ccc3f4477d4733dc1e27969ca0974abeaca/DataBase.py#L756-L756)

        api = config["api"]
        username = config["username"]
        password = config["password"]
        exclusion_titles = config["exclusion_titles"]
        return api, username, password, exclusion_titles
    except Exception as e:
Step 3 ControlFlowNode for Tuple
DataBase.py:758
        username = config["username"]
        password = config["password"]
        exclusion_titles = config["exclusion_titles"]
        return api, username, password, exclusion_titles
    except Exception as e:
        return f"ERROR {e} && 520"

Step 4 ControlFlowNode for read_api()
DataBase.py:864

        def init():
            # Initialize the UserManager and API values
            temp = read_api()
            if isinstance(temp, str):
                if check_for_LIST(temp):
                    return temp
Step 5 ControlFlowNode for temp
[DataBase.py:864](https://github.com/DefinetlyNotAI/Test-generator/blob/45ff5ccc3f4477d4733dc1e27969ca0974abeaca/DataBase.py#L864-L864)

        def init():
            # Initialize the UserManager and API values
            temp = read_api()
            if isinstance(temp, str):
                if check_for_LIST(temp):
                    return temp
Step 6 ControlFlowNode for temp
DataBase.py:867
            temp = read_api()
            if isinstance(temp, str):
                if check_for_LIST(temp):
                    return temp
            else:
                api, username, password, exclusion_titles = temp

Step 7 ControlFlowNode for init()
[DataBase.py:908](https://github.com/DefinetlyNotAI/Test-generator/blob/45ff5ccc3f4477d4733dc1e27969ca0974abeaca/DataBase.py#L908-L908)
            return DATA

        # Main startup
        return init()

    except Exception as e:
        return f"ERROR {e} && 520"
Step 8 ControlFlowNode for database_thread()
[flask_server.py:195](https://github.com/DefinetlyNotAI/Test-generator/blob/45ff5ccc3f4477d4733dc1e27969ca0974abeaca/flask_server.py#L195-L195)
            and os.path.exists("Test.csv")
        ):
            # Return an HTML success message
            message = database_thread()
            if message.startswith("ERROR"):
                # Removing 'ERROR' from the beginning of the message
                message = message.replace("ERROR", "", 1)
Step 9 ControlFlowNode for message
[flask_server.py:195](https://github.com/DefinetlyNotAI/Test-generator/blob/45ff5ccc3f4477d4733dc1e27969ca0974abeaca/flask_server.py#L195-L195)
            and os.path.exists("Test.csv")
        ):
            # Return an HTML success message
            message = database_thread()
            if message.startswith("ERROR"):
                # Removing 'ERROR' from the beginning of the message
                message = message.replace("ERROR", "", 1)
Step 10 ControlFlowNode for message
[flask_server.py:198](https://github.com/DefinetlyNotAI/Test-generator/blob/45ff5ccc3f4477d4733dc1e27969ca0974abeaca/flask_server.py#L198-L198)
            message = database_thread()
            if message.startswith("ERROR"):
                # Removing 'ERROR' from the beginning of the message
                message = message.replace("ERROR", "", 1)

                parts = message.split(" && ")

Step 11 ControlFlowNode for Fstring
[flask_server.py:209](https://github.com/DefinetlyNotAI/Test-generator/blob/45ff5ccc3f4477d4733dc1e27969ca0974abeaca/flask_server.py#L209-L209)
                    error_number = int(parts[1]) if parts[1].isdigit() else None
                else:
                    logger.error(
                        f"Invalid message format: {message} with {len(parts)} parts."
                    )
                    return "The message does not match the expected format.", 400

Step 12 ControlFlowNode for message
[flask_server.py:41](https://github.com/DefinetlyNotAI/Test-generator/blob/45ff5ccc3f4477d4733dc1e27969ca0974abeaca/flask_server.py#L41-L41)

        return time

    def error(self, message):
        """
        Writes an error message to the log file.
Step 13 ControlFlowNode for Fstring
Sink
[flask_server.py:52](https://github.com/DefinetlyNotAI/Test-generator/blob/45ff5ccc3f4477d4733dc1e27969ca0974abeaca/flask_server.py#L52-L52)
            None
        """
        with open(self.filename, "a") as f:
            f.write(f"ERROR: {message} at {self.timestamp()}\n")
This expression stores  as clear text.

    def info(self, message):
        """

Code samples or links to source code

As shown above

URL to the alert on GitHub code scanning (optional)

https://github.com/DefinetlyNotAI/Test-generator/security/code-scanning/42

Many more, but you get the idea

smowton commented 1 month ago

Thanks for the report! False positives like these aren't something we're able to prioritise at the moment, but the team have assessed it and we will keep it in mind for future work.

DefinetlyNotAI commented 1 month ago

I dont know if my idea has been reached or not, but for clarification, if the function plays with 2 seperate variable types with a single variable, it would take the worst case scenario (which is usually impossible) and then flag it.

github / codeql

Python: False positive caused by impossible `isinstance` check #16912