libgit2 / pygit2

Python bindings for libgit2
https://www.pygit2.org/
Other
1.58k stars 382 forks source link

Don't decref PyBytesObject before formatting an error with it #1299

Closed jorio closed 3 weeks ago

jorio commented 4 weeks ago

Some C functions call Py_XDECREF on a PyBytesObject (representing a string), and then they pass a pointer within that object to Error_set_str. This causes the error message to appear corrupted if garbage collection occurs at just the right time.

For example, init_file_backend does something like this:

PyBytesObject *py_path = ...;

const char* path = PyBytes_AS_STRING(py_path);

err = git_repository_open_ext(&repository, path, flags, NULL);

Py_XDECREF(py_path);  // <--- too early!

if (err < 0) {
    Error_set_str(err, path);  // <--- path may be gone due to XDECREF above
    goto cleanup;
}

The issue is that PyBytes_AsString returns a pointer within py_path – it doesn't copy the string. So, Py_XDECREF may cause py_path to be collected at any time, possibly garbling the value of path.

When my application invokes Repository.__init__ with a path that does not exist, such as "/tmp/submoroot/submosub", then pygit2 may ultimately raise this:

repository.py:1607, in __init__
    path_backend = init_file_backend(path, int(flags))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_pygit2.GitError: Repository not found at dtmp/subdtmp/subdtmp/subdtmp/su

Note that the path in the error message (dtmp/subdtmp/subdtmp/subdtmp/su) is garbled.

This PR fixes this by calling Py_XDECREF(py_path) after the error message is built. I'm not sure how to distill this into a unit test because the GC needs to kick in at just the right time to exhibit the faulty behavior.