indygreg / PyOxidizer

A modern Python application packaging and distribution tool
Mozilla Public License 2.0

thread 'main' panicked at 'already borrowed: BorrowMutError' #673

Open kkrampa opened 1 year ago

kkrampa commented 1 year ago

I tried to build a statically linked binary. The build works fine, but the resulting binary cannot be executed. I did a simple test by adding this line to your existing workflow. It fails with:

thread 'main' panicked at 'already borrowed: BorrowMutError', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/gil.rs:433:52

You can see the full build here: https://github.com/kkrampa/PyOxidizer/actions/runs/3925221913/jobs/6710342865

indygreg commented 1 year ago

I cannot reproduce this locally. But it reproduces in CI, which is concerning.

In my Ubuntu 22.10 environment I do get an error due to a missing symbol used by jemalloc:

  = note: ld.lld: error: undefined symbol: pthread_getname_np
          >>> referenced by prof_sys.c:308 (src/prof_sys.c:308)
          >>>               prof_sys.pic.o:(prof_sys_thread_name_read_impl) in archive /home/gps/src/PyOxidizer/target/debug/tempdir/pyoxidizerlsgyEI/build/target/x86_64-unknown-linux-musl/debug/deps/libjemalloc_sys-feab2aa01bde5692.rlib
          >>> did you mean: pthread_setname_np
          >>> defined in: /home/gps/.cache/pyoxidizer/rust/1.66.0-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/libc.a(pthread_setname_np.lo)
          collect2: error: ld returned 1 exit status

error: could not compile `myapp` due to previous error

If I edit pyoxidizer.bzl to uncomment the python_config.allocator_backend = "default" line, the build works and the binary doesn't crash at startup.

I'm skeptical jemalloc is related to the BorrowMutError error though, as this error seemingly is due to a logic bug somewhere in the runtime Rust code.

Lemme try to coerce a backtrace out of CI to isolate this.

Thanks for the bug report!

indygreg commented 1 year ago

thread 'main' panicked at 'already borrowed: BorrowMutError', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/gil.rs:433:52
stack backtrace:
   0: rust_begin_unwind
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
   2: core::result::unwrap_failed
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5
   3: core::result::Result<T,E>::expect
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1070:23
   4: core::cell::RefCell<T>::borrow_mut
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/cell.rs:958:9
   5: pyo3::gil::register_owned::{{closure}}
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/gil.rs:433:45
   6: std::thread::local::LocalKey<T>::try_with
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/thread/local.rs:446:16
   7: pyo3::gil::register_owned
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/gil.rs:433:13
   8: <T as pyo3::conversion::FromPyPointer>::from_owned_ptr_or_opt
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/conversion.rs:590:9
   9: pyo3::conversion::FromPyPointer::from_owned_ptr_or_panic
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/conversion.rs:531:9
  10: pyo3::conversion::FromPyPointer::from_owned_ptr
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/conversion.rs:539:9
  11: pyo3::marker::Python::from_owned_ptr
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/marker.rs:703:9
  12: pyo3::types::string::PyString::new
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/types/string.rs:144:18
  13: pyo3::types::string::<impl pyo3::conversion::IntoPy<pyo3::instance::Py<pyo3::types::string::PyString>> for &str>::into_py
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/types/string.rs:304:9
  14: pyo3::types::module::PyModule::import
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/types/module.rs:72:34
  15: pyo3::marker::Python::import
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/marker.rs:611:9
  16: pyembed::interpreter::MainPythonInterpreter::inject_oxidized_importer
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyembed-0.24.0/src/interpreter.rs:271:33
  17: pyembed::interpreter::MainPythonInterpreter::init::{{closure}}
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyembed-0.24.0/src/interpreter.rs:224:54
  18: pyo3::marker::Python::with_gil_unchecked
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/marker.rs:351:9
  19: pyembed::interpreter::MainPythonInterpreter::init
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyembed-0.24.0/src/interpreter.rs:224:22
  20: pyembed::interpreter::MainPythonInterpreter::new
             at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/pyembed-0.24.0/src/interpreter.rs:125:9
  21: myapp::main
             at /tmp/pyoxidizerqQBgH5/myapp/src/main.rs:46:15
  22: core::ops::function::FnOnce::call_once
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/ops/function.rs:251:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

indygreg commented 1 year ago

OK, this is a quirk / possible bug with PyO3.

This is choking on the holder.borrow_mut() call in the following code in gil.rs.

pub unsafe fn register_owned(_py: Python<'_>, obj: NonNull<ffi::PyObject>) {
    debug_assert!(gil_is_acquired());
    // Ignores the error in case this function called from `atexit`.
    let _ = OWNED_OBJECTS.try_with(|holder| holder.borrow_mut().push(obj));
}

The reason we get in this predicament likely has to do with our use of multi-phase interpreter initialization and Python::with_gil_unchecked() (an API I had to add to PyO3 to support multi-phase initialization).

What I don't understand here is why I can't reproduce this failure. This code should be single threaded and deterministic. I'm not sure why it is reproducing in CI / reporter's machine and not locally.
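For readers unfamiliar with this panic, here is a minimal sketch of how a re-entrant `RefCell` borrow fails. It does not use PyO3: `OWNED`, `try_register`, `register_while_borrowed` and `owned_len` are hypothetical names, and whether the crash in this issue comes from this exact re-entrancy pattern is an assumption, not something established in the thread. Note that `LocalKey::try_with` only reports an error when the thread-local has already been destroyed, so it does not shield the `borrow_mut()` call inside the closure from panicking.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical stand-in for a per-thread pool of owned objects.
    static OWNED: RefCell<Vec<usize>> = RefCell::new(Vec::new());
}

/// Pushes `obj` into the pool. Returns `false` instead of panicking
/// when the `RefCell` is already borrowed.
fn try_register(obj: usize) -> bool {
    OWNED.with(|holder| match holder.try_borrow_mut() {
        Ok(mut pool) => {
            pool.push(obj);
            true
        }
        // `borrow_mut()` would panic with a BorrowMutError at this point.
        Err(_) => false,
    })
}

/// Holds a mutable borrow of the pool, then tries to register from
/// inside that borrow. This is the re-entrant case.
fn register_while_borrowed() -> bool {
    OWNED.with(|holder| {
        let _guard = holder.borrow_mut();
        try_register(2)
    })
}

/// Number of objects currently in this thread's pool.
fn owned_len() -> usize {
    OWNED.with(|holder| holder.borrow().len())
}
```

Calling `try_register(1)` on a fresh thread succeeds, `register_while_borrowed()` returns `false` because the outer guard is still alive, and a later `try_register(3)` succeeds again once that guard has been dropped. Replacing `try_borrow_mut()` with `borrow_mut()` in the sketch turns the `false` case into a panic like the one reported here.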

fgimian commented 1 year ago

I'm experiencing a similar issue which is reproducible directly on an Ubuntu 22.04 system. However, I can confirm that the problem only occurs with PyOxidizer 0.24.0. Both 0.23.0 and 0.22.0 work perfectly.

The project is sadly not open source but I may be able to put together a sample project that replicates the problem if it helps.

Here's my output:

fots in 🌐 ubuntu in ~/dot-ssh-config-generator on  main is 📦 v0.0.0 via 🐍 v3.10.6
🕙 [ 10:52:45 AM ] ❯ ./build/x86_64-unknown-linux-musl/release/install/dot-ssh-generator
thread 'main' panicked at 'already borrowed: BorrowMutError', /home/fots/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.17.3/src/gil.rs:433:45
stack backtrace:
   0: rust_begin_unwind
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
   2: core::result::unwrap_failed
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5
   3: pyo3::gil::register_owned
   4: pyo3::types::string::PyString::new
   5: pyo3::marker::Python::with_gil_unchecked
   6: pyembed::interpreter::MainPythonInterpreter::new
   7: dot_ssh_generator::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Here's my PyOxidizer config:

def make_exe():
    dist = default_python_distribution(python_version = "3.10")

    config = dist.make_python_interpreter_config()
    config.run_module = "dot_ssh_generator"

    exe = dist.to_python_executable(name="dot-ssh-generator", config=config)
    exe.add_python_resources(exe.pip_install(["."]))

    return exe

def make_embedded_resources(exe):
    return exe.to_embedded_resources()

def make_install(exe):
    files = FileManifest()
    files.add_python_resource(".", exe)
    return files

register_target("exe", make_exe)
register_target("resources", make_embedded_resources, depends=["exe"], default_build_script=True)
register_target("install", make_install, depends=["exe"], default=True)

resolve_targets()

Thanks heaps,
Fotis

danielbraun89 commented 1 year ago

@indygreg With PyOxidizer 0.24.0 I can reproduce this problem on both an EC2 AMI (Ubuntu 18.04) and GitHub Codespaces (Ubuntu 22.04).

However, it only happens for me with the x86_64-unknown-linux-musl target and not with other targets.

Maybe this information will help in pinpointing the problem.

Can confirm downgrading pyoxidizer to 0.23.0 solved the issue. Thanks @fgimian for the hint

matthijs-oosterhoff commented 1 year ago

I'm running into this issue too. Can confirm it only happens when using the x86_64-unknown-linux-musl target.

Downgrading to pyoxidizer 0.23.0 doesn't work for me, because I get other errors on that version (and earlier versions). For example when compiling an empty project with the default config from pyoxidizer init-config-file:

matthijs@ws52:~/Desktop/fooapp$ pyoxidizer build
(...)
running: "ar" "s" "/tmp/pyoxidizer-libpythonPbs7W4/config_c/libirrelevant.a"
exit status: 0
resolving inputs for custom Python library...
linking customized Python library...
error[PYOXIDIZER_PYTHON_EXECUTABLE]: adding PythonExecutable to FileManifest

    Caused by:
        0: building Python executable
        1: building executable with Rust project
        2: obtaining embedded python context
        3: No such file or directory (os error 2)
       --> ./pyoxidizer.bzl:283:5
        |
    283 |     files.add_python_resource(".", exe)
        |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ PythonExecutable.to_file_manifest()

error: adding PythonExecutable to FileManifest

Caused by:
    0: building Python executable
    1: building executable with Rust project
    2: obtaining embedded python context
    3: No such file or directory (os error 2)

dotzborro commented 4 months ago

I hit this problem today on multiple different hosts.

@indygreg If you don't mind setting up Vagrant with libvirt on your box, here are steps that allow me to reproduce it:

  1. Go to some temporary directory.
  2. Create Vagrantfile with contents:

    $systemScript = <<-'SCRIPT'
    pacman -Syy --noconfirm extra/cloud-guest-utils
    growpart /dev/vda 3
    btrfs filesystem resize max /
    pacman -S --noconfirm rustup gcc musl libxcrypt libxcrypt-compat
    date > /etc/vagrant_provisioned_at
    SCRIPT

    $userScript = <<-'SCRIPT'
    rustup install stable
    rustup default stable
    cargo install pyoxidizer
    echo "export PATH=$PATH:$HOME/.cargo/bin" >> ~/.bashrc
    SCRIPT

    Vagrant.configure("2") do |config|
      config.vm.box = "archlinux/archlinux"
      config.vm.provider :libvirt do |domain|
        domain.memory = 4096
        domain.cpus = 4
        domain.machine_virtual_size = 80
      end
      config.vm.provision "shell", inline: $systemScript
      config.vm.provision "shell", inline: $userScript, privileged: false
    end

  3. `vagrant up`, wait till it provisions everything :crossed_fingers:
  4. `vagrant ssh`
  5. run following commands
```shell
pyoxidizer init-config-file app
cd app/
sed -i '/python_config.allocator_backend = "jemalloc"/a\ \ \ \ python_config.allocator_backend = "default"' pyoxidizer.bzl
pyoxidizer build --release --target-triple x86_64-unknown-linux-musl
./build/x86_64-unknown-linux-musl/release/install/app
<CRASH>