Open hswong3i opened 11 months ago
I'm not familiar with how ceph is implemented, is there an obvious reason why it'd be initializing modules multiple times per process? Does it use subinterpreters?
From my point of view, this case looks similar to https://github.com/pyca/cryptography/issues/9016#issuecomment-1589003489.
BTW, I can now combine Ceph 18.2.1 + python3-cryptography 41.0.7 + python3-bcrypt 4.0.1 without error; the error only happens when running with bcrypt >= 4.1.0.
So I guess bcrypt should adopt whatever handling cryptography added between 41.0.1 and 41.0.7 to prevent multiple initialization?
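To make the version cross-check repeatable, here is a minimal sketch that reports the installed bcrypt and cryptography versions and flags the bcrypt range reported as failing. The 4.1.0 threshold comes from the comment above; the helper names (`parse_version`, `is_affected`, `installed_version`) are made up for this example, and it assumes the distro packages ship metadata that `importlib.metadata` can read.

```python
from importlib import metadata

# Assumption: 4.1.0 is the first affected bcrypt release, per the report above.
FIRST_AFFECTED = (4, 1, 0)


def parse_version(text):
    """Turn '4.1.1' into (4, 1, 1); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(bcrypt_version):
    """True if this bcrypt version falls in the range reported as failing."""
    return parse_version(bcrypt_version) >= FIRST_AFFECTED


def installed_version(name):
    """Return the installed version string, or None if no metadata is found."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("bcrypt", "cryptography"):
        print(name, installed_version(name) or "not installed")
    found = installed_version("bcrypt")
    if found and is_affected(found):
        print("bcrypt >= 4.1.0: the range reported as failing under ceph-mgr")
```

Run it with the same python3 that ceph-mgr uses, otherwise it may report versions from a different site-packages directory.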
Hmm, so the problem requires using both cryptography 41.0.7 and bcrypt 4.1.1? This is very confusing, and I can't imagine what the cross-package interaction would be.
The cases with Ceph 18.2.1 are:
The original report for Ceph 18.2.1 + cryptography 41.0.x can be found at:
BTW, as the cross-check above shows, cryptography is no longer the root cause; the error now comes from bcrypt ;-(
Is it straightforward for me to reproduce this locally, or do I need a full ceph installation?
I have a Vagrant Box for Ceph 18.2: https://app.vagrantup.com/alvistack/boxes/ceph-18.2
It is now rebuilding with my latest functional combination (i.e. cryptography 41.0.7 + bcrypt 4.0.1): https://gitlab.com/alvistack/vagrant-ceph/-/pipelines/1103864469
If you are using VirtualBox (see https://github.com/alvistack/vagrant-ceph#quick-start):
# Initialize Vagrant
cat > Vagrantfile <<-EOF
Vagrant.configure('2') do |config|
  config.vm.hostname = 'ceph-18.2'
  config.vm.box = 'alvistack/ceph-18.2'
  config.vm.provider :virtualbox do |virtualbox|
    config.vm.disk :disk, name: 'sdb', size: '10GB'
    virtualbox.cpus = 2
    virtualbox.customize ['modifyvm', :id, '--cpu-profile', 'host']
    virtualbox.customize ['modifyvm', :id, '--nested-hw-virt', 'on']
    virtualbox.memory = 8192
  end
end
EOF

# Start the virtual machine
export VAGRANT_EXPERIMENTAL='1'
vagrant up

# SSH into this machine
vagrant ssh
Once the box is up and running:
1. Run vagrant ssh, then sudo su - to become root.
2. Run ceph -s; it should now show HEALTH_OK.
3. Upgrade /usr/lib/python3/dist-packages/bcrypt to 4.1.1.
4. Run ceph -s again; it should now show HEALTH_WARN 9 mgr modules have failed dependencies.
5. Run journalctl -xef -u ceph-mgr@*; you should be able to get the error log as shown above.
If I built a custom wheel, would you be able to test that out and see if you can reproduce?
Or you may provide your forked GitHub branch, so I could keep rebuilding and testing with it in my local dev env?
https://github.com/pyca/bcrypt/pull/695 -- you can either download a wheel from the wheel-builder job, or you can build yourself from that branch.
@alex my quick check and report:
For the last case, the new error message is now:
Dec 13 06:41:15 node12 ceph-mgr[17264]: 2023-12-13T06:41:15.339+0000 7f3d7e793280 -1 mgr[py] Module status has missing NOTIFY_TYPES member
Dec 13 06:41:15 node12 ceph-mgr[17264]: 2023-12-13T06:41:15.479+0000 7f3d7e793280 -1 mgr[py] Module not found: 'mgr_module'
Dec 13 06:41:15 node12 ceph-mgr[17264]: 2023-12-13T06:41:15.479+0000 7f3d7e793280 -1 mgr[py] Traceback (most recent call last):
Dec 13 06:41:15 node12 ceph-mgr[17264]: File "/usr/share/ceph/mgr/mgr_module.py", line 28, in <module>
Dec 13 06:41:15 node12 ceph-mgr[17264]: from mgr_util import profile_method
Dec 13 06:41:15 node12 ceph-mgr[17264]: File "/usr/share/ceph/mgr/mgr_util.py", line 6, in <module>
Dec 13 06:41:15 node12 ceph-mgr[17264]: import bcrypt
Dec 13 06:41:15 node12 ceph-mgr[17264]: File "/lib/python3/dist-packages/bcrypt/__init__.py", line 13, in <module>
Dec 13 06:41:15 node12 ceph-mgr[17264]: from ._bcrypt import (
Dec 13 06:41:15 node12 ceph-mgr[17264]: ImportError: PyO3 modules do not yet support subinterpreters, see https://github.com/PyO3/pyo3/issues/576
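For anyone comparing logs across version combinations, here is a small sketch that pulls the PyO3 import failure and its import chain out of ceph-mgr journal output. It is only an illustration: the function name `find_pyo3_import_failures` and the regexes are assumptions based on the shape of the log lines above, not anything provided by Ceph or PyO3.

```python
import re

# Assumption: traceback frames and the error line look like the log above.
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+)')
ERROR_RE = re.compile(r"ImportError: (?P<message>PyO3 modules .*)$")


def find_pyo3_import_failures(log_text):
    """Return one dict per PyO3 ImportError, with the frames that led to it."""
    failures = []
    frames = []
    for raw in log_text.splitlines():
        if "Traceback (most recent call last)" in raw:
            frames = []  # a new traceback starts; drop any stale frames
            continue
        frame = FRAME_RE.search(raw)
        if frame:
            frames.append((frame.group("path"), int(frame.group("line"))))
            continue
        error = ERROR_RE.search(raw)
        if error:
            failures.append(
                {"message": error.group("message").strip(), "frames": list(frames)}
            )
            frames = []
    return failures


# Sample lines copied from the log above.
SAMPLE = """\
Dec 13 06:41:15 node12 ceph-mgr[17264]: 2023-12-13T06:41:15.479+0000 7f3d7e793280 -1 mgr[py] Traceback (most recent call last):
Dec 13 06:41:15 node12 ceph-mgr[17264]: File "/usr/share/ceph/mgr/mgr_module.py", line 28, in <module>
Dec 13 06:41:15 node12 ceph-mgr[17264]: from mgr_util import profile_method
Dec 13 06:41:15 node12 ceph-mgr[17264]: File "/usr/share/ceph/mgr/mgr_util.py", line 6, in <module>
Dec 13 06:41:15 node12 ceph-mgr[17264]: import bcrypt
Dec 13 06:41:15 node12 ceph-mgr[17264]: File "/lib/python3/dist-packages/bcrypt/__init__.py", line 13, in <module>
Dec 13 06:41:15 node12 ceph-mgr[17264]: from ._bcrypt import (
Dec 13 06:41:15 node12 ceph-mgr[17264]: ImportError: PyO3 modules do not yet support subinterpreters, see https://github.com/PyO3/pyo3/issues/576
"""

for failure in find_pyo3_import_failures(SAMPLE):
    print(failure["message"])
    for path, line in failure["frames"]:
        print(f"  {path}:{line}")
```

To use it on a live system, feed the function the text captured from the journalctl command in the reproduction steps instead of SAMPLE.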
Hmm, on the one hand, if ceph really is using subinterpreters, there's nothing we can do.
On the other hand, the fact that this error is showing up only with certain version combinations suggests there's a deeper bug here. I'm afraid I don't have a suggestion other than that we really need a minimal reproducer here, something smaller than all of ceph.
Ceph really does use subinterpreters. It also used to work on my system (F39) after a patch to remove cryptography as a dependency, but now it's failing again with a larger update.
My suspicion is that the check for multiple instances itself is/was somehow flawed and doesn't always go off (or used not to), but Ceph has always been using subinterpreters.
Ceph bug: https://tracker.ceph.com/issues/63529 Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=2255688
Yes. I have, in effect, vendored a rebuild of python-bcrypt with:
diff --git a/src/_bcrypt/Cargo.toml b/src/_bcrypt/Cargo.toml
index a9c7f7c..02317c8 100644
--- a/src/_bcrypt/Cargo.toml
+++ b/src/_bcrypt/Cargo.toml
@@ -6,7 +6,7 @@ edition = "2018"
publish = false
[dependencies]
-pyo3 = { version = "0.20.0", features = ["abi3"] }
+pyo3 = { git = "https://git.st8l.com/luxolus/pyo3", tag = "v0.20.3-subint+1", features = ["abi3", "unsafe-allow-subinterpreters"] }
bcrypt = "0.15"
bcrypt-pbkdf = "0.10.0"
base64 = "0.21.5"
https://git.st8l.com/luxolus/pyo3/commit/338c71d0ad10f7ae38b7b44e576d49b91ed20d99
Which is possible because this python module doesn't store Py objects in Rust statics. The dashboard remains broken, because other packages still use pyO3 modules, but the mgr's core functionality is operational.
For reference, in case anyone's stumbling over this issue, adding sub-interpreter support in PyO3 is being worked on over here: https://github.com/PyO3/pyo3/issues/3451
The Ceph folks that commented here (hi again!) are already aware of this; just wanted to leave some breadcrumbs for others to follow in case anyone's interested in helping out as well.
Similar to https://github.com/pyca/cryptography/issues/9016, my ceph-mgr on 18.2.1 gets an error message once bcrypt is upgraded from 4.0.1 to 4.1.1:
This was initially introduced by the requirement that each #[pymodule] be initialized only once, in pyo3 >= 0.17.0.
My only workaround for now is to revert to bcrypt == 4.0.1, which still depends on pyo3 = { version = "0.15.2" }.