StackStorm / st2

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
https://stackstorm.com/
Apache License 2.0
5.96k stars 741 forks source link

correct bcrypt to fix the known issue with 4.1.2 #6210

Open sravs-dev opened 1 month ago

sravs-dev commented 1 month ago

I was testing st23.9 dev on Rhel9, st2auth fails to startup with following error. Downgrading bcrypt to fix the known bug in 4.1.2

Screenshot 2024-05-31 at 6 53 24 PM
amanda11 commented 1 month ago

The fixed requirements file is input to the other requirements files, so you need to regenerate the top level and component requirements files. Please see comment on top-level requirements file.

nzlosh commented 3 weeks ago

Can you add a changelog entry for the PR please?

sravs-dev commented 2 weeks ago

@nzlosh I have updated the changelog. Could you pls take a look. Thank you!

sravs-dev commented 2 weeks ago

@nzlosh One of the CI test failed, looking at logs , it doesn't seem to be related to this change. I noticed that st2api and st2auth failed to start as the ports are not available . Here is the log from the test run

st2api.log [2024-06-10 16:45:57 +0000] [7639] [INFO] Starting gunicorn 22.0.0 [2024-06-10 16:45:57 +0000] [7639] [ERROR] Connection in use: ('0.0.0.0', 9101) [2024-06-10 16:45:57 +0000] [7639] [ERROR] Retrying in 1 second. [2024-06-10 16:45:58 +0000] [7639] [ERROR] Connection in use: ('0.0.0.0', 9101) [2024-06-10 16:45:58 +0000] [7639] [ERROR] Retrying in 1 second. [2024-06-10 16:45:59 +0000] [7639] [ERROR] Connection in use: ('0.0.0.0', 9101) [2024-06-10 16:45:59 +0000] [7639] [ERROR] Retrying in 1 second. [2024-06-10 16:46:00 +0000] [7639] [ERROR] Connection in use: ('0.0.0.0', 9101) [2024-06-10 16:46:00 +0000] [7639] [ERROR] Retrying in 1 second. [2024-06-10 16:46:01 +0000] [7639] [ERROR] Connection in use: ('0.0.0.0', 9101) [2024-06-10 16:46:01 +0000] [7639] [ERROR] Retrying in 1 second. [2024-06-10 16:46:02 +0000] [7639] [ERROR] Can't connect to ('0.0.0.0', 9101)

st2auth [2024-06-10 16:46:09 +0000] [7787] [INFO] Starting gunicorn 22.0.0 [2024-06-10 16:46:09 +0000] [7787] [ERROR] Connection in use: ('0.0.0.0', 9100) [2024-06-10 16:46:09 +0000] [7787] [ERROR] Retrying in 1 second. [2024-06-10 16:46:10 +0000] [7787] [ERROR] Connection in use: ('0.0.0.0', 9100) [2024-06-10 16:46:10 +0000] [7787] [ERROR] Retrying in 1 second. [2024-06-10 16:46:11 +0000] [7787] [ERROR] Connection in use: ('0.0.0.0', 9100) [2024-06-10 16:46:11 +0000] [7787] [ERROR] Retrying in 1 second. [2024-06-10 16:46:12 +0000] [7787] [ERROR] Connection in use: ('0.0.0.0', 9100) [2024-06-10 16:46:12 +0000] [7787] [ERROR] Retrying in 1 second. [2024-06-10 16:46:13 +0000] [7787] [ERROR] Connection in use: ('0.0.0.0', 9100) [2024-06-10 16:46:13 +0000] [7787] [ERROR] Retrying in 1 second. [2024-06-10 16:46:14 +0000] [7787] [ERROR] Can't connect to ('0.0.0.0', 9100)

Same CI check passed for the previous commit , before change log was added . Here is the older CI test result for Orquesta on py3.8 https://github.com/StackStorm/st2/actions/runs/9330762609/job/25684689576

Can someone retrigger the CI checks? Looks to be a transient issue.

cognifloyd commented 2 weeks ago

Looks good. I'm going to update the pantsbuild dependencies (the lockfile) here and then I'll merge.

cognifloyd commented 2 weeks ago

OK. Looking further into this: none of the code in this repo has a direct dependency on passlib or bcyrpt. The passlib dep comes from st2-auth-backend-flat-file: https://github.com/StackStorm/st2-auth-backend-flat-file/blob/master/requirements.txt

passlib>=1.7.1,<1.8.0

So, I think the passlib[bcrypt] extra should go in that file, not in this repo. WIP branch: https://github.com/StackStorm/st2-auth-backend-flat-file/compare/bcrypt-reqs-fix

I'm researching the issue with passlib+bcrypt.

The issue was introduced in bcrypt 4.1.0, which was yanked because of it. This PR moved some attributes around, and added a bcrypt._bcrypt.__version__ attribute, which made passlib assume it was py-bcrypt instead of bcrypt:

But, the error you posted is still present in at least bcrypt 4.1.1 and 4.1.2. That error occurs because bcrypt 4.1.0+ removed bcrypt.__about__, putting the vars under bcrypt._bcrypt and directly under bcrypt instead. That message comes from:

It should just be a logged warning message though, and the version var is only used for a debug log message. So, that error message should be a red herring.

@sravs-dev are there any other logged error messages?

cognifloyd commented 2 weeks ago

Here's the upstream passlib issue: https://foss.heptapod.net/python-libs/passlib/-/issues/190 Sounds like 1.7.5 will drop support for py-bcrypt and bcryptor to remove the bcrypt lib detection mess.

I'm still not sure why st2auth would fail when using bcrypt 4.1.* however since that warning is not fatal.

cognifloyd commented 2 weeks ago

Here's an upstream bcrypt issue about the passlib warning message: https://github.com/pyca/bcrypt/issues/684 They also point out that this is a warning, and passlib+bcrypt should still function without issue.

So, we need to find a different root cause for the reported "st2auth fails to startup" issue, and why downgrading bcrypt would ameliorate that.

sravs-dev commented 2 weeks ago

@cognifloyd Thank you so much for digging into this. I couldn't find any other error message in st2auth logs. Let me rerun st2 on rhel9 and double check.