StackStorm / st2

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
https://stackstorm.com/
Apache License 2.0

ST2 3.5 rpm not creating admin roles in mongo; other services not connecting to mongo #5301

Open lukepatrick opened 3 years ago

lukepatrick commented 3 years ago

SUMMARY

Running the ST2 3.5 RPMs on CentOS 8 with Mongo 4. At startup, st2-apply-rbac-definitions does not insert the admin role into Mongo. Subsequent user assignments (stanley) can't be loaded because the role doesn't exist. If I switch back to the 3.4.1 RPMs, the issue goes away.

In an oddly semi-related issue, the st2-apply-rbac-definitions job above can connect to Mongo and load all the other user assignments without a problem. However, none of the other services that start (actionrunner, etc.) can connect to Mongo, even though they use the same settings. They all report pymongo.errors.ServerSelectionTimeoutError: module 'select' has no attribute 'poll'

STACKSTORM VERSION

Paste the output of st2 --version:

3.5.0-1 rpm's

OS, environment, install method

Post what OS you are running this on, along with any other relevant information.

Kubernetes HA charts, Docker, CentOS8

Steps to reproduce the problem

          - st2-apply-rbac-definitions
          - --verbose
          - --config-file=/etc/st2/st2.conf
          - --config-file=/etc/st2/st2.docker.conf
          - --config-file=/etc/st2/st2.user.conf

start up actionrunner or other service

Expected Results

db.role_d_b.find() in mongo shows the admin, system_admin, and observer roles

actionrunner service to start

Actual Results

no admin roles; cannot assign stanley to admin role

pymongo.errors.ServerSelectionTimeoutError: module 'select' has no attribute 'poll'

Full error:

st2actionrunner 2021-07-12 20:11:54,319 INFO [-] Using Python: 3.6.8 (/opt/stackstorm/st2/bin/python)                                                                                                                        
 st2actionrunner 2021-07-12 20:11:54,320 INFO [-] Using fs encoding: utf-8, default encoding: utf-8, locale: en_US.UTF-8, LANG env variable: en_US.UTF-8, PYTHONIOENCODING env variable: notset                               
 st2actionrunner 2021-07-12 20:11:54,320 INFO [-] Using config files: /etc/st2/st2.conf,/etc/st2/st2.docker.conf,/etc/st2/st2.user.conf                                                                                       
 st2actionrunner 2021-07-12 20:11:54,321 INFO [-] Using logging config: /etc/st2/logging.docker.conf                                                                                                                          
 st2actionrunner 2021-07-12 20:11:54,321 INFO [-] Using coordination driver: redis                                                                                                                                            
 st2actionrunner 2021-07-12 20:11:54,321 INFO [-] Using metrics driver: statsd                                                                                                                                                
 st2actionrunner 2021-07-12 20:11:54,326 INFO [-] Connecting to database "st2" @ "mongodb-0.mongodb-headless:27017,mongodb-1.mongodb-headless:27017,mongodb-2.mongodb-headless:27017 (replica set)" as user "s 
 st2actionrunner 2021-07-12 20:11:57,376 ERROR [-] Failed to connect to database "st2" @ "mongodb-0.mongodb-headless:27017,mongodb-1.mongodb-headless:27017,mongodb-2.mongodb-headless:27017 (replica set)" as 
 st2actionrunner 2021-07-12 20:11:57,376 ERROR [-] (PID=1) Worker quit due to exception.                                                                                                                                      
 st2actionrunner Traceback (most recent call last):                                                                                                                                                                           
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2actions/cmd/actionrunner.py", line 98, in main                                                                                                    
 st2actionrunner     _setup()                                                                                                                                                                                                 
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2actions/cmd/actionrunner.py", line 58, in _setup                                                                                                  
 st2actionrunner     capabilities=capabilities,                                                                                                                                                                               
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/service_setup.py", line 244, in setup                                                                                                      
 st2actionrunner     db_setup()                                                                                                                                                                                               
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/database_setup.py", line 55, in db_setup                                                                                                   
 st2actionrunner     connection = db_init.db_setup_with_retry(**db_cfg)                                                                                                                                                       
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/persistence/db_init.py", line 93, in db_setup_with_retry                                                                                   
 st2actionrunner     ssl_match_hostname=ssl_match_hostname,                                                                                                                                                                   
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/persistence/db_init.py", line 58, in db_func_with_retry                                                                                    
 st2actionrunner     return retrying_obj.call(db_func, *args, **kwargs)                                                                                                                                                       
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/retrying.py", line 206, in call                                                                                                                      
 st2actionrunner     return attempt.get(self._wrap_exception)                                                                                                                                                                 
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/retrying.py", line 247, in get                                                                                                                       
 st2actionrunner     six.reraise(self.value[0], self.value[1], self.value[2])                                                                                                                                                 
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/six.py", line 696, in reraise                                                                                                                        
 st2actionrunner     raise value                                                                                                                                                                                              
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/retrying.py", line 200, in call                                                                                                                      
 st2actionrunner     attempt = Attempt(fn(*args, **kwargs), attempt_number, False)                                                                                                                                            
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/models/db/__init__.py", line 250, in db_setup                                                                                              
 st2actionrunner     ssl_match_hostname=ssl_match_hostname,                                                                                                                                                                   
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/models/db/__init__.py", line 212, in _db_connect                                                                                           
 st2actionrunner     raise e                                                                                                                                                                                                  
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/models/db/__init__.py", line 203, in _db_connect                                                                                           
 st2actionrunner     connection.admin.command("ismaster")                                                                                                                                                                     
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/pymongo/database.py", line 737, in command                                                                                                           
 st2actionrunner     read_preference, session) as (sock_info, slave_ok):                                                                                                                                                      
 st2actionrunner   File "/usr/lib64/python3.6/contextlib.py", line 81, in __enter__                                                                                                                                           
 st2actionrunner     return next(self.gen)                                                                                                                                                                                    
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1325, in _socket_for_reads                                                                                            
 st2actionrunner     server = self._select_server(read_preference, session)                                                                                                                                                   
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1278, in _select_server                                                                                               
 st2actionrunner     server = topology.select_server(server_selector)                                                                                                                                                         
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/pymongo/topology.py", line 243, in select_server                                                                                                     
 st2actionrunner     address))                                                                                                                                                                                                
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/pymongo/topology.py", line 200, in select_servers                                                                                                    
 st2actionrunner     selector, server_timeout, address)                                                                                                                                                                       
 st2actionrunner   File "/opt/stackstorm/st2/lib/python3.6/site-packages/pymongo/topology.py", line 217, in _select_servers_loop                                                                                              
 st2actionrunner     (self._error_message(selector), timeout, self.description))                                                                                                                                              
 st2actionrunner pymongo.errors.ServerSelectionTimeoutError: module 'select' has no attribute 'poll',module 'select' has no attribute 'poll',module 'select' has no attribute 'poll', Timeout: 3.0s, Topology Description: <T 
arm4b commented 3 years ago

Thanks for the report!

@lukepatrick for more info, what was the MongoDB version used?

I think we've tested with MongoDB 4.0 for EL8 (Py3.6) and MongoDB 4.4 for U20 (Py 3.8). For stackstorm-ha in K8s we're running on U18 with MongoDB 4.0 and it's good.

On the other hand, in v3.5.0 we introduced py 3.8 to support Ubuntu20 Focal LTS and there were some py3.6/py3.8 changes involved.

Another random idea, maybe custom EL8 build around K8s brings some edge cases with versioning. FYI https://github.com/StackStorm/stackstorm-ha/ is based on Ubuntu 18.04 LTS + MongoDB 4.0.

@StackStorm/maintainers @StackStorm/contributors WDYT? Something around the monkey patching, py 3.6/3.8 version changes and pip dependencies?
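
One quick way to test the monkey-patching theory is to check whether `select.poll` is still present in the interpreter that runs the failing service. The following is a minimal sketch, not part of st2: the helper names are made up for illustration, and it assumes the error comes from a patched `select` module (eventlet's monkey patching is one thing that can strip `poll`). It only reports what the current process sees, so it needs to run in the same interpreter and after the same imports as the service.

```python
import select
import sys


def missing_select_attrs(names=("poll",)):
    """Return the names from `names` that the loaded `select` module lacks.

    Hypothetical helper for illustration; not an st2 or pymongo API.
    """
    return [name for name in names if not hasattr(select, name)]


def eventlet_loaded():
    """True if eventlet has already been imported in this process."""
    return "eventlet" in sys.modules


if __name__ == "__main__":
    print("python:", sys.version.split()[0])
    print("eventlet imported:", eventlet_loaded())
    print("missing from select:", missing_select_attrs() or "nothing")
```

If `poll` is reported missing only after the service's own imports have run, that would point toward patching rather than the Python build itself.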

lukepatrick commented 3 years ago

Ran both a Mongo 4.0 and 4.4, seemed to get the same issue.

amanda11 commented 3 years ago

Odd... To check it's not a general problem, I just started a new ST2 vagrant box with CentOS 8 and followed the standard instructions to configure RBAC with st2admin and stanley, and all good. No problem with st2-apply-rbac-definitions, and logged in fine as st2admin etc.

So seems related to this environment, rather than newer versions of say pip dependencies coming in...

Could be the fact that it's Kubernetes, or maybe the mongo replicaset (just looking at where it is in the stack trace)...

]# st2-apply-rbac-definitions --config-file /etc/st2/st2.conf
2021-07-13 21:37:32,836 INFO [-] Connecting to database "st2" @ "127.0.0.1:27017" as user "stackstorm".
2021-07-13 21:37:32,843 INFO [-] Successfully connected to database "st2" @ "127.0.0.1:27017" as user "stackstorm".
2021-07-13 21:37:33,318 INFO [-] Loading role definitions from "/opt/stackstorm/rbac/roles/"
2021-07-13 21:37:33,318 INFO [-] Loading user role assignments from "/opt/stackstorm/rbac/assignments/"
2021-07-13 21:37:33,324 INFO [-] Loading group to role map definitions from "/opt/stackstorm/rbac/mappings/"
2021-07-13 21:37:33,324 INFO [-] Synchronizing roles...
2021-07-13 21:37:33,330 INFO [-] Roles synchronized (0 created, 0 updated, 0 removed)
2021-07-13 21:37:33,330 INFO [-] Synchronizing users role assignments...
2021-07-13 21:37:33,371 INFO [-] User role assignments synchronized
2021-07-13 21:37:33,371 INFO [-] Synchronizing group to role maps...
2021-07-13 21:37:33,374 INFO [-] Group to role map definitions synchronized.
# st2 --version
st2 3.5.0, on Python 3.6.8
arms11 commented 3 years ago

While this is not an apples-to-apples comparison, I recently updated my sandbox cluster by changing the image tag version to 3.5.0, and I could see no such issue with my K8s setup in EKS. Smoke testing passed without any issue. Please note that I have not updated my chart to the latest available 0.60.0; I am still using 0.52.0, and I do not have RBAC enabled. I will do my best to spin up a new cluster from scratch and check this with the out-of-box chart and RBAC enabled.

lukepatrick commented 3 years ago

The RBAC admin role not loading seems transient: across repeated tries it sometimes happens and sometimes doesn't, always with a fresh/empty MongoDB. I'm not as worried about that.

As for the other mongo connectivity issue: in a separate task I had updated some packages, and it seems the latest pymongo isn't compatible. I cleared out any extraneous package updates (that I can tell) and it works fine.
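
A rough way to spot this kind of drift is to compare `pip list` output against the versions that are expected. The sketch below is illustrative only: the function names are hypothetical, the pins are the versions mentioned in this thread (pymongo 3.11.3, orjson 3.5.2), and the `3.12.0` in the sample is a made-up stand-in for an upgraded package. A real check would read the pins from the st2 requirements file.

```python
def parse_pip_list(text):
    """Parse `pip list` column output into {lowercased name: version}."""
    installed = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the header, the dashed separator, and any line that is
        # not exactly "name version" (for example pip's WARNING lines).
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        installed[parts[0].lower()] = parts[1]
    return installed


def find_drift(installed, pinned):
    """Return {name: (pinned, installed_or_None)} for every mismatch."""
    drift = {}
    for name, want in pinned.items():
        have = installed.get(name.lower())
        if have != want:
            drift[name] = (want, have)
    return drift


if __name__ == "__main__":
    # Pins taken from versions quoted in this thread (assumption: these
    # are the versions the 3.5.0 packages expect).
    pinned = {"pymongo": "3.11.3", "orjson": "3.5.2"}
    # Hypothetical sample output with an upgraded pymongo.
    sample = """\
Package                        Version
------------------------------ ---------
orjson                         3.5.2
pymongo                        3.12.0
"""
    print(find_drift(parse_pip_list(sample), pinned))
```

In practice the text would come from running `pip list` inside the `/opt/stackstorm/st2` virtualenv.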

Once this got running, I ran into another odd package issue:

Traceback (most recent call last):
  File "/opt/stackstorm/st2/lib/python3.6/site-packages/python_runner/python_action_wrapper.py", line 35, in <module>
    import orjson
ModuleNotFoundError: No module named 'orjson'
amanda11 commented 3 years ago

Orjson 3.5.2 is in the pip requirements files - did this get cleared out accidentally on deleting other packages?

lukepatrick commented 3 years ago

no, it's in the pip list

[root@st2-stackstorm-st2actionrunner-6c5546f847-9jgzw stackstorm]# source /opt/stackstorm/st2/bin/activate
(st2) [root@st2-stackstorm-st2actionrunner-6c5546f847-9jgzw stackstorm]# pip list
Package                        Version
------------------------------ ---------
amqp                           5.0.6
appdirs                        1.4.4
APScheduler                    3.7.0
argcomplete                    1.12.2
Babel                          2.9.1
bcrypt                         3.2.0
beautifulsoup4                 4.9.3
cachetools                     2.0.1
certifi                        2021.5.30
cffi                           1.14.5
chardet                        3.0.4
click                          8.0.1
colorama                       0.4.4
cryptography                   3.4.7
debtcollector                  2.2.0
decorator                      5.0.9
distlib                        0.3.2
dnspython                      1.16.0
eventlet                       0.30.2
fasteners                      0.16.3
filelock                       3.0.12
flex                           6.14.1
futurist                       2.3.0
gitdb                          4.0.2
GitPython                      3.1.15
greenlet                       1.0.0
gunicorn                       20.1.0
httplib2                       0.19.1
idna                           2.10
importlib-metadata             3.10.1
importlib-resources            5.1.4
iso8601                        0.1.14
Jinja2                         2.11.3
jsonpath-rw                    1.4.0
jsonpointer                    2.1
jsonschema                     2.6.0
kazoo                          2.8.0
kombu                          5.0.2
linecache2                     1.0.0
lockfile                       0.12.2
logshipper                     0.0.0
MarkupSafe                     2.0.1
mock                           4.0.3
mongoengine                    0.23.0
msgpack                        1.0.2
netaddr                        0.8.0
netifaces                      0.11.0
networkx                       1.11
nose                           1.3.7
nose-parallel                  0.4.0
nose-timer                     1.0.1
ntlm-auth                      1.5.0
orjson                         3.5.2
orquesta                       1.4.0
oslo.config                    8.7.0
oslo.i18n                      5.0.1
oslo.serialization             4.1.0
oslo.utils                     4.9.1
packaging                      20.9
paramiko                       2.7.2
passlib                        1.7.4
pbr                            5.6.0
pika                           1.2.0
pip                            20.3.3
ply                            3.11
prettytable                    2.1.0
prompt-toolkit                 1.0.15
psutil                         5.8.0
pyasn1                         0.4.8
pyasn1-modules                 0.2.8
pycparser                      2.20
pyinotify                      0.9.6
pymongo                        3.11.3
PyNaCl                         1.4.0
pyOpenSSL                      20.0.1
pyparsing                      2.4.7
pyrabbit                       1.1.0
python-dateutil                2.8.1
python-editor                  1.0.4
python-json-logger             2.0.1
python-ldap                    3.0.0
python-statsd                  2.1.0
pytz                           2021.1
pywinrm                        0.4.1
PyYAML                         5.4.1
RandomWords                    0.3.0
redis                          3.5.3
rednose                        1.3.0
repoze.lru                     0.7
requests                       2.25.1
requests-ntlm                  1.1.0
retrying                       1.3.3
rfc3986                        1.5.0
rfc3987                        1.3.8
Routes                         2.4.1
semver                         2.13.0
setuptools                     39.2.0
simplejson                     3.17.2
six                            1.13.0
smmap                          3.0.5
soupsieve                      2.2.1
sseclient-py                   1.7
st2                            3.5.0
st2-auth-backend-flat-file     0.1.0
st2-auth-backend-pam           0.2.0
st2-auth-ldap                  3.5.dev0
st2-rbac-backend               3.5.0
st2actions                     3.5.0
st2api                         3.5.0
st2auth                        3.5.0
st2client                      3.5.0
st2common                      3.5.0
st2exporter                    3.5.0
st2reactor                     3.5.0
st2stream                      3.5.0
st2tests                       3.5.0
stackstorm-runner-action-chain 3.5.0
stackstorm-runner-announcement 3.5.0
stackstorm-runner-http         3.5.0
stackstorm-runner-inquirer     3.5.0
stackstorm-runner-local        3.5.0
stackstorm-runner-noop         3.5.0
stackstorm-runner-orquesta     3.5.0
stackstorm-runner-python       3.5.0
stackstorm-runner-remote       3.5.0
stackstorm-runner-winrm        3.5.0
stevedore                      1.30.1
strict-rfc3339                 0.7
tenacity                       7.0.0
termstyle                      0.1.11
tooz                           2.8.0
traceback2                     1.4.0
typing-extensions              3.10.0.0
tzlocal                        2.1
udatetime                      0.0.16
ujson                          4.0.2
unittest2                      1.1.0
urllib3                        1.26.5
validate-email                 1.3
vine                           5.0.0
virtualenv                     20.4.0
voluptuous                     0.12.1
waitress                       2.0.0
wcwidth                        0.2.5
WebOb                          1.8.7
WebTest                        2.0.35
wheel                          0.31.1
wrapt                          1.12.1
xmltodict                      0.12.0
yaql                           1.1.3
zake                           0.2.2
zipp                           3.4.1
zstandard                      0.15.2
WARNING: You are using pip version 20.3.3; however, version 21.1.3 is available.
You should consider upgrading via the '/opt/stackstorm/st2/bin/python -m pip install --upgrade pip' command.
lukepatrick commented 3 years ago

FYI, I've added an experimental PR with a CentOS container: https://github.com/StackStorm/st2-dockerfiles/pull/50

arm4b commented 3 years ago

Thanks @lukepatrick.

Just a note that building python 3.7 from sources to run the st2 packs might have edge cases. We currently support and test with py 3.6 and 3.8 and are unsure how other environments would behave.

But I'm unsure what causes the issue you describe in the first message, since the base st2 system relies on the py3.6 rpm package per your Dockerfile.

lukepatrick commented 3 years ago

Thanks for looking @armab

I'm hoping the alternative python does not conflict with StackStorm's py3.6; it has not been an issue in past versions.

We only do this because of a bunch of home-grown packs we created for py3.7.
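
Given the separately built Python 3.7 and the earlier `No module named 'orjson'` error, it may be worth checking which interpreter can actually import the module. This is a minimal sketch under an assumption that is not confirmed in this thread, namely that the failing import runs under a different interpreter than the st2 virtualenv. `can_import` is a hypothetical helper, not an st2 function.

```python
import subprocess
import sys


def can_import(module, python=sys.executable):
    """Return True if `python` can import `module`, judged by exit code."""
    result = subprocess.run(
        [python, "-c", "import " + module],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Pass interpreter paths as arguments, for example
    # /opt/stackstorm/st2/bin/python (the one named in the logs) and the
    # interpreter of a pack environment. Defaults to the current one.
    for python in sys.argv[1:] or [sys.executable]:
        status = "importable" if can_import("orjson", python) else "NOT importable"
        print(python, "-> orjson", status)
```

If the st2 interpreter reports orjson as importable while another one does not, the package is installed but not visible to the interpreter that runs the action.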