geosolutions-it / C195-azure-workspace


Check connection issues #2

Closed etj closed 3 years ago

etj commented 3 years ago

When all of the containers are running, CKAN responds very slowly: the first page renders in about 20 seconds, the datasets page takes more than 2 minutes to open, and the logs report many DB connection problems.

We may need specific SQLAlchemy parameters. Please refer to https://docs.ckan.org/en/2.9/maintaining/configuration.html#ckan-datastore-sqlalchemy for configuring a connection pool and connection checks.
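As a rough illustration of what a pool with connection checks does, here is a minimal, library-agnostic sketch in Python. It is not CKAN's or SQLAlchemy's implementation: `PingingPool` and its methods are hypothetical names, and `sqlite3` stands in for the real database only so the example runs on its own. SQLAlchemy offers the same idea through its `pool_pre_ping` engine option.

```python
import queue
import sqlite3


class PingingPool:
    """Toy connection pool (hypothetical) that checks a connection before reuse."""

    def __init__(self, connect, size=2):
        self._connect = connect                    # zero-argument connection factory
        self._idle = queue.LifoQueue(maxsize=size)

    def _is_alive(self, conn):
        try:
            conn.execute("SELECT 1")               # cheap round trip to the database
            return True
        except sqlite3.Error:                      # closed or broken connection
            return False

    def acquire(self):
        while True:
            try:
                conn = self._idle.get_nowait()
            except queue.Empty:
                return self._connect()             # nothing idle: open a new connection
            if self._is_alive(conn):
                return conn
            # Stale connection: discard it and look at the next idle one.

    def release(self, conn):
        try:
            self._idle.put_nowait(conn)
        except queue.Full:
            conn.close()                           # pool is full: close instead of keeping
```

A stale connection is silently replaced by a fresh one, at the cost of one extra round trip per checkout.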

randomorder commented 3 years ago

@lpasquali, @etj is adding a README 1a to the repo for setting up the system via bash scripts.

The scripts should allow you to set up a system as depicted by diagram 1b with all services running as containers and public network intercommunication.

We want you to:

lpasquali commented 3 years ago

ckan has problems connecting to solr

ckan@wk-caas-5c72a57a510f4684a4e311aef3d2afd9-4d26b7de2dc1bbf64158a0:~/venv/src/ckan$ ss -lutanp | grep 8983
tcp    TIME-WAIT  0       0        10.244.154.95:56498      20.73.217.25:8983
ckan@wk-caas-5c72a57a510f4684a4e311aef3d2afd9-4d26b7de2dc1bbf64158a0:~/venv/src/ckan$

also ckan has no paster tool, which is needed to create an admin user:

https://docs.ckan.org/en/2.8/maintaining/installing/install-from-docker-compose.html#create-ckan-admin-user

That said, I still have to find out why ckan has problems connecting to solr: if I use wget inside the container (the only tool available there for making HTTP connections), it works.
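One way to separate network latency from application behaviour is to time a bare TCP connect from inside the ckan container. The sketch below is an assumption about how such a probe could look, not something taken from this setup: `tcp_connect_time` is a hypothetical helper, and the host name in the usage comment is a placeholder (8983 is the port seen in the `ss` output above).

```python
import socket
import time


def tcp_connect_time(host, port, timeout=5.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # covers refused connections, timeouts and DNS failures
        return None


# Example call (placeholder host name, adjust to the real solr address):
# print(tcp_connect_time("solr.example.internal", 8983))
```

Running it a few times in a row shows whether the delay sits in name resolution and connection setup or further up in the application.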

randomorder commented 3 years ago

If you edit the image, maybe you can try with tcpdump or similar.

lpasquali commented 3 years ago

I installed tcpdump on both images; tomorrow I will try to capture some traffic.

lpasquali commented 3 years ago

@randomorder @etj I am sniffing the network from both containers. Interesting side note: oddly, to do that I had to disable the unprivileged user in ckan, and this improved the performance of the ckan container a bit, but not by much.

lpasquali commented 3 years ago

Traffic between ckan and solr is evident with tcpdump. Adding sqlalchemy pooling seems to have given some further small improvements, but we are still at around 5s to load "/about", which is more or less a static page...

lpasquali commented 3 years ago

I went ahead and implemented solution 1c (https://docs.google.com/drawings/d/1Sm-b5NsSLNmoBd-sCnHP9E0eVBvMCrNoodtjb8AN0vg/edit) but found some issues. Here is the situation:

That said, I still need to migrate the data to the Azure Postgres, but I need to find a workaround for the account name issue listed above.

lpasquali commented 3 years ago

I'm back on this

lpasquali commented 3 years ago

  • NOT OK: the Azure managed Redis service currently requires auth and didn't work out of the box, since the other one didn't need auth. I still need to check how to disable auth or how to authenticate from ckan.

I made this work

  • NOT OK: Postgres as an Azure service has usernames in 'user@hostname' form, and ckan has problems with that.

I made this work

  • NOT OK: as discussed with @etj, since the solr container can't have both a public and a private network stack, solr will be on the private net only. Unfortunately, the IP address can't be associated with any of the Microsoft auto-provisioning DNS services.

I made this work by adding a private DNS record

That said, I still need to migrate the data to the Azure Postgres, but I need to find a workaround for the account name issue listed above.

this was done successfully
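The thread does not say how the 'user@hostname' username issue was worked around. One common approach, shown here only as a sketch, is to percent-encode the user name and password before placing them in the database URL, so that the '@' inside the user name is not mistaken for the separator in front of the host. `build_db_url` and all values in the example are hypothetical, and whether CKAN's own URL handling accepts the encoded form would need to be verified.

```python
from urllib.parse import quote


def build_db_url(user, password, host, dbname, port=5432):
    """Build a postgresql:// URL with percent-encoded credentials (hypothetical helper)."""
    user_q = quote(user, safe="")      # '@' becomes '%40'
    pass_q = quote(password, safe="")  # also encodes '/', ':' and '@' in passwords
    return f"postgresql://{user_q}:{pass_q}@{host}:{port}/{dbname}"


# Placeholder values, not taken from this deployment:
url = build_db_url("ckan@myserver", "s3cret/pw", "db.example.com", "ckan")
print(url)
```

With these placeholder values, the user part of the URL becomes `ckan%40myserver`, leaving only one literal '@' in front of the host.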

randomorder commented 3 years ago

Connection issue debugging is on @etj, right?

lpasquali commented 3 years ago

ckan          | 2021-02-25 10:50:17,320 DEBUG [repoze.who] identifier plugins registered: [<FriendlyFormPlugin 140272518243552>, <CkanAuthTktCookiePlugin 140272518242488>]
ckan          | 2021-02-25 10:50:17,321 DEBUG [repoze.who] identifier plugins matched for classification "browser": [<FriendlyFormPlugin 140272518243552>, <CkanAuthTktCookiePlugin 140272518242488>]
ckan          | 2021-02-25 10:50:17,321 DEBUG [repoze.who] no identity returned from <FriendlyFormPlugin 140272518243552> (None)
ckan          | 2021-02-25 10:50:17,321 DEBUG [repoze.who] identity returned from <CkanAuthTktCookiePlugin 140272518242488>: {'timestamp': 1613643505, 'repoze.who.plugins.auth_tkt.userid': 'admin', 'tokens': [''], 'userdata': {}}
ckan          | 2021-02-25 10:50:17,322 DEBUG [repoze.who] identities found: [(<CkanAuthTktCookiePlugin 140272518242488>, {'timestamp': 1613643505, 'repoze.who.plugins.auth_tkt.userid': 'admin', 'tokens': [''], 'userdata': {}})]
ckan          | 2021-02-25 10:50:17,322 DEBUG [repoze.who] authenticator plugins registered: [<CkanAuthTktCookiePlugin 140272518242488>, <ckan.lib.authenticator.UsernamePasswordAuthenticator object at 0x7f93bd98b3c8>]
ckan          | 2021-02-25 10:50:17,322 DEBUG [repoze.who] authenticator plugins matched for classification "browser": [<CkanAuthTktCookiePlugin 140272518242488>, <ckan.lib.authenticator.UsernamePasswordAuthenticator object at 0x7f93bd98b3c8>]
ckan          | 2021-02-25 10:50:17,322 DEBUG [repoze.who] userid returned from <CkanAuthTktCookiePlugin 140272518242488>: "admin"
ckan          | 2021-02-25 10:50:17,322 DEBUG [repoze.who] no userid returned from <ckan.lib.authenticator.UsernamePasswordAuthenticator object at 0x7f93bd98b3c8>: (None)
ckan          | 2021-02-25 10:50:17,323 DEBUG [repoze.who] identities authenticated: [((0, 0), <CkanAuthTktCookiePlugin 140272518242488>, <CkanAuthTktCookiePlugin 140272518242488>, {'timestamp': 1613643505, 'repoze.who.plugins.auth_tkt.userid': 'admin', 'tokens': [''], 'userdata': {}, 'repoze.who.userid': 'admin'}, 'admin')]
ckan          | Traceback (most recent call last):
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1177, in _execute_context
ckan          |     conn = self._revalidate_connection()
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 463, in _revalidate_connection
ckan          |     "Can't reconnect until invalid "
ckan          | sqlalchemy.exc.InvalidRequestError: Can't reconnect until invalid transaction is rolled back
ckan          | 
ckan          | The above exception was the direct cause of the following exception:
ckan          | 
ckan          | Traceback (most recent call last):
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/flask/app.py", line 2449, in wsgi_app
ckan          |     response = self.handle_exception(e)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/flask/app.py", line 1866, in handle_exception
ckan          |     reraise(exc_type, exc_value, tb)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
ckan          |     raise value
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
ckan          |     response = self.full_dispatch_request()
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request
ckan          |     rv = self.handle_user_exception(e)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception
ckan          |     reraise(exc_type, exc_value, tb)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
ckan          |     raise value
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/flask/app.py", line 1947, in full_dispatch_request
ckan          |     rv = self.preprocess_request()
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/flask/app.py", line 2241, in preprocess_request
ckan          |     rv = func()
ckan          |   File "/usr/lib/ckan/venv/src/ckan/ckan/config/middleware/flask_app.py", line 369, in ckan_before_request
ckan          |     app_globals.app_globals._check_uptodate()
ckan          |   File "/usr/lib/ckan/venv/src/ckan/ckan/lib/app_globals.py", line 202, in _check_uptodate
ckan          |     value = model.get_system_info('ckan.config_update')
ckan          |   File "/usr/lib/ckan/venv/src/ckan/ckan/model/system_info.py", line 47, in get_system_info
ckan          |     obj = meta.Session.query(SystemInfo).filter_by(key=key).first()
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3222, in first
ckan          |     ret = list(self[0:1])
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3012, in __getitem__
ckan          |     return list(res)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3324, in __iter__
ckan          |     return self._execute_and_instances(context)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3349, in _execute_and_instances
ckan          |     result = conn.execute(querycontext.statement, self._params)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
ckan          |     return meth(self, multiparams, params)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
ckan          |     return connection._execute_clauseelement(self, multiparams, params)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
ckan          |     distilled_params,
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
ckan          |     e, util.text_type(statement), parameters, None, None
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
ckan          |     util.raise_from_cause(sqlalchemy_exception, exc_info)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 399, in raise_from_cause
ckan          |     reraise(type(exception), exception, tb=exc_tb, cause=cause)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
ckan          |     raise value.with_traceback(tb)
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1177, in _execute_context
ckan          |     conn = self._revalidate_connection()
ckan          |   File "/usr/lib/ckan/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 463, in _revalidate_connection
ckan          |     "Can't reconnect until invalid "
ckan          | sqlalchemy.exc.StatementError: (sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back
ckan          | [SQL: SELECT system_info.id AS system_info_id, system_info.key AS system_info_key, system_info.value AS system_info_value, system_info.state AS system_info_state 
ckan          | FROM system_info 
ckan          | WHERE system_info.key = %(key_1)s 
ckan          |  LIMIT %(param_1)s]
ckan          | [parameters: [{}]]

etj commented 3 years ago

Still getting the same error after