Closed tlvu closed 3 years ago
Test with Magpie Auth targeting THREDDS: https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/master/notebooks-auth/test_thredds.ipynb
Stress test to loop requests: https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/master/notebooks/stress-tests.ipynb
The Magpie authentication/session creation step from test_thredds.ipynb
would need to be added to the stress test so that its requests
use the same resolution and the same resources (THREDDS dirs/files).
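A minimal sketch of what that could look like, assuming the stress loop simply reuses one session object for every request. The helper name `stress_requests` and its parameters are hypothetical (not taken from either notebook); the session only needs a `get(url, timeout=...)` method that returns an object with a `status_code`, which is what a `requests.Session` provides.

```python
from collections import Counter


def stress_requests(session, url, runs=100, timeout=30):
    """Send `runs` GET requests through one shared session and tally status codes.

    `session` is expected to be already authenticated, for example a
    requests.Session whose `auth` attribute was set the same way as in
    test_thredds.ipynb. The helper itself is hypothetical.
    """
    statuses = Counter()
    for _ in range(runs):
        # Reusing the same session keeps the same auth for every call.
        resp = session.get(url, timeout=timeout)
        statuses[resp.status_code] += 1
    return statuses


# Possible usage (assumes `requests`, `MagpieAuth`, PAVICS_HOST, AUTH_USR,
# AUTH_PWD and SECURED_URL are defined as in the notebooks):
#
# with requests.Session() as session:
#     session.auth = MagpieAuth(f"https://{PAVICS_HOST}/magpie", AUTH_USR, AUTH_PWD)
#     print(stress_requests(session, SECURED_URL, runs=100))
```

Tallying by status code rather than stopping at the first failure would make intermittent 408 or 500 responses show up as counts at the end of the run.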
Another stack trace, obtained after reverting to Waitress with the newer Magpie and debug enabled (https://github.com/bird-house/birdhouse-deploy/pull/197):
Traceback (most recent call last):
  File "/opt/birdhouse/src/twitcher/twitcher/tweens.py", line 27, in ows_security_tween
    security.check_request(request)
  File "/opt/local/src/magpie/magpie/adapter/magpieowssecurity.py", line 171, in check_request
    has_permission = authz_policy.permits(service_impl, principals, permission_requested)
  File "/usr/local/lib/python3.7/site-packages/pyramid/authorization.py", line 74, in permits
    acl = location.__acl__
  File "/opt/local/src/magpie/magpie/services.py", line 183, in __acl__
    acl = self._get_acl_cached(*cache_keys)
  File "/usr/local/lib/python3.7/site-packages/beaker/cache.py", line 601, in cached
    return cache[0].get_value(cache_key, createfunc=go)
  File "/usr/local/lib/python3.7/site-packages/beaker/cache.py", line 322, in get
    return self._get_value(key, **kw).get_value()
  File "/usr/local/lib/python3.7/site-packages/beaker/container.py", line 380, in get_value
    v = self.createfunc()
  File "/usr/local/lib/python3.7/site-packages/beaker/cache.py", line 597, in go
    return func(*args, **kwargs)
  File "/opt/local/src/magpie/magpie/services.py", line 220, in _get_acl_cached
    user = self.user_requested()
  File "/opt/local/src/magpie/magpie/services.py", line 157, in user_requested
    user = UserService.by_user_name(anonymous, db_session=self.request.db)
  File "/usr/local/lib/python3.7/site-packages/ziggurat_foundations/models/services/user.py", line 330, in by_user_name
    return query.first()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3429, in first
    ret = list(self[0:1])
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3203, in __getitem__
    return list(res)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3557, in _execute_and_instances
    querycontext, self._connection_from_session, close_with_result=True
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3572, in _get_bind_args
    mapper=self._bind_mapper(), clause=querycontext.statement, **kw
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3550, in _connection_from_session
    conn = self.session.connection(**kw)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1145, in connection
    execution_options=execution_options,
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1151, in _connection_for_bind
    engine, execution_options
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 458, in _connection_for_bind
    self.session.dispatch.after_begin(self.session, self, conn)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
    fn(*args, **kw)
  File "/usr/local/lib/python3.7/site-packages/zope/sqlalchemy/datamanager.py", line 269, in after_begin
    session, self.initial_state, self.transaction_manager, self.keep_session
  File "/usr/local/lib/python3.7/site-packages/zope/sqlalchemy/datamanager.py", line 234, in join_transaction
    session, initial_state, transaction_manager, keep_session=keep_session
  File "/usr/local/lib/python3.7/site-packages/zope/sqlalchemy/datamanager.py", line 89, in __init__
    transaction_manager.get().join(self)
  File "/usr/local/lib/python3.7/site-packages/transaction/_manager.py", line 90, in get
    raise NoTransaction()
transaction.interfaces.NoTransaction
Full logs: twitcher-cache-error-with-waitress.txt
The same stack trace as above also occurs with gunicorn, and it matches this notebook error:
12:55:07 _____ pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb::Cell 4 _____
12:55:07 Notebook cell execution failed
12:55:07 Cell 4: Cell execution caused an exception
12:55:07
12:55:07 Input:
12:55:07 # NBVAL_IGNORE_OUTPUT
12:55:07
12:55:07 AUTH_USR = os.getenv("TEST_MAGPIE_AUTHTEST_USERNAME", "authtest")
12:55:07 AUTH_PWD = os.getenv("TEST_MAGPIE_AUTHTEST_PASSWORD", "authtest1234")
12:55:07
12:55:07 # Open session
12:55:07 with requests.Session() as session:
12:55:07 session.auth = MagpieAuth(f"https://{PAVICS_HOST}/magpie", AUTH_USR, AUTH_PWD)
12:55:07 # Open a PyDAP data store and pass it to xarray
12:55:07 store = xr.backends.PydapDataStore.open(SECURED_URL, session=session)
12:55:07 ds = xr.open_dataset(store, decode_cf=False) # Attributes are problematic with this file.
12:55:07 ds
12:55:07
12:55:07 Traceback:
12:55:07
12:55:07 ---------------------------------------------------------------------------
12:55:07 HTTPError Traceback (most recent call last)
12:55:07 /tmp/ipykernel_429/1184836852.py in <module>
12:55:07 8 session.auth = MagpieAuth(f"https://{PAVICS_HOST}/magpie", AUTH_USR, AUTH_PWD)
12:55:07 9 # Open a PyDAP data store and pass it to xarray
12:55:07 ---> 10 store = xr.backends.PydapDataStore.open(SECURED_URL, session=session)
12:55:07 11 ds = xr.open_dataset(store, decode_cf=False) # Attributes are problematic with this file.
12:55:07 12 ds
12:55:07
12:55:07 /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/backends/pydap_.py in open(cls, url, session)
12:55:07 91 def open(cls, url, session=None):
12:55:07 92
12:55:07 ---> 93 ds = pydap.client.open_url(url, session=session)
12:55:07 94 return cls(ds)
12:55:07 95
12:55:07
12:55:07 /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/client.py in open_url(url, application, session, output_grid, timeout)
12:55:07 65 """
12:55:07 66 dataset = DAPHandler(url, application, session, output_grid,
12:55:07 ---> 67 timeout).dataset
12:55:07 68
12:55:07 69 # attach server-side functions
12:55:07
12:55:07 /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/handlers/dap.py in __init__(self, url, application, session, output_grid, timeout)
12:55:07 52 ddsurl = urlunsplit((scheme, netloc, path + '.dds', query, fragment))
12:55:07 53 r = GET(ddsurl, application, session, timeout=timeout)
12:55:07 ---> 54 raise_for_status(r)
12:55:07 55 if not r.charset:
12:55:07 56 r.charset = 'ascii'
12:55:07
12:55:07 /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/net.py in raise_for_status(response)
12:55:07 37 detail=response.status+'\n'+response.text,
12:55:07 38 headers=response.headers,
12:55:07 ---> 39 comment=response.body
12:55:07 40 )
12:55:07 41
12:55:07
12:55:07 HTTPError: 500 Internal Server Error
12:55:07 <?xml version="1.0" encoding="utf-8"?>
12:55:07 <ExceptionReport version="1.0.0"
12:55:07 xmlns="http://www.opengis.net/ows/1.1"
12:55:07 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
12:55:07 xsi:schemaLocation="http://www.opengis.net/ows/1.1 http://schemas.opengis.net/ows/1.1.0/owsExceptionReport.xsd">
12:55:07 <Exception exceptionCode="NoApplicableCode" locator="NoApplicableCode">
12:55:07 <ExceptionText>Unknown Error</ExceptionText>
12:55:07 </Exception>
12:55:07 </ExceptionReport>
In addition, this run has a new batch of 408 errors in stress-test.ipynb, and I even got them on CRIM's Jenkins: https://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/master/606/console
Full log: twitcher-cache-error-with-gunicorn.txt.gz
@fmigneault I hope you have all the debug info you need in the 2 logs I posted. We still have problems with both Waitress and gunicorn.
Maybe this only happens in a full PAVICS env. You should try to reproduce it on your side in a full PAVICS env.
Describe the bug
Many random, hard-to-reproduce errors occur with several services behind Twitcher/Magpie when caching is turned on for Twitcher.
With caching not working, this issue https://github.com/bird-house/twitcher/issues/97 might have to be reopened?
To Reproduce
Steps to reproduce the behavior:
===Problem 1===
Matching twitcher server-side logs:
===Problem 2===
This access denied error should never happen, and there are no logs on the Twitcher server side.
===Problem 3===
408 errors in stress-test.ipynb, reproduced on CRIM's Jenkins as well: https://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/master/570/console (all errors in stress-test.ipynb were gone for 3 weeks when caching was turned off, see https://github.com/Ouranosinc/Magpie/issues/433).
===Problem 4===
"Remote end closed connection without response" without any errors in Twitcher logs: