Closed AlexPeshkoff closed 1 year ago
check for correct session name in sessions list.
We have to use following syntax to STOP trace session:
fbtracemgr -SE service_mgr -STOP -ID <NNN>
(i.e. only using its ID rather than name). One need to use cumbersome (and fragile) manipulations to obtain session ID via known trace session name:
$ /opt/fb40tmp/bin/fbtracemgr -se localhost:service_mgr -user sysdba -pas ********* -start -name trace_5441 -config /var/tmp/trc_5441.cfg
$ /opt/fb40tmp/bin/fbtracemgr -se localhost:service_mgr -user sysdba -pas ********* -start -name trace_5470 -config /var/tmp/trc_5470.cfg
$ /opt/fb40tmp/bin/fbtracemgr -se localhost:service_mgr -user sysdba -pas masterkey -list
Session ID: 2
name: trace_5441
user: SYSDBA
date: 2022-10-26 17:13:58
flags: active, trace
Session ID: 3
name: trace_5470
user: SYSDBA
date: 2022-10-26 17:14:14
flags: active, trace
# get trace ID for name = 'trace_5470':
[ibase@glb-srv-ik-test firebird-qa]$ (/opt/fb40tmp/bin/fbtracemgr -se localhost:service_mgr -user sysdba -pas masterkey -list | grep -B1 " name: trace_5470" ) | grep "Session ID: | awk '{print $3}'"
3
This way seems to me poor because it strongly depends on FORMAT of 'fbtracemgr -list' output. Why we can not stop trace just using its name, without preliminary search of its ID ?
PS. Yes, i remember that we have discussed this privately a few years ago; but may be it is good to repeat for all involved persons.
In script - yes, need to parse (though in 1 line using shell and Unix utilities). But in python program one gets session name in service reply in separate clumplet, i.e. finding it is trivial.
The qa plugin does not use arbitrary or fixed session ID, it uses the session ID reported back by service when session is started. Well, the firebird-driver have to parse the text output to get it, but anyway. The qa plugin also checks that trace was properly stopped, as the service response to stop request is normally "Trace session ID N stopped". This check is actually the reason why this Firebird issue was discovered. So, your statement that "method to stop service via fixed session ID is not reliable" for qa suite AFAIK does not apply here.
Sorry - you are absolutely right, problem is badly written test.
I still have a question about why pytest HANGED after show final message about failed test (i.e. when bug was not yet fixed). Consider following test:
[ibase@glb-srv-ik-test bugs]$ cat -n /home/ibase/firebird-qa/tests/bugs/core_hang_test.py
1 #coding:utf-8
2
3 import time
4 import pytest
5 from firebird.qa import *
6
7 db = db_factory()
8 act = python_act('db')
9
10 trace_5441 = [
11 'log_statement_finish = true',
12 ]
13 #-------------------------------------------
14 trace_5470 = [
15 'log_statement_finish = true',
16 'include_filter = "%foobar%"',
17 ]
18 #'exclude_filter = "%table%"',
19 #-------------------------------------------
20 @pytest.mark.version('>=3.0')
21 def test_1(act: Action):
22 with act.trace(db_events = trace_5441) as ts:
23 # def trace() -> TraceSession
24 if ts.output:
25 # see core.py (firebird-driver),
26 # class ServerTraceServices(ServerServiceProvider),
27 # def start(self, *, config: str, name: str=None) -> int
28 # ...
29 # response = self._srv()._fetch_line()
30 # return int(response.split()[3])
31 session = str(ts.output[0])
32 else:
33 session = '[no output]'
34 with open('/var/tmp/trace_5441.session.tmp', 'w') as f:
35 f.write(f'trace_5441: session = {session}\n')
36 #pass
37 assert True
38
39 #----------------------------------------------
40 @pytest.mark.version('>=3.0')
41 def test_2(act: Action):
42 #time.sleep(0.5) # ----------------------------------- [ NB ]
43 with act.trace(db_events = trace_5470) as ts:
44 # def trace() -> TraceSession
45 if ts.output:
46 session = str(ts.output[0])
47 else:
48 session = '[no output]'
49 with open('/var/tmp/trace_5470.session.tmp', 'w') as f:
50 # if we take delay for at least 0.5 second then session will be 1
51 # otherwise it will be 2 and pytest hangs infinitely at the final point.
52 f.write(f'trace_5470: session = {session}\n')
53 #pass
54 assert True
Pay note at line #42: #time.sleep(0.5) # ----------------------------------- [ NB ]
If we comment it out then running this test on v4.0.3.2861, Classic, lead to following console output:
[ibase@glb-srv-ik-test firebird-qa]$ cd ~/firebird-qa ; /usr/local/bin/pytest --cache-clear --disable-warnings --basetemp=/var/tmp/fbqa -vv --tb=long --server fb401 ./tests
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.9, pytest-7.0.1, pluggy-1.0.0 -- /usr/local/bin/python3.9
cachedir: .pytest_cache
System:
encodings: sys:utf-8 locale:UTF-8 filesystem:utf-8
Firebird:
server: fb401 [v4.0.3.2861, Classic, Firebird/Linux/AMD/Intel/x64]
home: /opt/fb40
bin: /opt/fb40/bin
client library: /opt/fb40/lib/libfbclient.so
rootdir: /home/ibase/firebird-qa, configfile: pytest.ini
plugins: firebird-qa-0.17.0, timestamper-0.0.9
collected 2 items
[2022-10-28 13:01:03] tests/bugs/core_hang_test.py::test_1 PASSED [1/2]
[2022-10-28 13:01:06] tests/bugs/core_hang_test.py::test_2 FAILED [2/2]
================================================================================================ FAILURES =================================================================================================
_________________________________________________________________________________________________ test_2 __________________________________________________________________________________________________
act = <firebird.qa.plugin.Action object at 0x7efe0f047cd0>
@pytest.mark.version('>=3.0')
def test_2(act: Action):
#time.sleep(0.5) # ----------------------------------- [ NB ]
with act.trace(db_events = trace_5470) as ts:
# def trace() -> TraceSession
if ts.output:
session = str(ts.output[0])
else:
session = '[no output]'
with open('/var/tmp/trace_5470.session.tmp', 'w') as f:
# if we take delay for at least 0.5 second then session will be 1
# otherwise it will be 2 and pytest hangs infinitely at the final point.
> f.write(f'trace_5470: session = {session}\n')
tests/bugs/core_hang_test.py:52:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <firebird.qa.plugin.TraceSession object at 0x7efe0f047940>, exc_type = None, exc_value = None, traceback = None
def __exit__(self, exc_type, exc_value, traceback) -> None:
time.sleep(2)
session = self.output.pop(0)
with self.act.connect_server() as srv:
> srv.trace.stop(session_id=session)
firebird/qa/plugin.py:1155:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <firebird.driver.core.ServerTraceServices object at 0x7efe1405fa00>
def stop(self, *, session_id: int) -> str:
"""Stop trace session.
Arguments:
session_id: Trace session ID.
Returns:
Text message 'Trace session ID <x> stopped'.
"""
#print('Intro trace stop(), session_id=',session_id)
> return self.__action(ServerAction.TRACE_STOP, 'stopped', session_id)
../.local/lib/python3.9/site-packages/firebird/driver/core.py:5205:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <firebird.driver.core.ServerTraceServices object at 0x7efe1405fa00>, action = <ServerAction.TRACE_STOP: 23>, label = 'stopped', session_id = 2
def __action(self, action: ServerAction, label: str, session_id: int) -> str:
self._srv()._reset_output()
with a.get_api().util.get_xpb_builder(XpbKind.SPB_START) as spb:
spb.insert_tag(action)
spb.insert_int(SrvTraceOption.ID, session_id)
self._srv()._svc.start(spb.get_buffer())
response = self._srv()._fetch_line()
# 14.10.2022, temporary(?)
#if action == ServerAction.TRACE_STOP:
# print('@@@ trace stopping: ', action.name, response)
#else:
# print('@@@ other trace action: ', action.name)
if not response.startswith(f"Trace session ID {session_id} {label}"): # pragma: no cover
# response should contain the error message
#print('@@@ bad response for action ', action.name, ' : ', response)
> raise DatabaseError(response)
E firebird.driver.types.DatabaseError: Trace session ID 2 not found
../.local/lib/python3.9/site-packages/firebird/driver/core.py:5168: DatabaseError
------------------------------------------------------------------------------------------ Captured stdout setup ------------------------------------------------------------------------------------------
Creating db: localhost:/var/tmp/fbqa/test_20/test.fdb [page_size=None, sql_dialect=None, charset='NONE', user=SYSDBA, password=masterkey]
========================================================================================= short test summary info =========================================================================================
FAILED tests/bugs/core_hang_test.py::test_2 - firebird.driver.types.DatabaseError: Trace session ID 2 not found
================================================================================= 1 failed, 1 passed, 1 warning in 5.86s ==================================================================================
And pytest will infinitely HANG, w/o returning control to OS until we press Ctrl-C. But why it happens ?
Pressing Ctrl-C leads to following additional output:
Traceback (most recent call last):
File "/home/ibase/firebird-qa/firebird/qa/plugin.py", line 1088, in trace_thread
for line in srv:
File "/home/ibase/.local/lib/python3.9/site-packages/firebird/driver/core.py", line 5301, in __next__
if (line := self.readline()) is not None:
File "/home/ibase/.local/lib/python3.9/site-packages/firebird/driver/core.py", line 5404, in readline
self._read_output()
File "/home/ibase/.local/lib/python3.9/site-packages/firebird/driver/core.py", line 5350, in _read_output
data = self._query_output(timeout)
File "/home/ibase/.local/lib/python3.9/site-packages/firebird/driver/core.py", line 5344, in _query_output
self._svc.query(self._make_request(timeout), bytes([self.mode]), self.response.raw)
File "/home/ibase/.local/lib/python3.9/site-packages/firebird/driver/interfaces.py", line 1039, in query
self._check()
File "/home/ibase/.local/lib/python3.9/site-packages/firebird/driver/interfaces.py", line 105, in _check
raise self.__report(DatabaseError, self.status.get_errors())
firebird.driver.types.DatabaseError: Error reading data from the connection.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/threading.py", line 973, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.9/threading.py", line 910, in run
self._target(*self._args, **self._kwargs)
File "/home/ibase/firebird-qa/firebird/qa/plugin.py", line 1090, in trace_thread
output.append(line)
File "/home/ibase/.local/lib/python3.9/site-packages/firebird/driver/core.py", line 5295, in __exit__
self.close()
File "/home/ibase/.local/lib/python3.9/site-packages/firebird/driver/core.py", line 5447, in close
self._svc.detach()
File "/home/ibase/.local/lib/python3.9/site-packages/firebird/driver/interfaces.py", line 1031, in detach
self._check()
File "/home/ibase/.local/lib/python3.9/site-packages/firebird/driver/interfaces.py", line 105, in _check
raise self.__report(DatabaseError, self.status.get_errors())
firebird.driver.types.DatabaseError: connection shutdown
Exception ignored in: <module 'threading' from '/usr/local/lib/python3.9/threading.py'>
Traceback (most recent call last):
File "/usr/local/lib/python3.9/threading.py", line 1470, in _shutdown
lock.acquire()
KeyboardInterrupt:
And only after that control will returned to OS. Why Ctrl-C was needed to "finalize" execution ?
PS. firebird.conf:
RemoteServicePort = 34099
ServerMode = Classic
Hi Pavel,
the explanation why pytest hangs is actually quite simple. The trace session runs in separate thread. If no error in service query happens, it ends when trace session is stopped. Because due to Firebird bug the session did not stop properly, the thread is not ended and Python "hangs" on exit (waiting for thread to finish).
I did not (deliberately) added a measure to qa plugin to stop trace thread explicitly (a signal/event to break the service query loop) as it's it complicates things significantly and it's pointless unless there is a BUG in Firebird.
best regards Pavel
On 10/29/22 11:54, Pavel Císař wrote:
Hi Pavel,
the explanation why pytest hangs is actually quite simple. The trace session runs in separate thread. If no error in service query happens, it ends when trace session is stopped. Because due to Firebird bug the session did not stop properly, the thread is not ended and Python "hangs" on exit (waiting for thread to finish).
I did not (deliberately) added a measure to qa plugin to stop trace thread explicitly (a signal/event to break the service query loop) as it's it complicates things significantly and it's pointless unless there is a BUG in Firebird.
On my mind test suite should be ready to do it's best when there is a bug - at least it's designed to detect that bugs ;) I do not insist on changes but do not think that adding:
Pavel,
Dne 31. 10. 22 v 14:12 Alexander Peshkov napsal(a):
On my mind test suite should be ready to do it's best when there is a bug - at least it's designed to detect that bugs ;)
Well, it does, most times ;-)
I do not insist on changes but do not think that adding:
- break on error return from service stop call,
- timeouts when working with services and break when exceeded, significantly complicates the code.
Well, the main problem is that query to get next chunck (to EOF or single line) of trace data BLOCKs (i.e. the trace thread is actually waiting for service query call to finish, so it couldn't do anything else). The only way to get around it is using timeout for service query call, but current implementation of firebird-driver does not support service query with timeout, as I was not able to find a reliable way to support this that would fit the driver API (I'd like avoid backward incompatible changes), and that will also fit the architecture (support for Python iterator protocol etc.).
I would definitely like to add timeouts to firebird-driver sometime in future (as nice to have, at least for trace), but it's definitely not something that I could hack in single day. Also, as nobody so far (over all those years that I develop Python Firebird drivers) asked for service query timeout support (except lately for trace), it was never on top of my priority list.
regards Pavel
Pavel, I've used to fix two bugs in firebird (https://github.com/FirebirdSQL/firebird/issues/7359 & https://github.com/FirebirdSQL/firebird/issues/7361) related with subj. Currently I can say for sure - used in trace-related tests method to stop service via fixed session ID is not reliable. Nobody can guarantee that after session with ID == N will take place N+1, it may reset to 1, and this is normal behavior, specially for CS.
From what Pavel Zotov says mention method is built into test suite. Can you confirm this? If yes, a way to make behavior stable should be found - for example check for correct session name in sessions list.