Closed by M-Welsch 7 months ago
Couldn't reproduce in 28 runs with the following code:
async def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--no-shutdown", default=False, required=False)
    parser.add_argument("--config", default="config.yaml", type=str, required=False)
    args = parser.parse_args()
    cfg = load_config(Path(args.config))
    LOG.info(f"loading config file {args.config}")
    await init(cfg["logger"])
    # await engage()
    # await backup(cfg["backup"])
    # await disengage()
    # await wait_before_shutdown(cfg)
    await set_wakeup_time(timedelta(minutes=1))
    if not args.no_shutdown:
        await shutdown()
Happened again on 24.1.24:
Jan 24 03:41:15 basehw4sn2 sudo[880]: base : PWD=/home/base/backup-server/software/bcu ; USER=root ; COMMAND=/sbin/shutdown -h now
Jan 24 03:41:15 basehw4sn2 sudo[880]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1001)
Jan 24 03:41:15 basehw4sn2 sudo[880]: pam_unix(sudo:session): session closed for user root
-- Boot f0ce1e227d6d4d019ab2e95155c4143f --
Jan 24 03:41:21 basehw4sn2 python3[439]: Traceback (most recent call last):
Jan 24 03:41:21 basehw4sn2 python3[439]: File "/home/base/backup-server/software/bcu/venv/lib/python3.11/site-packages/serial/serialposix.py", line 322, in open
Jan 24 03:41:22 basehw4sn2 python3[439]: self.fd = os.open(self.portstr, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
Jan 24 03:41:22 basehw4sn2 python3[439]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jan 24 03:41:22 basehw4sn2 python3[439]: FileNotFoundError: [Errno 2] No such file or directory: '/dev/ttyBASEPCU'
Jan 24 03:41:22 basehw4sn2 python3[439]: During handling of the above exception, another exception occurred:
Jan 24 03:41:22 basehw4sn2 python3[439]: Traceback (most recent call last):
Jan 24 03:41:22 basehw4sn2 python3[439]: File "<frozen runpy>", line 198, in _run_module_as_main
Jan 24 03:41:22 basehw4sn2 python3[439]: File "<frozen runpy>", line 88, in _run_code
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/home/base/backup-server/software/bcu/__main__.py", line 181, in <module>
Jan 24 03:41:22 basehw4sn2 python3[439]: asyncio.run(main())
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
Jan 24 03:41:22 basehw4sn2 python3[439]: return runner.run(main)
Jan 24 03:41:22 basehw4sn2 python3[439]: ^^^^^^^^^^^^^^^^
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
Jan 24 03:41:22 basehw4sn2 python3[439]: return self._loop.run_until_complete(task)
Jan 24 03:41:22 basehw4sn2 python3[439]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
Jan 24 03:41:22 basehw4sn2 python3[439]: return future.result()
Jan 24 03:41:22 basehw4sn2 python3[439]: ^^^^^^^^^^^^^^^
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/home/base/backup-server/software/bcu/__main__.py", line 170, in main
Jan 24 03:41:22 basehw4sn2 python3[439]: await init(cfg["logger"])
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/home/base/backup-server/software/bcu/__main__.py", line 44, in init
Jan 24 03:41:22 basehw4sn2 python3[439]: await pcu.handshake()
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/home/base/backup-server/software/bcu/pcu.py", line 212, in handshake
Jan 24 03:41:22 basehw4sn2 python3[439]: while not (response := await _probe()) == 'Echo':
Jan 24 03:41:22 basehw4sn2 python3[439]: ^^^^^^^^^^^^^^
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/home/base/backup-server/software/bcu/pcu.py", line 205, in _probe
Jan 24 03:41:22 basehw4sn2 python3[439]: return await call_pcu("probe")
Jan 24 03:41:22 basehw4sn2 python3[439]: ^^^^^^^^^^^^^^^^^^^^^^^
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/home/base/backup-server/software/bcu/pcu.py", line 193, in call_pcu
Jan 24 03:41:22 basehw4sn2 python3[439]: with Serial("/dev/ttyBASEPCU", baudrate=38400, timeout=1) as ser: # timeout is critical
Jan 24 03:41:22 basehw4sn2 python3[439]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/home/base/backup-server/software/bcu/venv/lib/python3.11/site-packages/serial/serialutil.py", line 244, in __init__
Jan 24 03:41:22 basehw4sn2 python3[439]: self.open()
Jan 24 03:41:22 basehw4sn2 python3[439]: File "/home/base/backup-server/software/bcu/venv/lib/python3.11/site-packages/serial/serialposix.py", line 325, in open
Jan 24 03:41:22 basehw4sn2 python3[439]: raise SerialException(msg.errno, "could not open port {}: {}".format(self._port, msg))
Jan 24 03:41:22 basehw4sn2 python3[439]: serial.serialutil.SerialException: [Errno 2] could not open port /dev/ttyBASEPCU: [Errno 2] No such file or directory: '/dev/ttyBASEPCU'
Added a check for the device node that logs a warning when it is missing.
Modified config.yaml (the production file, because it's easier) to use backup_testdata_source and to sleep for only 1 minute.
In 206 runs, the following line appeared 2 times:
/dev/ttyBASEPCU not found, retrying ...
Since the backup server kept working in both cases, the retry approach works.
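For reference, a minimal sketch of what such a device-node check with retry could look like. This is not the actual change made in bcu: the function name `wait_for_device` and the `retries` and `delay` parameters are hypothetical, and only the warning text is taken from the log line above.

```python
import asyncio
import logging
from pathlib import Path

LOG = logging.getLogger(__name__)


async def wait_for_device(path: Path, retries: int = 10, delay: float = 0.5) -> bool:
    """Return True as soon as `path` exists, checking at most `retries` times.

    Hypothetical helper: name, retry count and delay are assumptions.
    """
    for _ in range(retries):
        if path.exists():
            return True
        # Same wording as the warning seen in the test runs
        LOG.warning("%s not found, retrying ...", path)
        await asyncio.sleep(delay)
    return False
```

Assuming this shape, it would be awaited before opening the port, e.g. `await wait_for_device(Path("/dev/ttyBASEPCU"))` ahead of the `Serial(...)` call in `call_pcu`, so that a device node that shows up slightly late after boot no longer raises `SerialException`.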
Describe the bug
see below
Expected behavior
/dev/ttyBASEPCU should be available when needed.
Actual behavior
The bcu software sometimes crashes when trying to open the serial terminal on /dev/ttyBASEPCU. This happens around every 34th run.
What happens if we don't solve it (aka why is it important)
The bcu cannot engage the HDD when the PCU is not available, so it cannot do a backup.
To Reproduce
Steps to reproduce the behavior:
Run a test that just starts bcu, tries the handshake, shuts down again, sleeps for a minute, and repeats.
Additional context, Environment
during #31
Describe/define the problem
Develop Interim Containment Plan (if necessary)
Determine Root Causes and Escape Points
Pointer to the solution
Actions to prevent recurrence or solve systematic problems
Description