container kills fhem and stops because of timeout in entry.sh

roobbb commented 2 months ago

Describe the bug

I migrated the image from v3 to v4.0.1. My FHEM setup takes about 2 minutes to come up. The v4 script 'entry.sh' sets a timeout of 60s with declare -ri TIMEOUT_STARTING=${TIMEOUT_STARTING:-60} (see https://github.com/fhem/fhem-docker/blob/dev/src/entry.sh). Because my setup takes longer than that, the script stops FHEM and the container before FHEM can finish starting, which causes a restart loop because of the '--restart=always' parameter. The docker log says:

...
ERROR: Fatal: No message from FHEM that server has started.
INFO: Sending SIGTERM (equivalent to "shutdown" command) to FHEM (pid 12119).
INFO: Waiting up to 30s for FHEM process (pid 12119) to terminate.
INFO: Stopping container. Bye!

When I set the timeout to 120s, FHEM comes up as usual and the container keeps running fine without any restarts or stops.
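For anyone who wants to see why an environment variable is enough to change this, here is a minimal sketch of how the quoted declaration behaves. Only the declare line is taken from entry.sh; the `snippet` variable and the `bash -c` wrapper are my own additions so it can be tried outside the container.

```shell
# Only the declare statement below is quoted from entry.sh; the wrapper is illustrative.
snippet='declare -ri TIMEOUT_STARTING=${TIMEOUT_STARTING:-60}; echo "$TIMEOUT_STARTING"'

# Variable unset: the ":-60" default applies.
( unset TIMEOUT_STARTING; bash -c "$snippet" )   # prints 60

# Variable provided by the environment: the default is skipped.
TIMEOUT_STARTING=120 bash -c "$snippet"          # prints 120
```

Because the variable is declared read-only (`-r`), it has to be set before the script starts, which is what passing it into the container's environment does.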

To Reproduce

Steps to reproduce the behavior:

1. Have a large setup, or some slow-loading modules, so that startup takes longer than 60s in total.
2. Start the container.
3. Wait for the timeout event.
4. See the error with docker logs <container> | grep 'INFO\|ERROR'
5. Set the timeout to a higher value (e.g. 120s) and restart.

Additional context

Runs on a Raspberry Pi 3.

Note: My FHEM setup is not that large. Nevertheless, some official FHEM modules seem to take a relatively long time to load, so I have no direct influence on it. And perhaps a Raspberry Pi 3 is not one of the most powerful machines 😁 😇

Hello.

Would it be possible to set the variable TIMEOUT_STARTING to 120s or a bit higher by default?

Thank you very much and kind regards roobbb

sidey79 commented 2 months ago

TIMEOUT_STARTING can be controlled via Docker environment variables as a workaround.
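A minimal sketch of that workaround, assuming the container is started with docker run. The container name `fhem`, the image reference and the `TIMEOUT` helper variable are placeholders chosen for illustration; substitute whatever your v4 setup already uses. The command is only printed so it can be reviewed before running it.

```shell
# Assumptions: image reference and container name are placeholders, not taken from this thread.
IMAGE="${IMAGE:-ghcr.io/fhem/fhem-docker}"
TIMEOUT="${TIMEOUT:-120}"   # seconds entry.sh should wait for FHEM to report it has started

# -e passes the variable into the container, where entry.sh reads it instead of the default 60.
cmd="docker run -d --name fhem --restart=always -e TIMEOUT_STARTING=$TIMEOUT $IMAGE"

# Print the command for review; run it yourself once the image and name are right.
echo "$cmd"
```

Existing volume and port options from your current setup would need to be added to the same command.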

I tested the image on a Raspberry Pi 2 and it takes only a few seconds.

Do you have any idea which module takes so much time?

roobbb commented 2 months ago

Hi there.

Well, things like uba (60_uba.pm). This one alone eats 29s.

...
2024.06.10 21:04:47 3: From the FHEM_GLOBALATTR environment: attr global logfile log/fhem-%Y-%m-%d.log
2024.06.10 21:04:48 3: ESPEasy espBridge: Bridge v2.18 port [TCP:IPV4:8383] opened.
2024.06.10 21:05:17 1: FHEM::Meta::InitMod: ERROR: $@:
60_uba.pm: Error while parsing META.json: , or } expected while parsing object/hash, at character offset 629 (before ".2, \n        "Meta"...") at FHEM/Meta.pm line 1516.
...

I guess it won't be fixed (https://forum.fhem.de/index.php?msg=1242375).

Furthermore, I have a Telegram bot running, which eats 7s:

...
2024.06.10 21:04:19 1: TempSensor01_Hobbyraum: no I/O device
2024.06.10 21:04:26 3: TelegramBot_Define HomeBot: called 
2024.06.10 21:04:26 1: PERL WARNING: Use of uninitialized value $a[6] in lc at ./FHEM/10_IT.pm line 894, <$fh> line 478.
...

The rest adds up to about a minute, but there are only 1 or 2 seconds between each step.
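The gaps quoted above come from comparing consecutive timestamps in the FHEM log. A rough sketch of doing that automatically, assuming every log line starts with "YYYY.MM.DD HH:MM:SS" as in the excerpts; `log_gaps` is a hypothetical helper name, not part of FHEM or the image.

```shell
# log_gaps [MIN_SECONDS]: read an FHEM log on stdin and print each line that
# appeared at least MIN_SECONDS (default 5) after the previous timestamped line.
log_gaps() {
  awk -v min="${1:-5}" '
    /^[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
      split($2, t, ":")
      now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
      if (seen) {
        gap = now - prev
        if (gap < 0) gap += 86400            # log crossed midnight
        if (gap >= min) printf "%ds before: %s\n", gap, $0
      }
      prev = now
      seen = 1
    }'
}
```

Usage would be something like log_gaps 5 < log/fhem-2024-06-10.log. A reported gap only says that something was slow between two messages; the module responsible is usually the one named in the line after the gap, but that is a guess, not a measurement.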

Thank you for the hint about using the environment variable as a workaround. I didn't think of it while I was looking for the cause of my problem. At first I thought the health check could be causing the stops, but it wasn't 😅

Yes, I also had a bare FHEM for my testing, which likewise started in a few seconds. It is just that my real system has a lot more entities defined and behaves differently.

Thank you for your support and best regards roobbb

sidey79 commented 2 months ago

I added a description to the README. I think it is a very rare case that FHEM takes that long to start.

fhem/fhem-docker issue #243