Closed ArjenR49 closed 7 months ago
Sorry for the formatting. GitHub did that, and I forgot/couldn't find the way to prevent it...
I'll upload the file to my GitHub repository: https://github.com/ArjenR49/UPS-Plus
Friend, select all the code text you want and click the <> button (Ctrl + E).
In this script I read the value back after it was written. If the result is different from what was supposed to be written, it is written again until OK. Like driving a nail with a hammer ;-) Time.sleep here and there doesn't do much.
Please, if you can, run this and see for yourself. Every once in a while the write operation fails and the previous value remains, which is when you have to write it again. I can only guess what causes this. One suggestion would be timing errors on the SMBus. Is that caused by the hardware? I don't know, but there seems to be a way around it: write until i2c gets it right ;-)
Have a script ready to run after you stop this test script. If you don't restore 0x18 to a sensible value (=0), you may be in for a surprise after a few minutes ...
`#!/usr/bin/python3
# -*- coding: utf-8 -*-
# ar - 18-05-2021
# import os
import time
# import smbus2
from smbus2 import SMBus

# Define I2C bus
DEVICE_BUS = 1
# Define device I2C slave address.
DEVICE_ADDR = 0x17


def settle():
    time.sleep(0)


# Write byte to specified I2C register address
def putByte(RA, wbyte):
    while True:
        try:
            with SMBus(DEVICE_BUS) as pbus:
                pbus.write_byte_data(DEVICE_ADDR, RA, wbyte)
            with SMBus(DEVICE_BUS) as gbus:
                rbyte = gbus.read_byte_data(DEVICE_ADDR, RA)
            if (wbyte) <= rbyte <= (wbyte):
                print("OK ", wbyte, rbyte)
                break
            else:
                # if rbyte < max((wbyte - 2),0):
                raise ValueError
        except ValueError:
            print("Write:", wbyte, "!= Read:", rbyte, " Trying again")


while True:
    putByte(0x18, 180)
    # settle()
    putByte(0x18, 0)
    time.sleep(1)
# EOF`
A failing write on a timer register makes for unwelcome surprises, so I now have all writes to SMBus in my scripts changed into 'write, read back, compare, and if needed write again until it sticks'. I use a version of the above putByte() function. This way I can be sure that registers always get the value they are supposed to have. I had really persistent problems caused by failing writes to timer registers.
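A bounded variant of the "write, read back, compare, write again" idea might look like the sketch below. It is only an illustration: `write_verified` and `FlakyBus` are hypothetical names, the retry limit and delay are arbitrary, and `FlakyBus` merely imitates the two smbus2 calls used in the script above so the example runs without a UPS attached.

```python
import time


def write_verified(bus, addr, reg, value, retries=5, delay=0.01):
    """Write value to reg, read it back, and retry until both agree.

    Returns the number of attempts used. Raises RuntimeError if the
    register still disagrees after `retries` attempts.
    """
    for attempt in range(1, retries + 1):
        try:
            bus.write_byte_data(addr, reg, value)
            if bus.read_byte_data(addr, reg) == value:
                return attempt
        except OSError:
            # A failed bus transaction counts as a failed attempt.
            pass
        time.sleep(delay)
    raise RuntimeError(f"register 0x{reg:02X} did not accept {value}")


class FlakyBus:
    """Hypothetical stand-in for SMBus that silently drops the first writes."""

    def __init__(self, drop_first=2):
        self.regs = {}
        self.drop = drop_first

    def write_byte_data(self, addr, reg, value):
        if self.drop > 0:
            self.drop -= 1  # simulate a write that does not stick
            return
        self.regs[(addr, reg)] = value

    def read_byte_data(self, addr, reg):
        return self.regs.get((addr, reg), 0)


bus = FlakyBus(drop_first=2)
attempts = write_verified(bus, 0x17, 0x18, 180, delay=0)
print("attempts needed:", attempts)
```

Unlike an endless `while True` loop, this gives up after a fixed number of attempts, so a register that never accepts the value shows up as an error instead of a hung script.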
This problem is actually caused by the two cron tasks that also read and write the registers. If necessary, the best fix is to pause the cron tasks, check whether a cron task is currently running, or modify the cron task scripts to use a lock.
I don't have two cron tasks related to UPS Plus running at the same time. Which tasks do you mean? I have only upsPlus.py every minute with a file lock (flock ....) and when I want to run another program, like a test script for instance, I stop upsPlus.py. On my PowerCycle.py I have the same lock, so that will wait and not start when upsPlus is executing. I don't run upsPlus_iot.py on a schedule at all, since it sits and waits for up to a minute.
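For scripts that are not started through the flock command, the same kind of lock can be taken from inside Python. The following is a minimal sketch, assuming Linux (the `fcntl` module) and a made-up lock file path. `ups_lock` and `LOCK_PATH` are hypothetical names, not part of upsPlus.py or PowerCycle.py.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def ups_lock(path, blocking=True):
    """Hold an exclusive flock on `path` for the duration of the with-block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
        fcntl.flock(fd, flags)  # raises BlockingIOError if LOCK_NB and busy
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


# Hypothetical lock file; every script touching the UPS must use the same one.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "ups_plus_example.lock")

with ups_lock(LOCK_PATH):
    # I2C reads and writes would go here.
    try:
        with ups_lock(LOCK_PATH, blocking=False):
            second_holder = True
    except BlockingIOError:
        second_holder = False

print("second holder got the lock:", second_holder)
```

With `blocking=True` a second script simply waits until the first one is done, which matches the "wait and not start" behaviour described above. The non-blocking form is shown only to demonstrate that the lock is really held.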
My tests don't last so long that the batteries would run out, and also the charger is connected all of the time, so I don't need to run upsPlus.py at all, since I am in control.
I can read the register contents with my UPS_report.py and see at least what the values of the countdown timers are at any moment, so that I don't experience a sudden shutdown three minutes after I stop the test script, whose last write may have been 180 to a countdown timer register.
But by running CatchExceptions4.py I found out that I need to read back what I attempt to write to a countdown timer and check whether the write was successful. If not, write again ... The following may also work: first write 0 to the countdown register when, after one minute, it has counted down to 120, and only then write 180 again (you suggested 120) for the watchdog function. That may work reliably, too. I never tried this on its own. I always combined it with write, read back and, if different, write again ...
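The "write 0 first, then the new value" sequence on its own could be sketched as follows. This is untested against real hardware: `rearm_countdown` and `FakeBus` are hypothetical names, and the comparison assumes the read-back happens before the timer has ticked down by a second, which may not hold on the real device.

```python
def rearm_countdown(bus, addr, reg, seconds):
    """Clear the countdown register, then load a fresh value.

    Returns True only if both writes are confirmed by a read-back.
    """
    bus.write_byte_data(addr, reg, 0)
    cleared = bus.read_byte_data(addr, reg) == 0
    bus.write_byte_data(addr, reg, seconds)
    loaded = bus.read_byte_data(addr, reg) == seconds
    return cleared and loaded


class FakeBus:
    """Hypothetical in-memory stand-in for smbus2.SMBus."""

    def __init__(self):
        self.regs = {}

    def write_byte_data(self, addr, reg, value):
        self.regs[(addr, reg)] = value

    def read_byte_data(self, addr, reg):
        return self.regs.get((addr, reg), 0)


bus = FakeBus()
bus.regs[(0x17, 0x18)] = 120  # timer has counted down from 180 to 120
ok = rearm_countdown(bus, 0x17, 0x18, 180)
print(ok, bus.regs[(0x17, 0x18)])
```

A False result would be the signal to repeat the sequence.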
When I had started to use frtz13's UPS script fanShutDownUPS.py I learned to check error messages with journalctl -f. I had regular i2c errors which are now all handled by frtz13's script which one can see in the log from journalctl. So far so good. At that time I was running MotionEye and of course had a camera attached to the Pi4B with the UPS.
For testing Home Assistant, MQTT and Docker I changed to a clean installation of Raspberry Pi OS without MotionEye. I didn't get anything out of HA and MQTT, so I removed those. The camera is still connected to the Pi, and enabled, but not working. In this situation I have not seen any i2c timeout errors in the log: neither unhandled errors, which would have crashed the script, nor handled ones, for which the log would show the part of the script they occurred in.
So, it looks like MotionEye has a hand in the frequent i2c errors that I saw earlier.
Of course this means that exception handling is essential. Unhandled exceptions crash the execution of the script wherein they occur. upsPlus.py may run again in a minute as opposed to fanShutDownUPS.py which runs in the background, but crashing of upsPlus may also not be without consequences.
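As an illustration of handling such errors instead of letting them crash the script, a read can be wrapped like this. It is a sketch under assumptions: `safe_read` and `TimeoutBus` are hypothetical names, and it assumes that a failed bus transaction surfaces as an `OSError` (of which `TimeoutError` is a subclass in Python 3).

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ups")


def safe_read(bus, addr, reg, default=None):
    """Return the register value, or `default` if the bus raises OSError."""
    try:
        return bus.read_byte_data(addr, reg)
    except OSError as exc:
        # Log the failure so it is visible in journalctl, then carry on.
        log.warning("I2C read of 0x%02X failed: %s", reg, exc)
        return default


class TimeoutBus:
    """Hypothetical bus whose every read fails like a timed-out transfer."""

    def read_byte_data(self, addr, reg):
        raise TimeoutError("simulated I2C timeout")


value = safe_read(TimeoutBus(), 0x17, 0x18, default=-1)
print("value:", value)
```

The caller then has to decide what a missing value means, for example skipping this cycle rather than acting on stale data.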
Please show me the output of this command: crontab -l
Because my UPS control suffers from persistent errors concerning the power control/timer registers no matter how I try to add exception handling, I made the short script below to get a better picture. I run it with no other UPS-related scripts running, but I still get wrong results from writing and then reading back the same register, even with time.sleep(0.5) before and after. The register I test is one that it should be possible to write to again and again according to GeeekPi.
I am at my wits' end with these errors, really. If you can, stop upsPlus.py in cron and run this script on your UPS/Pi and report your results.
This is a sample output from the script below run in Thonny:
The script: