Closed RenWal closed 11 months ago
I will have a look into this. A bit more RAM is needed, because the compressed file alone is already 43 MB. But this much is excessive.
I have finished my investigation. Unfortunately this is correct behaviour and I will update the docs to warn the users.
This is what is done in the background each time new weather data is requested (and not only during setup):
Afaik zip files can be processed stream-oriented (with low memory footprint).
The problem is, I have to parse the entire XML file and I am not sure if this will work with a stream.
Tinkered around a bit. This should give you a hint (the i.clear() part does the trick):
from zipfile import ZipFile
from xml.etree.ElementTree import iterparse

with ZipFile('MOSMIX_S_LATEST_240.kmz') as kmz:
    with kmz.open('MOSMIX_S_2023100516_240.kml') as kml:
        for _, i in iterparse(kml):
            print(i)
            i.clear()  # drop the element's children as soon as it is processed
The python process consumes around 10 MB of memory on my MacBook.
Thanks for the hint. Unfortunately I am using lxml and have not been able to get this working with it. I created a branch called stream if you would like to assist here.
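For reference, lxml's iterparse can be driven the same way; the extra step is pruning already-processed siblings, because clearing an element alone still leaves a reference to it in its parent. A minimal, self-contained sketch — the in-memory archive and the demo.kml/a names are stand-ins for the real MOSMIX files:

```python
import io
from zipfile import ZipFile
from lxml import etree

# Build a tiny in-memory archive standing in for the real kmz file.
buf = io.BytesIO()
with ZipFile(buf, 'w') as z:
    z.writestr('demo.kml', '<kml><a>1</a><a>2</a><a>3</a></kml>')
buf.seek(0)

seen = []
with ZipFile(buf) as kmz, kmz.open('demo.kml') as kml:
    for _, elem in etree.iterparse(kml, tag='a'):
        seen.append(elem.text)   # ...process the element here...
        elem.clear()             # free the element's own contents
        # Drop the references the parent still keeps, or memory grows anyway:
        while elem.getprevious() is not None:
            del elem.getparent()[0]
```

The getprevious()/getparent() pruning loop is what keeps lxml's memory flat on large files; with elem.clear() alone the parent accumulates empty stubs.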
I would like to, but I am really very busy and rusty when it comes to developing – still proficient in the Python language, but I fell up the corporate ladder away from development many years ago. It's a real problem for many HA users to have such a disk-space- and memory-hungry add-on in a typical HA installation (especially if that's unnecessary), but I don't have serious disk space, memory or CPU limitations in mine.
I haven't set up a serious Python development project (with a working dev environment directly on the git checkout, a build chain that runs the tests and builds the package, installation into a clean venv, a test runner, several interpreter versions, …) in years. I just forked your code and couldn't get a project up and running with either Eclipse/PyDev or VS Code; it started with broken test files, I have forgotten how to run the tests with pip, and so on. I'd really like to help, and if (sadly not when) I find the time and manage to get something running, I might.
The outlined strategy should work, though. It looks to me as if lxml is mostly a drop-in replacement for xml (why did you use lxml in the first place?). But you will have to change the parsing strategy to an iterative one and drop all stations from the ElementTree on the fly. My little program above takes about 2 s on my M1 MacBook.
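The drop-on-the-fly idea can be sketched with the stdlib alone; the synthetic station elements here are just a stand-in for the real MOSMIX placemarks:

```python
import io
from xml.etree.ElementTree import iterparse

# Synthetic document standing in for a large MOSMIX file.
doc = io.BytesIO(b'<root>' + b'<station>x</station>' * 1000 + b'</root>')

count = 0
for _, elem in iterparse(doc):  # default: 'end' events only
    if elem.tag == 'station':
        count += 1     # ...process the station here...
        elem.clear()   # discard children and attributes right away
```

Note that elem.clear() empties the element but leaves a small stub attached to the root; for large files that overhead is negligible compared to keeping the full subtrees around.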
I'll add this to my list, as I am also a bit busy at the moment.
I used lxml for the implementation of XPath.
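Worth noting: the stdlib ElementTree also supports a limited XPath subset (child paths, // descendant search, simple [@attr='value'] predicates) via find()/findall(); only full XPath 1.0 needs lxml. A toy illustration with made-up element names:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<r><s id="A"><t>5</t></s><s id="B"><t>7</t></s></r>')
# Limited XPath subset: descendant search plus an attribute predicate.
vals = [e.text for e in root.findall(".//s[@id='B']/t")]
```

Whether that subset covers the expressions the integration needs would decide if lxml can be dropped entirely.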
Thanks for helping with this. Python isn't my main language, so there was some work to do ;) Best regards!
The new version v2.0.4 should resolve this.
Happy to help. Python used to be my main language. Now that I'm into it, I might help you tidy up the plain Python library (there is still quite a bit of repetitive, hard-to-understand, entangled stuff left) when I find some time. For the Home Assistant integration I might need a heads-up on how to properly test it.
Version of home_assistant
2023.9.3 (Operating System 10.5, Supervisor 2023.09.2)
Version of the custom_component
v2.0.1 (via HACS)
Describe the bug
The bug is triggered by enabling "Force hourly data update" in the UI when configuring the integration. Without that option, the integration works fine. First, the message "Unknown error occurred" appears in the UI. Memory consumption (measured by resident set size) slowly starts rising. About 30 seconds later, Home Assistant Core quickly consumes all available memory (from about 800 MB free to zero in under a second) and is then forcibly terminated by the kernel. After Supervisor restarts Core, the integration config is gone. There is no error in the logs.
I don't think that this integration should require more than 800 MB of RAM for setup (about 3x the memory footprint of my normal HA Core), so I suspect that there is a major memory leak somewhere.
Debug log