Open bobybob69 opened 1 month ago
Duplicate of #87
@bobybob69 described a situation where the device in HA doesn't come back up in the morning, so it's not a duplicate.
I'm waiting for the logs. 😉
Hey @davidrapan and @CrazyUs3r, thanks for checking, hope you're good!
Yesterday I opened my HA dashboard to check energy production and didn't get any data (as shown in the screenshot). I tried rebooting the instance, but it didn't change anything. I didn't get the opportunity to download the log as I was on my iPhone.
Today I opened it again and I can see my inverter is still offline. I'll check around 12 to see if it's still offline, but I'm sure it will be.
I wasn't sure about the similarity with #87, but if you mention it, perhaps it is?
Here are the logs.
Is this with debug logging enabled from the previous day until the morning?
Hi @davidrapan, it should be, but since you ask, does that mean it's not the case?
Here are the pictures of my HA Energy dashboard just now.
We can see the entities are offline.
What would be the best way to help you troubleshoot?
I really have no idea what is happening... I will need a moment to think about it. 😉
Hi @bobybob69, did you, for example, try hitting the Reload button under the three-dots menu in the list of Solarman devices?
Hi @davidrapan, yes, I tried, and below are the logs.
As it's overnight, the inverters aren't producing anything. But they can't be reached either... it's strange, and that's the issue that occurred. Is the inverter expected to be unreachable when there is no production?
I'll try again tomorrow morning so the log records the changes.
Thanks for helping, mate.
Yes, microinverters turn off when there is no sunlight.
Hi @davidrapan, you good?
Below are the logs and what I'm seeing on the entities... I don't know why they all turn unavailable. Can renaming the entities cause an issue? Otherwise, as for the inverters turning off and back on but not coming back in HA... no idea why it happens.
Any thoughts?
Thanks mate ! home-assistant_solarman_2024-10-13T10-12-41.204Z.log
Did you try that reload button when it gets into this state?
Hi @davidrapan, yes, I clicked the reload button but it stayed unavailable. I pressed the button again just now and it came back online with the data. Also, I'll check tonight when they go offline whether the data comes back normally once they're back on the next day. I've activated the logs and will share them with you tomorrow.
Hey @davidrapan,
Here are the logs for 2 days of operation, and right now here's what I'm seeing: error messages everywhere.
If I click reload, it works again as expected.
home-assistant_solarman_2024-10-14T16-01-30.348Z.log
after clicking the reload button
This behavior is honestly really weird and I can't think of anything we could try to reveal what's going on... :-/
Hi @davidrapan, just to let you know, it happened again this morning... the inverter goes offline in the integration. It's strange because I was using Stephane Joubert's integration and didn't get these issues... what could cause it to happen? I've re-enabled the logs and will share them later. Yesterday they were offline due to non-production, and when they started producing again, the integration showed the error.
What info would you need to have better context to understand what could cause the issue? Are the logs alone enough?
Thanks and have a great day
Hello @githubDante, do you maybe have any idea (cause I'm out of them) of what could be wrong here?
This is really bad:
OSError: [Errno 24] No file descriptors available
It's an indication of a file descriptor (FD) leak somewhere. The question is what is causing it. It can be this integration, but it could also be something HA related (e.g. other modules).
What happens at night when these microinverters are offline? Retries until a successful connection, or something else?
Oh, I did not notice that OSError... That truly is bad.
> What happens at night when these microinverters are offline? Retries until a successful connection, or something else?
Yes. Retries.
There are quite a few users with microinverters that also go offline during the night but do not experience this issue.
Maybe they don't have so many inverters. There are at least 3 here.
@bobybob69 can you provide a log for the interval between e.g. 18:00 and 07:00, or an extended log for 24 hours or more?
> Maybe they don't have so many inverters. There are at least 3 here.
Yeah that's true though.
Isn't there any way we could easily reuse sockets?
No, they must be released. The good news is that the issue is not caused by the integration/pysolarmanV5; it must be something else in @bobybob69's installation that leaks FDs (not necessarily network related).
How do I know that the issue is elsewhere? I ran HA in a container with several fake inverters configured at the addresses of running hosts (one is connected to a web server on port 80 😄 and it's very noisy), then monitored the connections and their states.
@bobybob69 what's the output of this command:
ls /proc/`ps xalf | grep hass | grep -v grep | awk '{print $3}'`/fd | wc -l
How many of them are network connections?
lsof -i -a -np `ps xalf | grep hass | grep -v grep | awk '{print $3}'` | grep TCP
or with ss:
ss -ntp | grep `ps xalf | grep hass | grep -v grep | awk '{print $3}'`
I also ran a test with one real and three fake inverters and came to the same conclusion...
Hi @githubDante, tonight I was trying to integrate my smart meter and had to restart the HA instance. When it came back up, I went to Solarman and the inverters were offline (as there is no production).
Below is the log after reloading the Solarman instance for each device. Sorry I didn't notice your message earlier.
home-assistant_solarman_2024-11-02T21-31-55.556Z.log
Here's how the Solarman integration currently looks for my inverter and the smart meter I'm trying to integrate in #187 with @davidrapan.
@githubDante, I tried the commands from the terminal menu of Home Assistant, and below are the results (all seem to fail except the second one, where I press Enter but nothing happens...).
Anything I could help with to troubleshoot? Thanks.
Hi,
The name of the main process is not hass in your installation. Try to identify it and use it in the grep command:
ls /proc/`ps xalf | grep <process name> | grep -v grep | awk '{print $3}'`/fd | wc -l
and
lsof -i -a -np `ps xalf | grep <process name> | grep -v grep | awk '{print $3}'` | grep TCP
If you know the PID you can use it directly:
ls /proc/<PID>/fd | wc -l
and
lsof -i -a -np <PID> | grep TCP
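Once the PID is known, the raw count can be broken down by descriptor type, which helps tell a socket leak from a file or pipe leak. What follows is only a minimal sketch, assuming a Linux system with /proc mounted and permission to read the target process's fd directory (same user or root). fd_summary is a hypothetical helper name and not part of Home Assistant or of this integration.

```shell
#!/bin/sh
# Hypothetical helper: summarise the open file descriptors of one PID.
# Assumes Linux /proc; prints zero counts if /proc/<PID>/fd is not readable.
fd_summary() {
    pid="$1"
    sockets=0; pipes=0; other=0
    for fd in /proc/"$pid"/fd/*; do
        # readlink prints e.g. "socket:[12345]", "pipe:[678]" or a file path
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            socket:*) sockets=$((sockets + 1)) ;;
            pipe:*)   pipes=$((pipes + 1)) ;;
            *)        other=$((other + 1)) ;;
        esac
    done
    total=$((sockets + pipes + other))
    # In /proc/<PID>/limits the 4th field of "Max open files" is the soft limit
    limit=$(awk '/^Max open files/ {print $4}' /proc/"$pid"/limits 2>/dev/null) || limit=""
    echo "pid=$pid total=$total sockets=$sockets pipes=$pipes other=$other soft_limit=${limit:-unknown}"
}
```

Calling fd_summary <PID> prints one line. A total that sits close to soft_limit would be consistent with the Errno 24 seen in the log.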
Hey @githubDante, any tips on how I can find out which process I should look at, and the same for the PID? Sorry, this isn't familiar to me, but I'm ready to learn!
Please find below the logs after solar production started; all the inverters came back online. But I had to manually reload the configuration from each integration entry for the inverters.
Thanks for your help, mate!
How is your HA installed?
BTW, are you from the future? Cause your latest post says "bobybob69 commented in 30 minutes"! 😆
You can use ps xalf to list all processes, or manually scan /proc/*/comm & /proc/*/cmdline with ls -l & cat in order to find it. Considering the fact that this is some tiny system (using busybox) you should not have many processes, especially python3.12 related.
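The manual scan of /proc/*/cmdline can be scripted so it also works where ps has limited options. This is a rough sketch, assuming Linux /proc and a POSIX shell; find_procs is a hypothetical helper name, and the python3 pattern in the usage note is only a guess at what the main process might be called.

```shell
#!/bin/sh
# Hypothetical helper: list "PID command line" for every process whose
# command line contains the given substring.
find_procs() {
    pattern="$1"
    for dir in /proc/[0-9]*; do
        [ -r "$dir/cmdline" ] || continue
        # cmdline is NUL-separated; turn the NULs into spaces for matching
        cmd=$(tr '\0' ' ' 2>/dev/null < "$dir/cmdline") || continue
        case "$cmd" in
            *"$pattern"*) echo "${dir#/proc/} $cmd" ;;
        esac
    done
}
```

For example, find_procs python3 lists candidate PIDs, and each one can then be used in the ls /proc/<PID>/fd | wc -l command.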
The last log shows something which is definitely related to the FD leak issue. The ics_calendar extension is behaving rather funky. It starts here:
2024-11-02 22:24:02.142 ERROR (SyncWorker_4) [custom_components.ics_calendar.calendar] Schedule Apple Loris: Failed to open url...
continues with:
(error count: 4 - this error is ratelimited)
and then its connection limit is getting exhausted:
2024-11-02 22:24:08.916 WARNING (SyncWorker_4) [urllib3.connectionpool] Connection pool is full, discarding connection: p139-caldav.icloud.com. Connection pool size: 10
Another limit is reached here:
2024-11-02 22:24:53.112 WARNING (MainThread) [homeassistant.components.homekit] Cannot add climate.clim_mael as this would exceed the 150 device limit. Consider using the filter option
The OS errors (OSError: [Errno 24] No file descriptors available) start 20-30 minutes later while ics_calendar still tries to open that URL.
The tests performed by me & @davidrapan on a clean install show no issues with this integration, so the root of the issue must be in another extension. Try to disable them one by one and you should find the culprit.
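While disabling extensions one by one, it may help to sample the descriptor count at intervals: a count that keeps climbing points at a leak, and one that stays flat after an extension is disabled points at that extension. A minimal sketch under the same assumptions as the commands above (Linux /proc, known PID); watch_fds is a hypothetical helper name, and the default interval and sample count are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: print a timestamped FD count for a PID.
# Usage: watch_fds <PID> [interval_seconds] [samples]
watch_fds() {
    pid="$1"; interval="${2:-60}"; samples="${3:-10}"
    i=0
    while [ "$i" -lt "$samples" ]; do
        # Same count as: ls /proc/<PID>/fd | wc -l
        count=$(ls /proc/"$pid"/fd 2>/dev/null | wc -l | tr -d ' ')
        echo "$(date '+%Y-%m-%d %H:%M:%S') pid=$pid open_fds=$count"
        i=$((i + 1))
        # Sleep between samples, but not after the last one
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}
```

For instance, watch_fds <PID> 300 12 takes a sample every five minutes for an hour, which would cover the 20-30 minute window mentioned above.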
Hey @githubDante, I just checked in the terminal. What I did a few months ago was replace the RPi, and in the meantime I did a clean install plus a backup restoration from my previous HA installation. Can this cause issues?
Please find below the results for ps xalf.
Below is the list from the command busybox --list.
Does this help?
Are you suggesting a clean install of HA and re-installing the modules one by one? Maybe that could solve the issue?
Thanks for your help.
> Does this help?
No.
> Are you suggesting a clean install of HA and re-installing the modules one by one? Maybe that could solve the issue?
I'm not familiar with HA/HA OS at all, but yes, start from scratch or disable/uninstall the modules/integrations which you do not use. Does this ics_calendar even work for you?
I don't understand why you don't just try to remove devices from these other integrations that are running there (or even remove the integrations completely). It does not look like they even work, so... it's really a no-brainer.
Hey guys, @githubDante yes, the ics calendar works, but if it needs to be deleted, that won't be a problem.
What should I do to help? Where should the commands be executed from? I'm sorry I'm not as efficient as you'd like 😅
@davidrapan which devices do you think are set up incorrectly and should be removed? I'm not sure I understand.
According to the log, ics_calendar has or is causing some issues, for example.
As we told you, try disabling some integrations (from HACS) one by one until the problem with solarman disappears...
Something in your HA is exhausting resources and thus causing issues which results in solarman not working... 😉
Describe the bug: When the Deye M200G4 inverter goes offline, it is impossible to get live data afterwards. All entities go offline and the integration needs to be reloaded.
Attach the debug log: Will be uploaded soon.
To Reproduce: After the night, when the first sunlight appears and the panels produce energy, nothing happens in the HA app, while everything is live in the SOLARMAN app.
Expected behavior: Data should come back online even after the inverter has gone offline overnight.
Screenshots
Energy dashboard not showing anything for today
Production from the SOLARMAN app for the same day of energy dashboard
Entities offline except one "Total Production 4", don't know why...
Metadata: Version: v24.10.04