Open pemensik opened 5 years ago
This looks like a problem that is fixed in one of these commits: https://github.com/NLnetLabs/unbound/commit/474afc9016d34a98537a97cc94e14d329c7d8aeb https://github.com/NLnetLabs/unbound/commit/c6369e9ffa59eb5a9f714f57810ab6ed389866b7 Specifically the first one, which deals with one address type that could cause it to fail.
So updating to the code repository version may fix the problem. The crash is in the auth-zone code, and not really in dnssec-trigger, although I guess dnssec-trigger is also active during network start. Thanks for the stacktrace and report, by the way.
In any case, the new auth zone logging code in https://github.com/NLnetLabs/unbound/commit/c26fc8494538cba37ad15f13c1e80cb48fea3d0b could be useful to diagnose the trouble more clearly if the existing fixes do not cover the issues you are facing. It logs at high verbosity levels, with more information on why unbound is doing lookups. (edit to fix commit link)
Thanks for the pointers to the release. Unfortunately, it crashed for me again recently with a version containing all those commits, again on the root zone check. The backtrace is in a Bugzilla comment.
Added a new commit that fixes the scan_addr that is referenced in the memmove from the backtrace, so that it is zeroed when the list is freed. That should stop that from getting used if the lookups fail and it points to the result of a previous scan. But I did not reproduce the issue, so although this fixes elements from the failure and stacktrace, I cannot say for sure. Thanks for the detailed stack traces, by the way, those are very helpful.
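To illustrate the kind of fix described above, here is a minimal sketch in C. It is not the actual unbound code: the struct and function names (`addr_node`, `lookup_state`, `lookup_free_list`, `lookup_add`, `lookup_copy_current`) are hypothetical, and only the `scan_addr` cursor name is borrowed from the comment. The idea is that a cursor pointing into a list must be cleared whenever that list is freed, so a later `memmove` cannot read from freed memory left over from a previous scan.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; not unbound's real structures. */
struct addr_node {
    struct addr_node *next;
    unsigned char addr[16];   /* room for an IPv4 or IPv6 address */
    size_t addrlen;
};

struct lookup_state {
    struct addr_node *list;      /* owned list of lookup results */
    struct addr_node *scan_addr; /* cursor into list, not owned */
};

/* Free the result list and zero the cursor that points into it. */
static void lookup_free_list(struct lookup_state *st)
{
    struct addr_node *p = st->list;
    while (p) {
        struct addr_node *next = p->next;
        free(p);
        p = next;
    }
    st->list = NULL;
    st->scan_addr = NULL; /* without this, the cursor would dangle */
}

/* Prepend an address; returns 1 on success, 0 on failure. */
static int lookup_add(struct lookup_state *st, const void *addr, size_t len)
{
    struct addr_node *n;
    if (len > sizeof(n->addr))
        return 0;
    n = calloc(1, sizeof(*n));
    if (!n)
        return 0;
    memcpy(n->addr, addr, len);
    n->addrlen = len;
    n->next = st->list;
    st->list = n;
    return 1;
}

/* Copy the address under the cursor; returns 0 if there is no cursor. */
static size_t lookup_copy_current(const struct lookup_state *st,
                                  void *out, size_t outlen)
{
    if (!st->scan_addr || st->scan_addr->addrlen > outlen)
        return 0;
    memmove(out, st->scan_addr->addr, st->scan_addr->addrlen);
    return st->scan_addr->addrlen;
}
```

With the cursor zeroed in `lookup_free_list`, a failed lookup that leaves the list empty makes `lookup_copy_current` return 0 instead of copying from a node freed earlier.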
It seems this issue is related to some action done by dnssec-trigger, which we are using. It happens sometimes on Fedora, bug #1667387. We were not able to find a reason for it, maybe you could help us?
It happens often in version 1.8.3, but I think it also happens sometimes in 1.9.x.