Closed (OTP-Maintainer closed this 3 years ago)
lukas
said:
Is this a new fault in 21.2, or does the same thing happen in 21.1.x?
dch
said:
I moved from 20.something directly to 21.2, so I can't really confirm. If it's helpful, I can re-test after checking whether the latest OTP release still has the issue.
dch
said:
Lukas, this still seems to occur on OTP 21.2.2 :-( I'll run this over the weekend for a while and report back if the coredumps are any different from what's already posted. BTW, those are available privately if helpful.
lukas
said:
Yes, the next step will be to look at the core files. This is some sort of memory corruption fault: something is doing a double free or a buffer overflow, which causes the allocators to become very sad.
I noticed that one of the core files segfaulted in this NIF: https://github.com/apache/couchdb-khash/blob/master/c_src/hash.c
Please make very, very sure that there is no problem in that NIF.
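For readers unfamiliar with the two fault classes named above, here is a minimal C sketch of the usual defensive idioms: a bounds-checked write and an idempotent release. It is illustrative only. The `demo_buf` type and its functions are hypothetical names, not taken from couchdb-khash or from erts, and nothing here is meant to suggest where the actual fault was.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical resource type, for illustration only; not taken from
 * couchdb-khash or from erts. */
typedef struct {
    char  *data;   /* heap buffer owned by this struct, or NULL */
    size_t cap;    /* capacity of data in bytes */
} demo_buf;

/* Allocate a zeroed buffer. Returns 0 on success, -1 on failure. */
int demo_buf_init(demo_buf *b, size_t cap)
{
    b->data = calloc(cap, 1);
    b->cap  = b->data ? cap : 0;
    return b->data ? 0 : -1;
}

/* Bounds-checked write: refuses anything that would run past the end.
 * The check is written as len > cap - off to avoid overflow in off + len. */
int demo_buf_write(demo_buf *b, size_t off, const void *src, size_t len)
{
    if (b->data == NULL || off > b->cap || len > b->cap - off)
        return -1;
    memcpy(b->data + off, src, len);
    return 0;
}

/* Idempotent release: clearing the pointer makes a second call a no-op
 * instead of a double free. */
void demo_buf_release(demo_buf *b)
{
    free(b->data);   /* free(NULL) is defined to do nothing */
    b->data = NULL;
    b->cap  = 0;
}
```

Calling `demo_buf_release` twice on the same struct is harmless here, whereas calling `free` twice on the same non-NULL pointer is undefined behavior and can corrupt allocator state long before anything visibly crashes.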
davisp
said:
@lukas While I've not done a formal proof on that NIF, I've not seen it cause segfaults in years of abuse on multiple VM versions, so I'd be fairly surprised if it were causing the issue. Granted, there could always be some change in undefined behavior in 21.x that it was relying on, but it's not doing anything fancy.
lukas
said:
In OTP-21 we added some extra statistics to memory allocation, as described here: http://blog.erlang.org/Memory-instrumentation-in-OTP-21/. That could very well expose bugs in NIFs that would have gone undetected before.
If you could send me (lukas@erlang.org) links to the core + beam.smp executable I can take a look and see if something obvious pops out.
dch
said:
erts has been rebuilt with -g flags; waiting on more logs & updates via email.
lukas
said:
Fault most likely found. Anyone who has encountered the same issue can try this patch: https://github.com/garazdawi/otp/tree/lukas/erts/fix_inet_multitimer_cleanup/OTP-15536
dch
said:
Sweet, sweet patch. This has been running without issue (after rebasing off OTP-21.2.3 as well) for 5 days. Thanks Lukas!
Original reporter: dch
Affected version: OTP-21.2
Fixed in version: OTP-21.2.4
Component: erts
Migrated from: https://bugs.erlang.org/browse/ERL-827