Closed CyberShadow closed 2 years ago
We definitely would benefit from better tools to debug such problems; though, in this case, an educated guess might have sufficed.
Based upon your description of the debugging, if we could get the GC to free ASAP and not reuse memory, then ldc's address sanitizer should do the job for this problem.
if we could get the GC to free ASAP
To do that, you would need either something like reference counting or performing a full mark&sweep every statement.
That's not what I meant. I meant instead of setting it as free internally when ready to free, just free that memory range. So that ASAN knows that the memory should not be accessible.
In this case, that wouldn't help at all. You would just get the ASAN violation instead of the UTFException
, in the exact same circumstances, and with no other clues to help.
I wonder how it would be possible to avoid such an issue in the future? Should all custom allocators forbid allocating data with indirections unless GC addRange/removeRange has been added properly?
Sure, that's one way.
I suggested another way (only allow non-GC pointers, and verify that this is true at runtime in debug builds).
Could we add interfaces or patterns that could enforce that statically?
It would certainly be possible with a custom pointer type, which indicates that the memory is managed manually and not owned by the GC, and having the allocator types only accept values which have no indirections except via this wrapper. But I can't predict if the syntactic overhead would be bearable.
I don't think we can generally achieve memory safety using static checks at this scale, without becoming Rust. :)
need someone else to merge, can't merge until travis is done, which I think we can wait for forever since travis-ci.org is gone.
Did the force push, and made travis-ci
optional so it won't block you in the future.
Following up to the DLF meeting yesterday. I don't really know much about libdparse's memory handling philosophy to know if this is an idiomatic fix, but it does definitely fix https://github.com/dlang-community/libdparse/pull/387#commitcomment-44638309 by addressing the root cause (GC pointers in non-GC memory).
An alternative approach would be to add
debug
checks which walk every type added to one of thestdx.allocator
-based constructs, and assert at runtime that all pointers within the value don't point into GC-owned memory; and, of course, actually moving all data to allocator-owned memory (which in this case probably means interning the doc string).In this case, the GC-owned memory was allocated here:
And the last point where it was still seen by the GC was here:
For the record, here is the strategy I used to pinpoint the problem:
Disable ASLR (determinism)
Disable GC threads (more determinism)
Fast builds: disable optimization in Phobos/Druntime, build only shared libraries, link dscanner with shared library
Find the address of the lost memory, i.e. the buffer that ends up containing bad UTF8 (add
writeln
before the code that throwsUTFException
)Add GC hook to trap when the address is allocated and when it's freed
LeakDetector
. This is still currently a bit more complicated than necessary, will upstream some fixes soonasm{int 3;}
and ran in GDB when I needed a stack trace.Examine the lifetime of all values that share the problematic address
In this case, it was:
std.array.appender
buffer is allocatedstd.array.appender
buffer is wrongly freedappender
buffer is accessed, causing the UTF errorBisect the point in the program's execution where the value is last visible in the GC.
The general strategy is that you want to call
GC.collect
as frequently as possible (though not that frequently that the program now takes forever to run). I did this in three steps:fullcollect()
toGcx.alloc
)rt.trace
, and callingGC.collect()
when the call counter is within the bisected certain range)GC.collect()
calls in Dscanner's code in functions identified by the above step.We definitely would benefit from better tools to debug such problems; though, in this case, an educated guess might have sufficed.