crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.41k stars 1.62k forks source link

Weird bug with double pointers and GC #10643

Open I3oris opened 3 years ago

I3oris commented 3 years ago

Hello,

I got troubles by playing with double pointers. After ask some questions on the forum to be sure to use pointers correctly, I have reduced my code until the following result:

I allocate a double pointer, set the value to 42, and get it 100000 times. However, after 62953 iterations (in my case) the value isn't correct any more:

double_pt = Pointer(UInt64).malloc
double_pt.value = Pointer(Int32).malloc.address

Pointer(Int32).new(double_pt.value).value = 42

1000000.times do |i|
  Pointer(Int32).malloc # ramdom malloc

  value = Pointer(Int32).new(double_pt.value).value
  raise "BUG at iteration #{i}!" if value != 42

end

Reproducible on carcin here This bug is really weird because some inocent change cause the bug disappear:

This bug isn't random because subsecant build cause each time the bug at the same iteration (this seem to be platform dependant, or depend of the state of the memory I don't know)

So it look like of a bug of the GC !? :thinking:

crystal -v:

Crystal 1.0.0 [dd40a2442] (2021-03-22)

LLVM: 10.0.0
Default target: x86_64-unknown-linux-gnu

uname -a:

Linux X 5.4.0-66-generic #74-Ubuntu SMP Wed Jan 27 22:54:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
jhass commented 3 years ago

This probably needs to be documented better but Pointer.malloc does an "atomic malloc" for types where the compiler believes that they contain no references: https://github.com/crystal-lang/crystal/blob/3536a90f5f476929efb03ff28a1940607889262e/src/compiler/crystal/codegen/primitives.cr#L733-L737

An "atomic malloc" instructs the GC to not look for pointers in there. If we switch out your example to bypass the atomic malloc it no longer breaks: https://play.crystal-lang.org/#/r/awxc

straight-shoota commented 3 years ago

This doesn't look like a bug. It's just the GC doing its job and as far as I can see, totally expected behaviour.

When you allocate a UInt64 pointer with Pointer(UInt64).malloc, the GC is not aware that that UInt64 is supposed to be a memory address. Pointer.malloc uses GC.malloc_atomic by default when the allocated type does not contain inner pointers. The type is an integer primitive and the compiler can't know that you intent to use it to store an address. Because you allocate more memory each iteration, at some point the GC needs to collect and since it doesn't see any references to the Int32 pointer, it frees the memory.

To mitigate this, you can use a pointer type instead of UInt64. Then the compiler is aware that the value is an address and doesn't use malloc_atomic. Alternatively you can use GC.malloc directly to prevent malloc_atomic: GC.malloc(sizeof(UInt64)).as(UInt64*).

jhass commented 3 years ago

Also if we properly declare the type of the double pointer as Int32** it no longer breaks: https://play.crystal-lang.org/#/r/awy7

And of course we could also drop the boilerplate then: https://play.crystal-lang.org/#/r/awy8

I3oris commented 3 years ago

Thanks you for yours answers! Thinks become clear now, I didn't knew that Pointer.malloc was atomic.

And of course we could also drop the boilerplate then: https://play.crystal-lang.org/#/r/awy8

Yes of course, my example was boilerplate, it was to keep the "bug" appended :).

asterite commented 3 years ago

I think we should never use malloc_atomic under the hood. There's no proof it leads to faster code.

asterite commented 3 years ago

Looks at this:

https://github.com/crystal-lang/crystal/pull/4081#issuecomment-283625197

I was hesitant to use malloc_atomic under the hood precisely because it could lead to pretty obscure and hard-to-trace bugs, exactly like what's happening here.

Here's my proposal on how to move forward:

  1. Let Pointer.malloc always use GC.malloc
  2. If someone knows their allocated memory won't have pointers, they can use GC.malloc_atomic (let's document that)
  3. Find places in the standard library where we can use GC.malloc_atomic, and use that instead of Pointer.malloc

The 3rd point seems like too much manual work, but if we don't do it then things continue working as usual. If we do that optimization, then it still works fine, but maybe faster (as far as I know nobody did any benchmark to prove that malloc_atomic is significantly faster than malloc: I think the only case where that could happen is when you allocate a really big chunk of memory, but that's not very common.)

Finally, we could introduce a macro method to be able to ask if a type has interior pointers. All primitive types would return false, but for example a struct containing only primitive types would also return false. Then we can make Array, Slice and Hash check this to use malloc_atomic when possible (we can start by doing this for primitive types only, without the new macro method.)

I think Crystal should avoid surprises like this whenever possible, and favor correctness over "efficiency but sometimes incorrect code".

straight-shoota commented 3 years ago

I don't think this needs to change. IMO the current behaviour is totally fine. When specifically asking to allocate memory for an integer, there's no reason to expect that this memory could be used for storing an address. Pointers are low-level API and there are many other possibilities to shoot yourself in the foot by improper usage.

There are plenty of ways to do this right (allocate a void pointer, a double pointer or use GC.malloc directly). This just needs to be properly documented.

Sure, if there happens to be no performance difference, we could certainly avoid the malloc vs malloc_atomic thing. But I don't follow the reasoning to just assume there would be no performance impact just because nobody has shown it yet. I think there are good reasons to expect that using malloc_atomic whenever possible is better. The performance is not so much relevant at allocation where I suppose the only difference is that malloc clears while malloc_atomic does not. But the GC can skip memory allocated with malloc_atomic when tracing references.

I3oris commented 3 years ago

Sure I think just document that Pointer(PrimitiveType) can't store an other address would be fine :slightly_smiling_face:

asterite commented 3 years ago

When specifically asking to allocate memory for an integer, there's no reason to expect that this memory could be used for storing an address.

Well, that was the expectation here, and that's why it broke.

My expectation is that people don't read documentation, so even if we document it people will run into this.

asterite commented 3 years ago

That said, I'm fine with documenting the current behavior. We could always change it later on and it will be backwards compatible.

I3oris commented 3 years ago

I've got still some troubles even using GC.malloc instead of Pointer.malloc, but I've found the problem: addresses must be aligned by 8 to be detected by the GC. So here is still possible to got memory corruptions. I thinks this point could be documented too :D

jhass commented 3 years ago

Maybe there's some good document in bdw-gc we can link to for such details?

I3oris commented 3 years ago

Maybe https://github.com/ivmai/bdwgc or https://github.com/ivmai/bdwgc/blob/master/doc/overview.md ? Though I haven't seen the info about alignment.

BrucePerens commented 3 years ago

Image processing memory can be very large and should not be scanned by the GC. There should be a well-defined way to allocate it.

asterite commented 3 years ago

This bit me today. I was using Pointer(UInt8) to store bytes that reference objects in memory in one place, and Pointer(UInt64) to store object ids that I didn't want the GC to collect, but it doesn't work.

I really think we should be conservative here. It's no joke getting a crash because some objects you thought would be referenced are not. If you really want malloc atomic, you should use GC.malloc_atomic.

rdp commented 2 years ago

Maybe introduce a new method "Pointer.alloc_pointer_to_object" or something, so people can realize they're not allocating pointers by default, and so it can warn them about alignments being necessary? Though the documentation makes this a lot clearer, thanks!