mitchellh / go-mruby

Go (golang) bindings to mruby.
https://gist.github.com/mitchellh/90029601268e59a29e64e55bab1c5bdc
MIT License
471 stars 34 forks source link

func.go: fix concurrent map write #45

Closed erikh closed 7 years ago

erikh commented 7 years ago

This isn't quite ready yet.

Basically, the func tracking we use has a lot of concurrency issues WRT map access. This is a fine-grained locking implementation to deal with it.

The tests still need some work, I have a multi-instance version of erikh/box I'm working on that tests this pretty thoroughly though. I was hoping @mitchellh I could get you to review the implementation before I dive into tests, as this is always tricky stuff.

Note that I wrapped getargs in a bunch of goroutines to try and trigger the issue. It's good at triggering the corruption but not any deadlocking issues which were in an earlier version of this patch. I'm holding off until I can make a better test that would cause deadlocking behavior for doing a lock at the state or class level.

erikh commented 7 years ago

that's a good test I think; I'll leave the args stuff in there too if that's ok, it's a good real-world exercise of it.

erikh commented 7 years ago

ok, maybe it's still broken. :D

erikh commented 7 years ago

Getting some interesting behavior in teh GetArgs test which I think is unrelated but I should note here anyway:

--- FAIL: TestMrbGetArgs (0.08s)
        mruby_test.go:301: code: ("bar", true) {}
                expected: []mruby.ValueType{0x10, 0x2, 0xd}
                actual: []mruby.ValueType{0x10, 0x2, 0x0}

Box has this issue; it occasionally recieves a false when it should receive another value. I suspect that a default int is being used somewhere where it shouldn't be, but I can't confirm that.

erikh commented 7 years ago

I've included a potential fix for the second issue.

erikh commented 7 years ago

bah. gonna set this aside for the day, will revisit tomorrow.

erikh commented 7 years ago

I've committed a test change that (should) fail in travis; it fails locally here. It's a concurrency battery of getargs, and it breaks with a scenario I've seen a few times in box, where the proc is returned to the bound function as false. I think this is due to a bad initializer somewhere as false is 0 in the mruby type system.

I'm looking into this, but if anyone has any time, I would really appreciate it!

erikh commented 7 years ago

This should be passing now. I would advise strongly to reach each commit (and their log messages) carefully when reviewing.

@mikesimons @mitchellh you folks mind taking a look?

erikh commented 7 years ago

Ping since we're past the holidays. :)

mitchellh commented 7 years ago

I read it once but didn't fully understand it yet I plan on giving it another go later. I think I'll have some questions, though.

erikh commented 7 years ago

No worries. Let me know what is confusing and I can add comments and whatnot.

On Wed, Jan 4, 2017 at 2:43 PM Mitchell Hashimoto notifications@github.com wrote:

I read it once but didn't fully understand it yet I plan on giving it another go later. I think I'll have some questions, though.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/mitchellh/go-mruby/pull/45#issuecomment-270508836, or mute the thread https://github.com/notifications/unsubscribe-auth/AABJ68ycNknAs1tHSmJP2C9an3SK-h8nks5rPCCDgaJpZM4LQIhD .

erikh commented 7 years ago

So while investigating #49 I realized that this is indeed a related problem to what this patch addresses too.

I strongly believe the fundamental problem is in the unions linked below and the way Go passes structs that are not pointers:

https://github.com/mruby/mruby/blob/master/include/mruby/boxing_word.h https://github.com/mruby/mruby/blob/master/include/mruby/boxing_no.h https://github.com/mruby/mruby/blob/master/include/mruby/boxing_nan.h

As you can see in the last one here the space consumed -- and how it is consumed -- is quite dynamic.

Now I can't remember the semantics 100% (and can't seem to find them quickly either) but when structs in go are passed between functions, I believe they are copied unless they are pointers.

What was happening in the first series of problems was that the mrb_value structs were getting truncated as a result of this, because Go had not taken into consideration the space used. This, combined with dereferencing the truncated value, caused undefined behavior; usually a panic but sometimes it would result in a zero value which would present itself as false to us, the golang consumers as C.mrb_value.

Again, I am not the best C programmer but this is what I observed and the patch linked in #49 does indeed fix his example here.

What I think really needs to happen, instead of the patch I provided to ilya, is a fundamental restructuring of the API to handle mrb_value's cleanly with pointers.

Rules would be:

C calls in gomruby.h would need to be adjusted and externally-facing API would need to be completely overhauled, especially value.go.

Since this is such a significant change, I was thinking I could clean up the patch in #49 with this, tag that as 1.0 or something, and move forward with API changes. People can vendor the tag if they want a working version with the "old" API.

erikh commented 7 years ago

I was double checking my work in the last patch (which is close, but wrong) and noticed these comments in goMRBFuncCall: https://github.com/mitchellh/go-mruby/blob/master/gomruby.h#L31-L34

Definitely related to this discovery. I'm going to see how I can get it to work by making it cleanly shovel around pointers throughout the library.

erikh commented 7 years ago

I've made so much noise here discovering problems with the underlying problem here I think I'm just going to close this to clean things up so we can discuss the patch. I'll also include the fix for #49 in this upcoming PR.

The patch is good, but the things I ran into along the way really confused me and require some thought of other things in the lib I think, but are ultimately unrelated.

Sorry for any trouble caused.