bdew-minecraft / pressure

Pressure Pipes mod
18 stars 10 forks source link

Don't deadlock with OpenComputers #56

Closed habnabit closed 8 years ago

habnabit commented 8 years ago

The previous implementation of waitForFuture under OpenComputers was to sleep until the future had a result, but that can cause a deadlock.

The specific deadlock I ran into was that the server thread was saving the world, which requires a lock on the OpenComputers machine. At the same time, the OpenComputers worker thread had acquired the same machine's lock, and was sleeping while waiting for the server to set a result on the future.

Since raising LimitReachedException is effectively just saying 'try again later', there's no harm in doing it this way.

I know you were raising this exception before and then stopped, but it turns out you kind of have to. Sorry!

bdew commented 8 years ago

This solution just breaks it in a different way. Throwing LimitReachedException makes OC redo the call in server thread - where we can't sleep. If the future is not done the only thing it can do is throw an error that will be returned to lua code.

Everything just fails randomly all the time making the interface pretty much useless.

I honestly don't see a good way out of this. I'm tempted to either:

habnabit commented 8 years ago

Throwing LimitReachedException makes OC redo the call in server thread

I don't think this is true. My understanding from reading the code and asking about this failure on the #oc IRC is that throwing LimitReachedException is how you indicate you're yielding, and it'll be retried later in a worker thread again.

bdew commented 8 years ago

The documentation here says:

... The limit itself is enforced in the machine's invoke method, which will throw a LimitReachedException once there were too many call. At that point the architecture should either perform a short sleep, or fall back to perform a synchronized call instead.

(in this context synchronized = in server thread, direct = in worker thread)

So i guess it can go both ways? I couldn't find the relevant code in OC, been a while since i looked at it.

At any rate, even if it sleeps it's still not good. An exception is not a yield, and the machine will just repeat the same call so i have no easy way to link it to the previously generated Future. So a new Future will be started, which won't finish because waitForFuture is called literally in the next line after the future is started - and it will just keep failing forever.

habnabit commented 8 years ago

An exception is not a yield, and the machine will just repeat the same call so i have no easy way to link it to the previously generated Future.

Ah, yeah, I think this is the fundamental problem. I don't know how I didn't see that!

For reference, almost all of my async programming experience is with a single-threaded event loop model, so doing things like spawning threads and blocking on futures is, for now, still outside my reasoning ability.

What kind of fancy concurrency stuff did you even want to do, though? Skimming through the methods currently exposed on data ports, I'm just not sure what you'd have in mind. It seems like it might be more straightforward to remove all of the bits about futures entirely. But, from looking over the wrapInFuture definition, aren't you running all your code in the server thread anyway? Maybe you just want to force all calls to be synchronized? In that case, the fix would be even simpler: raise LimitReachedException any time you see a direct call.

hron84 commented 8 years ago

Just an idea: why not ask from @fnuecke about how OpenComputers works in this term?

fnuecke commented 8 years ago

@bdew is correct, LimitReachedException will cause the method to be called again from the server thread, so that's not really an option when waiting via sleeps.

Without knowing the exact use-cases for these futures, might I suggest relaying the waiting to the Lua side of things instead? That would avoid locking up a worker thread in OC / potential too long without yielding errors in CC.

A relatively clean way would be to pass a future-like object to Lua (in OC userdata is implemented using the Value interface (and AbstractValue base class for convenience), in CC there's ILuaObject).

On the Lua side you could then have something like

local future = blah.yourCallback()
while not future.isCompleted() do
  [os.]sleep(10)
end
-- use future.get() or whatever
habnabit commented 8 years ago

There, I did it.

It's backward-incompatible, but that's an implementation to discuss/iterate on, at least. Is it possible to make this backward-compatible while also not totally breaking on OpenComputers?

bdew commented 8 years ago

Just for clarification - am i missing something or PersistableFutureState &co doesn't actually persist the result in any way, even if the future is completed?

I have a preliminary solution that I'm testing and fixing some small leftover bugs and will push it to the public repo in a few hours. I'm sorry that i stalled on this for so long.

Thanks to everyone that contributed, especially @habnabit - while i didn't end up using any of your code, it gave me alot of inspiration to fixing this mess :P

habnabit commented 8 years ago

@bdew I wasn't sure about the best way to do it, and finished the implementation of that after pushing everything else. I'll push the rest now.

bdew commented 8 years ago

So hopefully i've fixed all problems with OC without breaking anything in CC.

Beta build available here: http://minecraft.curseforge.com/projects/pressure-pipes/files/2276873

I've closed all related issues, if there are any unsolved or new problems or other concerns - comment here or open a new issue.

Thanks again to all who contributed.


The gory details for those interested:

habnabit commented 8 years ago

Nice! This seems like a much more sensible solution. Still learning a lot about minecraft/scala dev.

hron84 commented 8 years ago

@bdew if the beta commits land in release, will you enable Tank Data Ports if only OC installed?

bdew commented 8 years ago

They already should work with only OC in the beta (if they don't - please report), and will work when it becomes release as well.

hron84 commented 8 years ago

@bdew cool, I'll test it out as soon as possible