Closed GoogleCodeExporter closed 9 years ago
Kinda tricky bug. Good work wrapping it up.
Would it work to add a new lock around undoMoves? Which is not released in
blockTillMove?
It is mostly because it would be nice if CECPEngine could work independendly of
GameModel. But if I have overseen something - which I probably have - then I
think
your patch is a good fix, until we redesign that mess. Or drop support for cecp.
Btw., what is that Offer.offerType -> Offer.type rename :p It takes up a thid
of the
patch ;)
Original comment by lobais
on 6 Apr 2010 at 4:37
The offer.type rename is part of a different set of changes which fixes some
simpler offer handling
stuff. I'm sorry for not properly separating out the changes for bugs 548-551,
that was lazy of me, but I
wanted to keep everyone updated on what I was working on and had been putting
it off for too long and
properly separating out all the different changes I was working on which were
done on the same code
tree (svn/pychess-bugfixes) was prohibitively time-consuming. Now that were
using Mercurial though I
don't see this happening anymore... or at least, now I don't have an excuse
with cheap local copies ;).
Back when I was actively working on this bug, I did try the approach of adding
a separate lock to
CECPEngine.undoMoves first, and as far as I remember, it ran into deadlocks
where the
CECPEngine.undoMoves thread hung at
CECPEngine.__blockTillMove:self.movecon.wait() because
the move never came in via CECPEngine.__parseLine because the engine subprocess
couldn't get
either the glock (threads_enter) or the python GIL (I never figured out which)
it needed to emit the line
with the move. I even tried altering CECPEngine.__blockTillMove to make it give
up the glock while
waiting for the move, but that didn't work either (in that the subprocess
engine still couldn't emit the
line), maybe because we got there via a bunch of emits, and during an emit from
the MainThread
Python acquires the GIL (
http://www.pygtk.org/docs/pygtk/gdk-functions.html#function-gdk--threads-
init ). Basically, I came to the conclusion that the only way to do undos
correctly and without
encountering deadlocks in the current PyChess codebase is to do them without
holding the glock
which means doing them asynchronously of the user telling the program to do
them through the GUI
via "Action -> Offer Undo", which is how I arrived at the design in the patch
of doing undos in a
separate thread (via @inthread) starting at GameModel.undoMoves. And I'm pretty
sure that this is the
only way it can work until PyChess is refactored so that all GUI updates are
queued and processed
_only_ by the MainThread, which as you pointed out before, will be the single
most important change
towards stability we could make in the future.
Regardless of not being able to get it to work that way, I don't see a reason
why
CECPEngine.undoMoves should work independently of GameModel.undoMoves. As you
know, the
simple synchronous approach is always to be preferred because it's much more
simple, easy to read,
easy to debug, etc., and should be chosen by default unless it can be proved
that doing otherwise
provides significant speed benefits, and I don't see that happening here
because were not really
waiting for the engine's reply, just asynchronously sending it some commands,
and we can't really
proceed with additional GameModel.undoMoves()s anyway until the current one
completes. In fact, I'd
argue that the method taken in the patch of putting GameModel.undoMoves in a
thread should be
removed too later on after we do an overall refactor, since it's impossible to
know at the time of the
user undo request (at BoardControl.actionActivate) that it will be possible to
undo that many moves
later on once our request comes up in the queue and is processed by
GameModel.undoMoves.
In the patch you'll see I separated the CECPEngine.undoMoves (and UCIEngine)
into
CECPEngine.playerUndoMoves and CECPEngine.spectatorUndoMoves since their logic
is completely
different based on whether the engine is a player or not. Between that, all the
"if self.mode in
(ANALYZING, INVERSE_ANALYZING):" in the (CECP,UCI)Engine code, the different
*.putMove() and
*.makeMove() methods, and the different logic handling in
CECPEngine.__parseLine based on whether
were in spectator or player mode, it's starting to look like CECPEngine (and
UCIEngine) should be
broken into separate CECPEnginePlayer and CECPEngineSpectator classes which
inherit from a
common CECPEngine with all the common code.
Original comment by mattgatto
on 8 May 2010 at 1:40
Reading this again and again, I have to agree with you, that the current model
simply is a dead end on fixing this in an elegant way.
I have made some sketches for a model to isolate these things properly. I hope
to post it in the 1.0 brainstorm during the coming week. I believe it will be
waterproof and solve the problems you mention. Hopefully you'll as well be able
to remember some of the pitfalls, so we can get this thing done with.
A design fix will however have to wait until perhaps a 0.10.1 release.
Until then I highly support merging the patch. You may want to move some of the
GameModel print statements to the log module, other than that I don't see it
needs much cleaning.
Original comment by lobais
on 22 Aug 2010 at 11:08
Fixed in revision 8799a74c72.
Original comment by mattgatto
on 15 Oct 2010 at 3:06
Original issue reported on code.google.com by
mattgatto
on 5 Apr 2010 at 2:27Attachments: