nullx002 / pychess

Automatically exported from code.google.com/p/pychess
GNU General Public License v3.0
1 stars 0 forks source link

PyChess doesn't start #208

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
If i delete engines.xml and ~/.pychess to simulate a new install, and have
installed gnuchess and crafty, i get this error, and PyChess can't start.
(If i have no gnuchess and crafty, start is OK.)

tamas@tami:~/svn/pychess$ PYTHONPATH=lib ./pychess
2007-09-15 19:11:53 Default Error: Thread <Worker(Thread-5, started
daemon)> in threadpool raised following error:
Traceback (most recent call last):
  File "/home/tamas/svn/pychess/lib/pychess/System/ThreadPool.py", line 51,
in run
    self.func()
  File "/home/tamas/svn/pychess/lib/pychess/System/ThreadPool.py", line 33,
in <lambda>
    a.func = lambda: func(*args)
  File "/home/tamas/svn/pychess/lib/pychess/Players/engineNest.py", line
185, in _findOutMore
    e = self.initEngine (engine, WHITE)
  File "/home/tamas/svn/pychess/lib/pychess/Players/engineNest.py", line
384, in initEngine
    return ProtocolEngine( protocol(subprocess, color) )
  File "/home/tamas/svn/pychess/lib/pychess/Players/CECPProtocol.py", line
59, in __init__
    self.board = Board(setup=True)
  File "/home/tamas/svn/pychess/lib/pychess/Utils/Board.py", line 24, in
__init__
    self._applyFen (FEN_START)
  File "/home/tamas/svn/pychess/lib/pychess/Utils/Board.py", line 34, in
_applyFen
    for cord in iterBits(wpieces[PAWN]):
  File "/home/tamas/svn/pychess/lib/pychess/Utils/lutils/bitboard.py", line
35, in iterBits
    return bitsArray0[bitboard >> 48] + \
TypeError: unsubscriptable object

WARNING: Giving 10 minutes for loading engine PyChess 0.8beta1

Original issue reported on code.google.com by gbtami on 15 Sep 2007 at 5:16

GoogleCodeExporter commented 8 years ago
Strange, I'm unable to reproduce this.
Can you try again with 603?

Original comment by lobais on 17 Sep 2007 at 2:16

GoogleCodeExporter commented 8 years ago
With me is happening the same

Original comment by leogregianin@gmail.com on 17 Sep 2007 at 3:40

GoogleCodeExporter commented 8 years ago
Seems that broblem occurs when one process (gnuchess, crafty or pychess) 
started to
create the /tmp/bitboars file, while others reading it before it was finished. 
Maybe.

Original comment by gbtami on 17 Sep 2007 at 5:36

GoogleCodeExporter commented 8 years ago
Ok, could one of you try to insert "    print bitsArray0" between line 33 and 
34 in
Utils.lutils.bitboard.py?
If stuff works correctly it will output quite a lot of data, but just before the
error is raised it seams to be something not right. Perhaps None.

If it is None, perhaps can you try to add a Lock arround the iterBits function, 
so
it'll be:

from threading import Lock
lock = Lock()
def iterBits (bitboard):
    lock.acquire()    
    try:
        ensureBitArraysLoaded ()
        return ...
    finally:
        lock.release()

Original comment by lobais on 17 Sep 2007 at 5:54

GoogleCodeExporter commented 8 years ago
This doesn't helped, but seems changing in engineNest.py line 185
from
        e = self.initEngine (engine, WHITE)
to
        e = self.initEngine (engine, BLACK)
avoids the race in engine discovering processes, as this not triggers _applyFen 
in
Board __init__.

With this workaround i get no errors, but one timing/race problem remaining.
PyChess sometime not recognise all 3 engines as analyzing, only 2 :(

Original comment by gbtami on 17 Sep 2007 at 7:30

GoogleCodeExporter commented 8 years ago
I don't see how it can be about multiple processes reading/writing to 
/tmp/bitboards
at the same time, by two reasons: It doesn't work to put a lock around 
iterBits, and
even in this case bitsArray0 should still have been initied as a list.

Can you send me a log copy in one of those cases, when 2 rather than 3 engines 
are
discovered?

Original comment by lobais on 18 Sep 2007 at 6:04

GoogleCodeExporter commented 8 years ago
On my office PC with an old Celeron (no muliple procs!) i deleted 
/temp/bitboards and
~/.pychess.
In svn 603 i made only above change (WHITE to BLACK), so /tmp/bitboards not 
playing here.
Started pychess, and it dicovered PyChess and gnuchess OK, but Crafty not OK.

Original comment by gbtami on 18 Sep 2007 at 8:11

Attachments:

GoogleCodeExporter commented 8 years ago
I was able to reproduce this on a gutsy tribe 5 livecd.
However unlike you I was able to fix it by adding locks arround iterBits.
On the other hand, I don't really like locking in a hyper optimized function 
like
this, so I added a filelocking in the ensureBitArraysLoaded function using 
fcntl.flock.

I committed it as rev 604. I hope we got it.
Besides that I think changing the color from WHITE to BLACK is a good idea, as 
it'll
probably speed up loading.

Original comment by lobais on 18 Sep 2007 at 3:36

GoogleCodeExporter commented 8 years ago
Now is running well.

Original comment by leogregianin@gmail.com on 18 Sep 2007 at 4:18

GoogleCodeExporter commented 8 years ago
Confirmed. Your fix solved the original bug.

The other problem still exist (with or without WHITE to BLACK change).
PyChess can always discovers all the 3 engines, but randomly the third's 
features are
not copleted: <feature command="analyze" supports="false"/> in the attached 
engines.xml
Seems changing the timeout to 3*1000 in CECPProtocol.py line 86 fixes this too.

Original comment by gbtami on 18 Sep 2007 at 8:34

GoogleCodeExporter commented 8 years ago
Does this happen everytime or only when bitboards have to be created?
I'm thinking of adding a protover2 attribute in the hardcoded xml. This way we 
could
totally skip the timeout and take the time it takes for engines we know support
protover2 eventually.

Original comment by lobais on 18 Sep 2007 at 10:18

GoogleCodeExporter commented 8 years ago
Everytime.

Original comment by gbtami on 19 Sep 2007 at 6:42

GoogleCodeExporter commented 8 years ago
Ok. Revision 609 will give 10 secconds to init for engines which are known to 
support
protover 2.
I guess this should fix stuff. Please change status if it does.

Original comment by lobais on 19 Sep 2007 at 6:38

GoogleCodeExporter commented 8 years ago
With svn 609. Now PyChess was detected as not analyzing.

Original comment by gbtami on 19 Sep 2007 at 7:09

Attachments:

GoogleCodeExporter commented 8 years ago
Here is another 2 try.

Original comment by gbtami on 19 Sep 2007 at 7:49

Attachments:

GoogleCodeExporter commented 8 years ago
The above logs was all produced on a P4 3GHz(HT) at home with svn 609.
On my office PC with a Celeron 2.66 GHz(no HT) no problem with svn 609!
Both runs the same 2.6.17 SMP kernel, and 2.10.14 gtk2 and 2.10.4 pygtk.

Original comment by gbtami on 20 Sep 2007 at 7:00

GoogleCodeExporter commented 8 years ago
Was the last logfile 2007-09-19_21-15-49.log produced on your office PC? It has 
got
the  "2007-09-19 21:15:49 Python Debug: feature draw=1 reuse=0 myname='PyChess
0.8beta1' usermove=1 setboard=1 sigterm=1 analyze=1" lines that the others need.

You could try to run the engine on its own (You can read how in the INSTALL) 
and see
if it blocks or something.

Btw. Why do you run an SMP kernel on a non HT processor? Isn't SMP desgined for
exactly executing two threads one one kernel with HT?

Original comment by lobais on 20 Sep 2007 at 2:14

GoogleCodeExporter commented 8 years ago
All the logs i'v attached in this issue was produced on my home pc.

In the 2007-09-19_21-15-49.log PyChess was recognised as analyzing engine, but 
crafty
and gnuchess not !

As i can summarise, on multi processor/HT pychess randomly recognise engine 
features.

There is no problem on office pc (Celeron). My distrib ships 1(SMP enabled) 
kernel only.

Original comment by gbtami on 20 Sep 2007 at 5:19

GoogleCodeExporter commented 8 years ago
I found a dualcore to do some testing on, and I believe I found and fixed a 
nobrainer
in the bitboard locking.
Can you try out revision 614.

Original comment by lobais on 21 Sep 2007 at 12:58

GoogleCodeExporter commented 8 years ago
As an extra stress test, it would be interesting to see "PYTHONPATH=lib/ 
./pychess &
PYTHONPATH=lib/ ./pychess" run on a multicore. It works fine for me, but on a
multicore we might get the same problems on the engines.xml and temp.png files, 
we
currently have on the bitbaords file.

Original comment by lobais on 21 Sep 2007 at 2:41

GoogleCodeExporter commented 8 years ago
svn 614 doesn't fix the problem.
After you changed in engineNest.py from
        e = self.initEngine (engine, WHITE)
to
        e = self.initEngine (engine, BLACK)
bitboard.py is not playing in this bug anymore, so the problem lies somewhere 
else :(
You can test the problem only without engines.xml, so you have to delete 
~/.pychess

Original comment by gbtami on 21 Sep 2007 at 6:25

GoogleCodeExporter commented 8 years ago
Btw. why don't you use Python's builtin subprocess module?

Original comment by gbtami on 21 Sep 2007 at 6:27

GoogleCodeExporter commented 8 years ago
Hm, I felt like the change in bitboards solved the problem for the dualcore 
machine
on which I tested (by deleting ~/.pychess as you describe).
But you are naturally right that bitboards aren't even created before the first 
game
is started, so I'll go and look for another reason.

submodule has no support for pty which is necessary when we want engines to die
together with PyChess.
GlChess actually copied the PyChess SubProcess code. Earlier when glchess died 
the
engine would continue searching the three with a 100% cpu usage, until the user
killed it manually.

Original comment by lobais on 21 Sep 2007 at 6:51

GoogleCodeExporter commented 8 years ago
Tested on ubuntu gutsy. Same random result.
Then i switched off HT in the BIOS, and installed sjeng too.
I got the correct result in 8/10 cases. So i was able to reproduce the bug on a 
non
multiprocessor environment too.

Original comment by gbtami on 23 Sep 2007 at 11:08

GoogleCodeExporter commented 8 years ago
Can you try a few tests again with rev 679?
If you still get problems once in a while, try to edit
lib/pychess/System/SubProcess.py in line 48, and move the # from initGlc to 
initPty.
Then the glchess way will be used. It uses no pty, but spawns an awfull lot of
processes and doesn't always kill them probably when glchess is itself killed.

Original comment by lobais on 6 Oct 2007 at 10:11

GoogleCodeExporter commented 8 years ago
With r679 i get the same random result.
Changing to use initGlc i get:
tamas@tami:~$ p
2007-10-07 14:04:44 Default Error: Thread <Worker(Thread-5, started daemon)> in
threadpool raised following error:
Traceback (most recent call last):
  File "/home/tamas/svn/pychess/lib/pychess/System/ThreadPool.py", line 51, in run
    self.func()
  File "/home/tamas/svn/pychess/lib/pychess/System/ThreadPool.py", line 33, in <lambda>
    a.func = lambda: func(*args)
  File "/home/tamas/svn/pychess/lib/pychess/Players/engineNest.py", line 243, in
_findOutMore
    e.kill(UNKNOWN_REASON)
UnboundLocalError: local variable 'e' referenced before assignment

Original comment by gbtami on 7 Oct 2007 at 12:06

GoogleCodeExporter commented 8 years ago
It was with r682.

Original comment by gbtami on 7 Oct 2007 at 12:13

GoogleCodeExporter commented 8 years ago
Can you post the log.

Original comment by lobais on 7 Oct 2007 at 2:01

GoogleCodeExporter commented 8 years ago
log attached

Original comment by gbtami on 7 Oct 2007 at 2:34

Attachments:

GoogleCodeExporter commented 8 years ago
next one

Original comment by gbtami on 7 Oct 2007 at 2:47

Attachments:

GoogleCodeExporter commented 8 years ago
next

Original comment by gbtami on 7 Oct 2007 at 3:03

Attachments:

GoogleCodeExporter commented 8 years ago
with initPty()

Original comment by gbtami on 7 Oct 2007 at 3:09

Attachments:

GoogleCodeExporter commented 8 years ago
next

Original comment by gbtami on 7 Oct 2007 at 3:20

Attachments:

GoogleCodeExporter commented 8 years ago
it's only waiting

Original comment by gbtami on 7 Oct 2007 at 3:34

Attachments:

GoogleCodeExporter commented 8 years ago
Seams to be fixed in rev 687

Original comment by lobais on 7 Oct 2007 at 5:13

GoogleCodeExporter commented 8 years ago
It's a phoenix! After i cleared my changes, and made an svn up, tried it again, 
and
it produced random results again. Today i tested on my old offive PC (Celeron), 
and
asked my collegue, to test it on hes Ubuntu Feisty (Celeron too), we got 
atacched logs.
(The 2007-10-08_09-31-50.log contains a "Szegmentációs hiba"="Segmentetion 
error"
too, huh.)

Original comment by gbtami on 8 Oct 2007 at 8:05

Attachments:

GoogleCodeExporter commented 8 years ago
I think 704 solves this for me.

Original comment by lobais on 12 Oct 2007 at 11:53

GoogleCodeExporter commented 8 years ago
1. It's getting better :)
2. Sometimes glaurung is waiting for something, and pychess gui doesn't start, 
so
   i have to press a Ctrl+C to quit. see /home/tamas/2007-10-22_10-05-55.log
3. Sometimes sjeng doesn't recognised. see /home/tamas/2007-10-22_10-10-03.log

4. After rebooted to ubuntu gutsy, i get these
   /home/tamas/2007-10-22_10-22-44.log
   /home/tamas/2007-10-22_10-26-37.log

All in all, it's not perfect yet :(

Original comment by gbtami on 22 Oct 2007 at 8:54

Attachments:

GoogleCodeExporter commented 8 years ago
Strange, did you notice:
10:05:56 stdout Debug: path
10:07:19 Glaurung Log: stop

and in initPty we have:
print "path", self.path, "args", self.args

So glaurung actually hang while printing.

Hm, I'll have to download it to do some testing.

Original comment by lobais on 22 Oct 2007 at 11:31

GoogleCodeExporter commented 8 years ago
I think there might be a problem when quickly forking a ton of ptys.
Try in engineNest.py line 376 to change "1 or 0" to "1 or 2"

Original comment by lobais on 22 Oct 2007 at 4:09

GoogleCodeExporter commented 8 years ago
now

Original comment by gbtami on 23 Oct 2007 at 5:42

Attachments:

GoogleCodeExporter commented 8 years ago
Once again I declare this bug for fixed.

Original comment by lobais on 23 Oct 2007 at 6:30