jslhs / pyrit

Automatically exported from code.google.com/p/pyrit

Pyrit list_cores killing all server connections. #249

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Install Pyrit via SVN or tarball.
2. Use NVIDIA 260.x.x drivers (260.19.29, 260.19.36, etc.).
3. Use the CUDA release from the same version / time frame as the driver.
4. Run x86 Debian Lenny with kernel 2.6.37.
5. Set up the exact same environment (drive imaged) on a second laptop of the 
same model (Dell E6410), and set up a Sun server running the AMD 64-bit version 
of Linux with the same drivers (64-bit builds), etc.
6. Add them all to /root/.pyrit/config under rpc_known_hosts.

What is the expected output? What do you see instead?
I should see all the cores from all machines (3 total).
Instead, this happens (instantly) on all machines running "pyrit serve":
Pyrit 0.4.0-dev (svn r288) (C) 2008-2010 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+

Serving 0 active clients; 0 PMKs/s; 0.0 TTS Traceback (most recent call last):
  File "/usr/bin/pyrit", line 6, in <module>
    pyrit_cli.Pyrit_CLI().initFromArgv()
  File "/usr/lib/python2.5/site-packages/pyrit_cli.py", line 115, in initFromArgv
    func(self, **options)
  File "/usr/lib/python2.5/site-packages/pyrit_cli.py", line 877, in serve
    server.addClient(addr)
  File "/usr/lib/python2.5/site-packages/cpyrit/network.py", line 140, in addClient
    client = NetworkClient(srv_addr, self.enqueue, known_uuids)
  File "/usr/lib/python2.5/site-packages/cpyrit/network.py", line 68, in __init__
    self.srv_uuid, self.uuid = self.server.register(";".join(known_uuids))
  File "/usr/lib/python2.5/xmlrpclib.py", line 1147, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.5/xmlrpclib.py", line 1437, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.5/xmlrpclib.py", line 1185, in request
    errcode, errmsg, headers = h.getreply()
  File "/usr/lib/python2.5/httplib.py", line 1199, in getreply
    response = self._conn.getresponse()
  File "/usr/lib/python2.5/httplib.py", line 928, in getresponse
    response.begin()
  File "/usr/lib/python2.5/httplib.py", line 385, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.5/httplib.py", line 343, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.5/socket.py", line 372, in readline
    data = recv(1)
socket.error: (104, 'Connection reset by peer')

What version of the product are you using? On what operating system?
Debian Lenny. Version 0.3.0 failed, and so did the latest SVN 0.4.0-dev.
Here is an example config file from one of the servers:

default_storage = file://
limit_ncpus = 0
rpc_announce = true
rpc_announce_broadcast = false
rpc_knownclients = 192.168.1.2 192.168.1.4
rpc_server = true
workunit_size = 75000
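
To double-check which peers a node will try to contact, the `key = value` lines above can be read back with a few lines of Python. This is only a sketch for inspecting the file by hand, not Pyrit's own config parser; the function names and the handling of `#` comment lines are assumptions made for illustration.

```python
# Hypothetical helper for inspecting a Pyrit-style config file by hand.
# This is NOT Pyrit's parser; it only assumes "key = value" lines.

SAMPLE_CONFIG = """\
default_storage = file://
limit_ncpus = 0
rpc_announce = true
rpc_announce_broadcast = false
rpc_knownclients = 192.168.1.2 192.168.1.4
rpc_server = true
workunit_size = 75000
"""


def parse_config(text):
    """Return a dict of options from 'key = value' lines.

    Blank lines and lines starting with '#' are skipped (the '#' rule
    is an assumption, not something confirmed by the thread).
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '=' sign
            options[key.strip()] = value.strip()
    return options


def known_clients(options):
    """Split the whitespace-separated rpc_knownclients value into hosts."""
    return options.get("rpc_knownclients", "").split()
```

Calling `known_clients(parse_config(SAMPLE_CONFIG))` gives the list of addresses this node is configured to reach, which can then be compared against the addresses the other machines actually use.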

Please provide any additional information below.

Original issue reported on code.google.com by weaknetlabs on 28 Jan 2011 at 2:19

GoogleCodeExporter commented 8 years ago
This problem also occurs with svn r280 (BackTrack 4 R2, NVIDIA 260.19.29).

Original comment by andor...@gmail.com on 2 Feb 2011 at 2:37

GoogleCodeExporter commented 8 years ago
I'm also having issues with this. list_cores triggers it (I have a thread open 
on this issue as well). batch will not run either; it returns similar errors:

Serving 1 active clients; 0 PMKs/s; 0.0 TTS Exception in thread Thread-276:
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 525, in __bootstrap_inner
    self.run()
  File "/usr/local/lib/python2.6/dist-packages/cpyrit/network.py", line 50, in run
    self.server.gather(self.client.uuid, 5000)
  File "/usr/lib/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.6/xmlrpclib.py", line 1253, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib/python2.6/xmlrpclib.py", line 1392, in _parse_response
    return u.close()
  File "/usr/lib/python2.6/xmlrpclib.py", line 838, in close
    raise Fault(**self._stack[0])
Fault: <Fault 403: 'Client unknown or timed-out'>

Original comment by thefixe...@gmail.com on 3 Feb 2011 at 9:10

GoogleCodeExporter commented 8 years ago
Same error on my box; see below:

Serving 0 active clients; 0 PMKs/s; 0.0 TTS Traceback (most recent call last):
  File "/usr/local/bin/pyrit", line 6, in <module>
    pyrit_cli.Pyrit_CLI().initFromArgv()
  File "/usr/local/lib/python2.6/dist-packages/pyrit_cli.py", line 115, in initFromArgv
    func(self, **options)
  File "/usr/local/lib/python2.6/dist-packages/pyrit_cli.py", line 877, in serve
    server.addClient(addr)
  File "/usr/local/lib/python2.6/dist-packages/cpyrit/network.py", line 140, in addClient
    client = NetworkClient(srv_addr, self.enqueue, known_uuids)
  File "/usr/local/lib/python2.6/dist-packages/cpyrit/network.py", line 68, in __init__
    self.srv_uuid, self.uuid = self.server.register(";".join(known_uuids))
  File "/usr/lib/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.6/xmlrpclib.py", line 1237, in request
    errcode, errmsg, headers = h.getreply()
  File "/usr/lib/python2.6/httplib.py", line 1048, in getreply
    response = self._conn.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 974, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.6/socket.py", line 397, in readline
    data = recv(1)
socket.error: [Errno 104] Connection reset by peer

Original comment by thefixe...@gmail.com on 3 Feb 2011 at 9:12

GoogleCodeExporter commented 8 years ago
I have tried adjusting the work unit size from 5000 all the way up to 200,000, 
as well as adjusting the other settings in the conf, on both the client and the 
head node. Nothing I have found fixes this issue.

Original comment by thefixe...@gmail.com on 3 Feb 2011 at 10:23

GoogleCodeExporter commented 8 years ago
I tried checking Wireshark for errors, since pyrit/pycuda seems to be so awful 
at being verbose. I was able to connect on port 17935 (NOT 19935, as the 
documentation says) and see the error. Finally, with both machines tethered via 
one Ethernet cable, I was able to run "pyrit serve" on both and got them to 
play nice (somewhat). I ran "pyrit benchmark" on one machine, and the second 
said there were 2 active clients and then started serving 150 PMKs/s, where it 
should normally be 1900 (?). Then it stops and just sits there saying 
"Calibrating..." on the first machine, the one I ran the benchmark command on.

Original comment by weaknetlabs on 5 Feb 2011 at 5:55
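
For anyone repeating the port check from the comment above without Wireshark, a plain TCP connect test is often enough to tell whether a "pyrit serve" node is reachable at all. The sketch below uses only the standard `socket` module and is written for Python 3 (the thread itself runs Python 2.5/2.6). The function name is made up for illustration, and 17935 is simply the port reported in the comment, not a value checked against the Pyrit source.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Any connection failure (refused, unreachable, timed out) is
    reported as False rather than raised.
    """
    try:
        # create_connection resolves the host and applies the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("192.168.1.2", 17935)` run from one node against another shows whether the listening port is reachable before any Pyrit traffic is involved. It says nothing about whether the XML-RPC layer on top of it works.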

GoogleCodeExporter commented 8 years ago
Yeah, the farthest I can get is this: if I run the self test, then benchmark, 
and THEN batch, it will start running and I'll see 150+ PMKs on the node. Then 
it goes to 0 and eventually I get the errors we posted above.

But that was only one time, and ONLY if I run the self test, then benchmark, 
THEN batch. Regardless, something's not working properly.

I'm using the default file:// setting for the db. I wonder if this issue would 
persist with a dedicated db server. This is what I will try next.

Original comment by thefixe...@gmail.com on 6 Feb 2011 at 2:38

GoogleCodeExporter commented 8 years ago
Hey douglas@weaknet, are you using any kind of HTTP proxy? I see the errors 
here:

  File "/usr/lib/python2.6/httplib.py", line 1048, in getreply
    response = self._conn.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 974, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()

This leads me to believe that this may be some kind of HTTP error caused by a 
proxy.

I'm using Squid 3.0 and this MAY be the cause, I do not know. Does Pyrit use 
HTTP traffic to communicate between nodes?

Original comment by thefixe...@gmail.com on 6 Feb 2011 at 4:49
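
On the question above of whether Pyrit talks HTTP between nodes: every traceback in this thread passes through xmlrpclib and then httplib, so the `register` and `gather` calls are XML-RPC requests, and XML-RPC is carried over HTTP POST. The sketch below shows that on the loopback interface using the Python 3 standard library (`xmlrpc.client` is the Python 3 name for xmlrpclib). The server here is a stand-in written for this example, not Pyrit's server; its `register` function and the placeholder UUID strings it returns are made up.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCRequestHandler, SimpleXMLRPCServer

# (HTTP method, path, Content-Type) of every request the stand-in server sees.
seen_requests = []


class RecordingHandler(SimpleXMLRPCRequestHandler):
    """Record the HTTP request line details before handling the RPC."""

    def do_POST(self):
        seen_requests.append(
            (self.command, self.path, self.headers.get("Content-Type"))
        )
        super().do_POST()


def fake_register(known_uuids):
    """Hypothetical stand-in for a register call; returns placeholder ids."""
    return ["server-uuid-placeholder", "client-uuid-placeholder"]


def demo():
    """Run one XML-RPC call against a local server and report what was sent."""
    server = SimpleXMLRPCServer(
        ("127.0.0.1", 0), requestHandler=RecordingHandler, logRequests=False
    )
    server.register_function(fake_register, "register")
    port = server.server_address[1]
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    try:
        proxy = ServerProxy("http://127.0.0.1:%d/" % port)
        reply = proxy.register("")  # travels as an HTTP POST with an XML body
    finally:
        server.shutdown()
        server.server_close()
    return reply, seen_requests[-1]
```

`demo()` returns the RPC reply together with the recorded HTTP method, path, and content type, which makes it visible that the call is an ordinary HTTP request. Anything sitting in the network path that inspects or rewrites HTTP could therefore, in principle, interfere with traffic of this kind.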

GoogleCodeExporter commented 8 years ago
This probably happens because the client/server connection is shut down too 
fast. I'll look into it.

Original comment by lukas.l...@gmail.com on 9 Feb 2011 at 8:13
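
If the connection really is being closed too early, as suggested above, one client-side stopgap is to retry a call that fails with "Connection reset by peer" (errno 104, ECONNRESET). The following is a generic Python 3 sketch of that idea, not a patch to Pyrit and not a fix for the underlying cause; `call_with_retry` and its parameters are hypothetical names.

```python
import errno
import time


def call_with_retry(func, attempts=3, delay=0.5):
    """Call func(), retrying only when the peer resets the connection.

    Other errors propagate immediately. If every attempt is reset,
    the last ECONNRESET error is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except OSError as exc:
            if exc.errno != errno.ECONNRESET:
                raise  # not a reset: do not mask unrelated failures
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)  # brief pause before the next try
    raise last_error
```

A call such as `call_with_retry(lambda: proxy.register(""))` would then survive an occasional reset, where `proxy` stands for whatever XML-RPC proxy object the caller already holds. If the server resets every connection, as reported in this thread, retrying only delays the same error.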

GoogleCodeExporter commented 8 years ago
Thank you Lukas

Original comment by fi...@weaknetlabs.com on 12 Feb 2011 at 6:51

GoogleCodeExporter commented 8 years ago
The issue persists; it has not been fixed as of Pyrit 0.4.1-dev (svn r297).

Original comment by mrfantas...@aol.com on 17 Feb 2011 at 11:24

GoogleCodeExporter commented 8 years ago
Hey Lukas, since Pyrit's networking appears to be broken, what would be the 
result of two separate machines running pyrit batch on the same db? Would 
smoke pour from my speakers and my CPU explode?

Original comment by mrfantas...@aol.com on 24 Feb 2011 at 5:18

GoogleCodeExporter commented 8 years ago
I got the same problem.

Original comment by laurent....@gmail.com on 2 Jun 2011 at 10:03

GoogleCodeExporter commented 8 years ago
same problem here

Original comment by mdoerin...@googlemail.com on 2 Jul 2011 at 3:07

GoogleCodeExporter commented 8 years ago
Same problem after configuring clients and servers.  
Thanks Lukas!

Original comment by Bcaudil...@gmail.com on 15 Aug 2011 at 4:34

GoogleCodeExporter commented 8 years ago
Is there any progress on this? I can't get pyrit to work this way.

Original comment by aser...@gmail.com on 19 Aug 2011 at 5:48

GoogleCodeExporter commented 8 years ago
socket.error: (104, 'Connection reset by peer')

Is anyone looking into this?

Original comment by paulala....@gmail.com on 30 Aug 2011 at 6:40

GoogleCodeExporter commented 8 years ago
Issue not resolved; we need some wizard C/Python expert to come in and fix it.

Original comment by frozenpo...@aol.com on 24 Oct 2011 at 10:19

GoogleCodeExporter commented 8 years ago
I have the same problem using Pyrit r308. As far as I can see, other people are 
able to do it, which means we are missing something simple.

Original comment by Almaz...@gmail.com on 30 Oct 2011 at 7:45

GoogleCodeExporter commented 8 years ago
I seem to be getting the same errors for build r308 also.

Original comment by punk...@gmail.com on 5 Feb 2012 at 8:04

GoogleCodeExporter commented 8 years ago
It would seem that some top secret guberment agency has secretly abducted our 
hero Lukas, SOMEONE PLEASE PICK UP ON THE PYRIT PROJECT!!!?!!?!?

Original comment by MarkyPoo...@gmail.com on 3 Sep 2012 at 8:10

GoogleCodeExporter commented 8 years ago
same problem!

Original comment by Walther....@gmail.com on 10 Apr 2015 at 1:13