Closed spooning closed 9 years ago
Originally submitted to Google Code by kai.hackemesser on 31 Jan 2010
Here's a screenshot of how it starts to look after just 10 test cases. The left window shows the Jython process running Jybot, the right one is the Selenium-Server. From my point of view the connections to the selenium server are not reused nor properly closed for garbage collection.
Originally submitted to Google Code by @pekkaklarck on 1 Feb 2010
Have you tested running same tests with pybot?
Originally submitted to Google Code by kai.hackemesser on 1 Feb 2010
That's technically difficult, as we wrote java extension libraries for some steps in the middle. We rely on jybot.
Originally submitted to Google Code by @pekkaklarck on 1 Feb 2010
Could you run at least some of the tests, or parts of them, with pybot? Unfortunately we don't have much time to investigate this at the moment, so the more information you could provide the better.
Originally submitted to Google Code by kai.hackemesser on 1 Feb 2010
I did a test run on pybot now (limited as I had to remove some keywords), and it looks ok there, so the problem is somewhere between jybot and selenium server.
Originally submitted to Google Code by kai.hackemesser on 1 Feb 2010
I could fix that issue for jybot by modifying the file selenium.py. in _docommand there was a new HTTP connection opened for each Selenium command. I made it to a class member instead and it would only created conditionally on the first Selenium server call.
before:
conn = httplib.HTTPConnection(self.host, self.port)
body = u'cmd=' + urllib.quote_plus(unicode(verb).encode('utf-8'))
for i in range(len(args)):
body += '&' + unicode(i+1) + '=' +
urllib.quote_plus(unicode(args[i]).encode('utf-8')) if (None != self.sessionId): body += "&sessionId=" + unicode(self.sessionId) headers = {"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"} conn.request("POST", "/selenium-server/driver/", body, headers)
response = conn.getresponse()
after:
if(None == self.conn):
self.conn = httplib.HTTPConnection(self.host, self.port)
body = u'cmd=' + urllib.quote_plus(unicode(verb).encode('utf-8'))
for i in range(len(args)):
body += '&' + unicode(i+1) + '=' +
urllib.quote_plus(unicode(args[i]).encode('utf-8')) if (None != self.sessionId): body += "&sessionId=" + unicode(self.sessionId) headers = {"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"} self.conn.request("POST", "/selenium-server/driver/", body, headers)
response = self.conn.getresponse()
and in the init method I added:
self.conn = None
Note: there is still no code that would close the connection, but as the library gets usually loaded only once between start and termination this would not mean a leak any more.
As I don't have source code access maybe someone else patches it for the public.
Originally submitted to Google Code by kai.hackemesser on 1 Feb 2010
My colleague mentioned a patch file would be more helpful. Here it is.
Originally submitted to Google Code by @pekkaklarck on 2 Feb 2010
Awesome that you were able to find the root cause and even patch it! Notice that selenium.py is the Selenium Python API provided by the Selenium project. Since we have a version of it bundled to the SeleniumLibrary we can easily include fixes to it into our releases, but we should also send the patch upstream. Do you have time and energy for that?
Janne, could you also check is our version of the selenium.py the latest one?
Originally submitted to Google Code by kai.hackemesser on 2 Feb 2010
I would do so but I'm still testing my patched version. I found one case where my patch renders the SeleniumLibrary useless when the http request failed for any weird reason. A bad response state renders the connection useless. So I need to catch that exception and set the connection to null. I will send a second version of that patch soon. After that, if you provide me a link to the right place I can forward that patch to the Selenium project as well. After all I am a Java developer and my work needs to be reviewed carefully.
Cheers, Kai
Originally submitted to Google Code by @pekkaklarck on 2 Feb 2010
Selenium issue tracker seems to be located at http://jira.openqa.org/browse/SEL
Originally submitted to Google Code by kai.hackemesser on 2 Feb 2010
patch that catches http exception and closes/disposes the connection then
Originally submitted to Google Code by @yanne on 3 Feb 2010
Can you try whether just closing the connection in the end of do_command would be enough. This would simplify the patch considerably. Applying the current patch directly requires changing also the library to get Start Selenium Server working properly, since we don't currently handle HttpException.
Originally submitted to Google Code by kai.hackemesser on 3 Feb 2010
I've tested that before, and that would fix it, too. Anyway, the thrown exception is just exposed here, it would have been thrown from the httplib under different conditions, too.
Originally submitted to Google Code by @pekkaklarck on 3 Feb 2010
The underlying problem here is most likely caused by do_command
relying on Python
garbage collector collecting conn
as soon as do_command
is exited. I assume that
Python socket's are implemented so that they are closed automatically when they are
collected. Because Python uses reference counting garbage collector, it immediately
detects when conn
is not used anymore and thus the code works even though the
connection is never explicitly closed.
The problem appears on Jython because there garbage is collected by the JVM which uses different algorithm than reference counting. That obviously leaves connections open, and when enough of them are opened they run out.
The easiest and safest solution is just adding conn.close()
to do_command
.
Reusing the socket is at least theoretically a better (=faster) solution because
there's no need to open sockets all the time. That is, however, also more risky. As
Janne wrote, Start Selenium Server
stopped working correctly with this patch. It
currently only catches socket.error
and after the patch was applied there was also
HttpException
. We aren't entirely sure why that happens, which makes accepting the
patch in this form a bit risky. We definitely don't want this to break existing tests.
(Notice also that to expose the original error, you should use just raise
and not
raise Exception, self.conn
, especially when you've just executed self.conn = None
.)
Originally submitted to Google Code by kai.hackemesser on 3 Feb 2010
What exactly goes wrong with 'Start Selenium Server'? In our test suite it behaves well with the patch. (Or maybe this is again a Python/Jython issue?)
Of course you are right with the raise, that was silly from me :D I had difficulties to test failings. and I'm not absolutely firm with the syntax yet (I have about 4h experience with python yet)
Originally submitted to Google Code by @pekkaklarck on 3 Feb 2010
Ooops, the problem isn't in start_selenium_server but in open_browser and, more
precisely, in _connect_to_selenium_server. There's a loop that tries to connect to
the server for 10 seconds to handle the situation that the server is slow to start.
The actual connection is done inside except socker.error
and with the patch the
executed code also raised HttpException.
We noticed the problem when running the acceptance tests of the SeleniumLibrary and cannot comment on how often that happens in "real life". For example in your case the server might just start so fast that you don't see the problem at all. Anyway, the change feels a big risky without further investigation.
Do test run faster when you run them with the version where sockets are reused? If
there's no considerable difference, I'd be very happy with simply adding
conn.close()
. That kind of a change ought to be easy to push upstream to the
Selenium project also.
Btw, if you are interested in learning more Python I highly recommend Dive Into
Python book which is freely available at http://diveintopython.org. It's targeted for
people with previous programming experience -- the first example isn't even close to
print "Hello, world!"
. There are some other good references at
https://code.google.com/p/robotframework/wiki/PythonTutorial
Originally submitted to Google Code by kai.hackemesser on 3 Feb 2010
Thanks for the book hint, I will for sure dive into it :D From my subjective point the selenium communication is indeed a bit faster. Not much. I can live with a local conn object and local closing it.
So I'll send a new patch in for this for review.
Originally submitted to Google Code by Andreas.EbbertKarroum on 4 Feb 2010
Thanks :) I was just at the same point in the code and wondered if a conn.close() would be enough. Thanks for providing the fix. Can we have an official new SeleniumLibrary? IMHO it's worth it for this fix alone, as any serious usage is impossible without it.
Originally submitted to Google Code by @yanne on 4 Feb 2010
SeleniumLibrary 2.2.3 will be released next week and will include the modified patch to the remote control (selenium.py).
Originally submitted to Google Code by @pekkaklarck on 4 Feb 2010
Kai, would you be interested in submitting a bug report about this into Selenium tracker too? Feel free to reference this issue.
Originally submitted to Google Code by kai.hackemesser on 4 Feb 2010
Already done, http://jira.openqa.org/browse/SEL-733
Originally submitted to Google Code by @yanne on 8 Feb 2010
The patch, slightly modified, is now applied in the trunk.
I also made a crude test (by grepping the selenium.py), which should catch if we update the remote control without applying the patch.
Thanks Kai for your work on resolving the issue.
Originally submitted to Google Code by @yanne on 8 Feb 2010
r295 contains the patched selenium.py (And due to my misuse of svn, some other changes too)
Originally submitted to Google Code by Andreas.EbbertKarroum on 10 Feb 2010
FYI, I created an issue for the selenium library itself: http://jira.openqa.org/browse/SRC-817
Originally submitted to Google Code by @pekkaklarck on 10 Feb 2010
The underlying bug was already reported as http://jira.openqa.org/browse/SEL-733 but http://jira.openqa.org/browse/SRC-817 is somewhat more describing and also contains fix. Could someone add a note that these issues are duplicates to the Selenium tracker?
Originally submitted to Google Code by @pekkaklarck on 5 Aug 2010
This leak has now been fixed in the official selenium.py version as discussed here: http://groups.google.com/group/robotframework-devel/browse_thread/thread/91df60b5c26b1189
I submitted issue 131 about taking the latest official version into use.
I'm not sure if this is an issue of the selenium library or selenium itself. Could even be Jython.
We have a robot test suite which has grown a lot. somewhere in the middle of the run I get a javs.net.SocketException because no connections are available any more. I used Process Explorer to trace the problem and I found that the Jython process opens thousands of connections to port 4444 (Selenium Server). Many of them are in CLOSE_WAIT state, maybe 50 are ESTABLISHED. Our suite is set up to open the the Browser only once in the overall suite setup, and it will close it in the suite teardown.
So why is the robot creating so many connections to port 4444?
We are using the latest release of robot framework (2.1.2), robot selenium library (2.2.2), jython (2.5.1)
(this was posted previously in https://code.google.com/p/robotframework/issues/detail?id=462 )