ericmckean / chromedriver

Automatically exported from code.google.com/p/chromedriver

Locating by XPath is very slow #991

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I use XPath to locate web elements, but it is very slow. How can I fix it?

Original issue reported on code.google.com by 527901...@qq.com on 5 Dec 2014 at 7:28

GoogleCodeExporter commented 9 years ago
Do you have a small test case that we can verify?
I have used XPath, and it is OK.

Original comment by andrewch...@chromium.org on 5 Dec 2014 at 11:05

GoogleCodeExporter commented 9 years ago
Sorry, my English is not good. I use Selenium.
Test case:
http://s.taobao.com/search?q=%C1%EC%B4%F8&js=1&stats_click=search_radio_all%253A1&initiative_id=staobaoz_20141208

This page may be served with different CSS styles, and I need to find two kinds
of elements, the product and the next-page link, so I use several XPath
expressions. They are very slow.

//*/div/div/h3/a
//*[@id='mainsrp-itemlist']/div/div/div/div/div[3]/a
//*[@id='J_relative']/div[1]/div/div[2]/ul/li[3]/a

Thanks!
Best wishes!

Original comment by 527901...@qq.com on 8 Dec 2014 at 1:55

GoogleCodeExporter commented 9 years ago
Do you have a complete test procedure, including your ready-to-run client
software?

Original comment by andrewch...@chromium.org on 8 Dec 2014 at 6:15

GoogleCodeExporter commented 9 years ago
This is the test case:
http://yunpan.cn/cfcrQTT3Hs4xg (code:4d0d)

This is a Java Maven project. The test class is GetContentPage.class. If you
have any questions, please contact me.
Thanks.

Original comment by 527901...@qq.com on 9 Dec 2014 at 2:02

GoogleCodeExporter commented 9 years ago
Have you tested it? What else can I provide?

Original comment by 527901...@qq.com on 10 Dec 2014 at 8:06

GoogleCodeExporter commented 9 years ago
http://yunpan.cn/cfuEI9nw3uYWN (code:b787)

Please try this one.

Original comment by 527901...@qq.com on 12 Dec 2014 at 3:06

GoogleCodeExporter commented 9 years ago
1. I was able to run it this time.
2. Can you point out which file and line numbers you think are slow?
3. What benchmark makes you feel it is slow?
4. Between XpathTest.zip and XpathTest1.zip, other than the jar files, are
there any code changes?

Original comment by andrewch...@chromium.org on 12 Dec 2014 at 8:06

GoogleCodeExporter commented 9 years ago
There is no difference between XpathTest.zip and XpathTest1.zip, except that
the Chinese text in GetContentPage.java was translated into English.

In GetContentPage.java, locating elements through XPath at line 35 and line 57
is very slow. It takes more than 30 seconds; I think less than 3 seconds would
be acceptable.

Original comment by 527901...@qq.com on 13 Dec 2014 at 3:14
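
The figures quoted above (more than 30 seconds observed, under 3 seconds acceptable) are easier to compare across browsers when each lookup is timed the same way. The following is only a minimal sketch of such a timer; the class name LocatorTimer and its methods are hypothetical and not part of the reporter's project. It uses only the Java standard library, and the Selenium call appears only in a comment as the intended stand-in.

```java
import java.util.function.Supplier;

public class LocatorTimer {
    /** Result of one timed call: the returned value plus elapsed milliseconds. */
    public static final class Timed<T> {
        public final T value;
        public final long millis;

        Timed(T value, long millis) {
            this.value = value;
            this.millis = millis;
        }
    }

    /** Runs the lookup once and measures it with the monotonic System.nanoTime clock. */
    public static <T> Timed<T> time(Supplier<T> lookup) {
        long start = System.nanoTime();
        T value = lookup.get();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Timed<>(value, elapsedMillis);
    }

    /** True when the measured time does not exceed the given budget in milliseconds. */
    public static boolean withinBudget(Timed<?> timed, long budgetMillis) {
        return timed.millis <= budgetMillis;
    }

    public static void main(String[] args) {
        // Stand-in lookup. With Selenium this would be something like:
        //   () -> driver.findElements(By.xpath("//*/div/div/h3/a"))
        Timed<Integer> timed = time(() -> 42);
        System.out.println(timed.millis + " ms, within 3 s budget: "
                + withinBudget(timed, 3000));
    }
}
```

Timing each findElements call separately, rather than the whole test method, would show which locator accounts for the delay.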

GoogleCodeExporter commented 9 years ago
Are lines 35-57 slower compared with Firefox?

Original comment by andrewch...@chromium.org on 15 Dec 2014 at 6:59

GoogleCodeExporter commented 9 years ago
I have compared them. Firefox also took more than 30 seconds.

Original comment by 527901...@qq.com on 17 Dec 2014 at 12:57

GoogleCodeExporter commented 9 years ago
Can you try IE to get a benchmark?
I will look into it today.

Original comment by andrewch...@chromium.org on 17 Dec 2014 at 6:50

GoogleCodeExporter commented 9 years ago

1. Lines 35-57 are skipped, since no item is found.
2. The only slow call I found may be
elements = driver.findElements(By.xpath("//*/div/div/h3/a"));

Original comment by andrewch...@chromium.org on 18 Dec 2014 at 12:55

GoogleCodeExporter commented 9 years ago
The CSS of this web page has many styles, and the page may be served with a
different style each time it is opened, so I check for two cases. The XPath
expressions are mainly used to find the product URLs and the next-page URL; I
want to collect the products by paging through automatically. At present, the
XPath lookups for products are very slow.

Original comment by 527901...@qq.com on 18 Dec 2014 at 1:00

GoogleCodeExporter commented 9 years ago
Locating by XPath in IE is slower than in Chrome and Firefox.

Original comment by 527901...@qq.com on 18 Dec 2014 at 1:03

GoogleCodeExporter commented 9 years ago
How do I solve this problem now?

Original comment by 527901...@qq.com on 18 Dec 2014 at 1:15

GoogleCodeExporter commented 9 years ago
You have
List<WebElement> reloadBtns = driver.findElements(By.id("reload-button"));
which is slow as well, regardless of XPath.
The JavaScript needs to search all the elements in the DOM.

You also have the following, which are very fast:

elements = driver.findElements(By.xpath("//*[@id='mainsrp-itemlist']/div/div/div/div/div[3]/a"));

Line 118   nextPageElements = driver.findElements(By.xpath("//*[@id='J_relative']/div[1]/div/div[2]/ul/li[3]/a"));

L.63       List<WebElement> merchants = driver.findElements(By.xpath("//*[@id='mainsrp-itemlist']/div/div/div/div["+i+"]/div[4]/div[1]/a/span[2]"));

Can you change the algorithm and see whether there is any improvement?

Original comment by andrewch...@chromium.org on 18 Dec 2014 at 6:32

GoogleCodeExporter commented 9 years ago
In fact, I am not familiar with XPath syntax; these XPath expressions were
generated by a tool. Changing them would be a little difficult for me. Are
there any other solutions?

Original comment by 527901...@qq.com on 26 Dec 2014 at 9:17

GoogleCodeExporter commented 9 years ago
The algorithm is very important in XML-based systems like XPath.
Tool-generated code is inefficient most of the time.
Tweaking XPath is a big subject. The following documents may help:
http://www.ccs.neu.edu/home/lieber/tech/scardina.pdf
http://stackoverflow.com/questions/3782618/xpath-evaluate-performance-slows-down-absurdly-over-multiple-calls

Original comment by andrewch...@chromium.org on 26 Dec 2014 at 6:49

GoogleCodeExporter commented 9 years ago
Have you resolved the issue?

Original comment by andrewch...@chromium.org on 9 Jan 2015 at 6:27

GoogleCodeExporter commented 9 years ago
For now I locate the products manually. After I finish the other functions, I
will have free time to modify the XPath. Overall, I still have not solved the
problem.

Original comment by 527901...@qq.com on 10 Jan 2015 at 12:46

GoogleCodeExporter commented 9 years ago
I am looking into this issue. No XPath search should take that long.
I may get in touch with 527901908@qq.com directly.
I have a few questions to ask.

Original comment by andrewch...@chromium.org on 10 Jan 2015 at 12:55

GoogleCodeExporter commented 9 years ago
Just ask me. I am very willing to cooperate in order to complete this project.

Original comment by 527901...@qq.com on 10 Jan 2015 at 12:59

GoogleCodeExporter commented 9 years ago
1. 527901908@qq.com is my account for http://www.taobao.com
2. I use a Chrome development tool called XPath tool. The areas I extract are
the merchant name and the product link, because I need to find the product
link that belongs to a specified company name and then click through to it.

Original comment by 527901...@qq.com on 10 Jan 2015 at 1:37

GoogleCodeExporter commented 9 years ago
You have
reloadBtns = driver.findElements(By.id("reload-button"));
in
http://s.taobao.com/search?q=%C1%EC%B4%F8&commend=all&ssid=s5-e&search_type=item&sourceId=tb.index&spm=1.7274553.1997520841.1&initiative_id=tbindexz_20141209

How do you (or the tool) get "reload-button"?
I cannot find it.

Original comment by andrewch...@chromium.org on 10 Jan 2015 at 1:48

GoogleCodeExporter commented 9 years ago
This is not a taobao.com button. It appears when the Chrome browser fails to
load a web page. Sometimes the page fails to load because of the network, so I
have to refresh the page until it loads successfully.

Original comment by 527901...@qq.com on 10 Jan 2015 at 2:14
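
The "refresh until the page loads" step described above can be kept bounded so that a permanently failing page does not loop forever. The following is a minimal sketch with hypothetical names (ReloadUntilLoaded, loadFailed, refresh), not the reporter's actual code. With Selenium, loadFailed could be something like `() -> !driver.findElements(By.id("reload-button")).isEmpty()` and refresh could be `() -> driver.navigate().refresh()`. One thing worth checking, as an assumption rather than a confirmed diagnosis for this project: if an implicit wait is configured, a findElements call that matches nothing blocks until that timeout elapses, which would make the "button absent" case the slow one.

```java
import java.util.function.BooleanSupplier;

public class ReloadUntilLoaded {
    /**
     * Calls refresh while loadFailed reports true, making at most maxAttempts
     * refreshes. Returns true if the page ended up loaded.
     */
    public static boolean reloadUntilLoaded(BooleanSupplier loadFailed,
                                            Runnable refresh,
                                            int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (!loadFailed.getAsBoolean()) {
                return true; // no error button present: page loaded
            }
            refresh.run();
        }
        // Check once more after the last refresh.
        return !loadFailed.getAsBoolean();
    }

    public static void main(String[] args) {
        // Simulated page that fails to load twice before succeeding.
        int[] failuresLeft = {2};
        boolean loaded = reloadUntilLoaded(
                () -> failuresLeft[0] > 0,
                () -> failuresLeft[0]--,
                5);
        System.out.println("loaded: " + loaded);
    }
}
```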

GoogleCodeExporter commented 9 years ago
Can you write a small standalone program (only a few lines), based on your
application, that contains either of these

L.156   goods = driver.findElements(By.xpath("//*/div/div/h3/a"));

L.36.  elements = driver.findElements(By.xpath("//*/div/div/h3/a"));

and see whether you can reproduce the case?

Original comment by andrewch...@chromium.org on 17 Jan 2015 at 2:02

GoogleCodeExporter commented 9 years ago
This is the test case.
http://yunpan.cn/cyhuDKyj24EdQ (code:cd5c)

Original comment by 527901...@qq.com on 19 Jan 2015 at 3:27

GoogleCodeExporter commented 9 years ago
1. Good, this will simplify the process.
2. Based on the page, how do you generate "//*/div/div/h3/a" with your tool?
   I would like to repeat the steps.

Original comment by andrewch...@chromium.org on 19 Jan 2015 at 6:52

GoogleCodeExporter commented 9 years ago
I found that this website has changed its CSS style, so the XPath can no longer
locate any elements. I used a tool called XPath tool 0.0.1. All of my XPath
expressions were generated by this tool. The elements I need to locate are the
product name and URL and the company name.

Original comment by 527901...@qq.com on 20 Jan 2015 at 3:33

GoogleCodeExporter commented 9 years ago
Based on their changes, can you modify

L.156   goods = driver.findElements(By.xpath("//*/div/div/h3/a"));

L.36.  elements = driver.findElements(By.xpath("//*/div/div/h3/a"));

L31     List<WebElement> reloadBtns = driver.findElements(By.id("reload-button"));

to meet your needs?

Original comment by andrewch...@chromium.org on 20 Jan 2015 at 6:26

GoogleCodeExporter commented 9 years ago
//*[@id="J_itemlistCont"]/div/div[3] meets my needs.
List<WebElement> reloadBtns = driver.findElements(By.id("reload-button"));
This button appears when the Chrome browser fails to load a web page.

Original comment by 527901...@qq.com on 21 Jan 2015 at 2:14

GoogleCodeExporter commented 9 years ago
//*[@id='J_itemlistCont']/div/div[3] seems very fast.
Will it replace //*/div/div/h3/a in the following?
L.156   goods = driver.findElements(By.xpath("//*/div/div/h3/a"));
L.36.  elements = driver.findElements(By.xpath("//*/div/div/h3/a"));

If so, you have only one problem left:
L31     List<WebElement> reloadBtns = driver.findElements(By.id("reload-button"));

Original comment by andrewch...@chromium.org on 21 Jan 2015 at 10:48

GoogleCodeExporter commented 9 years ago
Yes, //*[@id='J_itemlistCont']/div/div[3] will replace //*/div/div/h3/a. It
seems it is best for an XPath to begin with an ID.
As for List<WebElement> reloadBtns = driver.findElements(By.id("reload-button"));,
it seems that you understand what I mean. That page is created by Chrome, not
by taobao.com. So all the problems are now solved.
Thank you very much!

Original comment by 527901...@qq.com on 22 Jan 2015 at 12:39
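
The conclusion above, that an XPath is best anchored at an ID, can be illustrated without a browser. The following is a minimal sketch using the JDK's own javax.xml.xpath on hypothetical, heavily simplified markup; the real taobao.com page structure is not reproduced, and the class name XPathAnchorDemo is an assumption. It shows only the difference in what the two styles of expression select, not their speed in ChromeDriver: the unanchored `//*/div/div/h3/a` considers every element in the document as a starting point, while the ID-anchored form starts from a single container.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathAnchorDemo {
    // Hypothetical markup: one navigation link plus two product links.
    static final String PAGE =
        "<html><body>"
        + "<div id='nav'><div><div><h3><a href='/help'>Help</a></h3></div></div></div>"
        + "<div id='J_itemlistCont'>"
        + "<div><div><h3><a href='/item/1'>Tie A</a></h3></div></div>"
        + "<div><div><h3><a href='/item/2'>Tie B</a></h3></div></div>"
        + "</div>"
        + "</body></html>";

    /** Parses PAGE and returns how many nodes the expression selects. */
    static int count(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(PAGE.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Unanchored: also matches the navigation link.
        System.out.println(count("//*/div/div/h3/a"));
        // Anchored at the list container: matches only the product links.
        System.out.println(count("//*[@id='J_itemlistCont']/div/div/h3/a"));
    }
}
```

On this sample the unanchored expression selects three links and the anchored one selects two, so the anchored form is also less likely to pick up unrelated elements when the page layout changes.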

GoogleCodeExporter commented 9 years ago
Your driver.findElements(By.id("reload-button"))
will still be executed. Are you going to remove it?

Original comment by andrewch...@chromium.org on 22 Jan 2015 at 1:43

GoogleCodeExporter commented 9 years ago
https://support.google.com/chrome/answer/113910
"reload-button" is a button on Chrome's own built-in page.

Original comment by 527901...@qq.com on 22 Jan 2015 at 2:51

GoogleCodeExporter commented 9 years ago
You work very seriously. Do you work at Google?

Original comment by 527901...@qq.com on 22 Jan 2015 at 6:06

GoogleCodeExporter commented 9 years ago
yes

Original comment by andrewch...@chromium.org on 22 Jan 2015 at 6:41

GoogleCodeExporter commented 9 years ago

Original comment by andrewch...@chromium.org on 22 Jan 2015 at 11:08

GoogleCodeExporter commented 9 years ago

Original comment by samu...@chromium.org on 21 Feb 2015 at 12:26