Norconex / crawlers

Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem to various data repositories such as search engines.
https://opensource.norconex.com/crawlers
Apache License 2.0

How do I use chromedriver in Collector 3.0? #733

Closed: OkkeKlein closed this issue 3 years ago

OkkeKlein commented 3 years ago

I managed to get Firefox running. To see whether some issues I have with Firefox would be resolved by using Google Chrome, I tried Chrome as well, but I could not get it working.

Google Chrome 88.0.4324.96 and webdriver

<browserPath>/opt/google/chrome/google-chrome</browserPath>
<driverPath>chromedriver</driverPath>

Example URL: https://www.voordeelvloeren.nl/faq/onderwerp/top-10

essiembre commented 3 years ago

The solution in #732 does not appear to work for me here. I am missing the FAQs on your page. I told it to wait for elements of your FAQs, such as h6 or app-faq, without success. It is as if the FAQs are not loaded by the web driver. I noticed the timeouts specified are not respected either. I will have to investigate it more. Let me know if you find something on your end in the meantime.

OkkeKlein commented 3 years ago

Before I can try anything with the FAQ, I need to have the setup working with Chrome. Or is it only the wait that needs to be added?

essiembre commented 3 years ago

Setting timeout values does not always work as expected, depending on the web driver used. I managed to find a workaround by having the crawler itself wait for a few seconds while the web driver/browser is rendering. I was able to get your FAQs that way. This can hopefully be a viable solution for #732 and other pages with timing issues. I just released a new snapshot where you can add this to your WebDriverHttpFetcher configuration section:

<threadWait>2 seconds</threadWait>

2 seconds was enough for me.
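
A minimal sketch of where this setting might sit, assuming the fetcher is declared under `<httpFetchers>` using the `WebDriverHttpFetcher` class name that appears in the stack traces in this thread. The `<browser>` value and the surrounding elements are illustrative assumptions, not copied from a tested configuration:

```xml
<httpFetchers>
  <fetcher class="com.norconex.collector.http.fetch.impl.webdriver.WebDriverHttpFetcher">
    <!-- Assumed browser selection; Firefox is the browser reported as working here. -->
    <browser>firefox</browser>
    <!-- Have the crawler thread pause while the browser renders the page. -->
    <threadWait>2 seconds</threadWait>
  </fetcher>
</httpFetchers>
```

The right wait value likely depends on how long the page takes to render its dynamic content, so it may need tuning per site.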

OkkeKlein commented 3 years ago

I managed to get the FAQ with Firefox and threadWait. However, when trying to use Google Chrome, I never get it to work. I am still using the versions mentioned in the first comment.

essiembre commented 3 years ago

Not sure why it does not work for you with chrome. I am able to successfully crawl it with the exact same chrome driver and browser versions. I tried on Windows. Does it work for you on Windows? I wonder if you only experience this on a specific OS.

Do you get any errors? What do you get?

OkkeKlein commented 3 years ago

On Windows I can crawl with Chrome no problem.

On Linux, using the configuration below, I get the following errors:

`
<browser>chrome</browser>
<driverPath>chromedriver</driverPath>
<threadWait>2 seconds</threadWait>

10:14:02.445 [Norconex Minimum Test Page/1] INFO CRAWLER_RUN_THREAD_BEGIN - Thread[Norconex Minimum Test Page/1,5,main]
10:14:02.447 [Norconex Minimum Test Page/1] INFO Browser - Creating local "ChromeDriver" web driver.
10:14:02.448 [Norconex Minimum Test Page/2] INFO CRAWLER_RUN_THREAD_BEGIN - Thread[Norconex Minimum Test Page/2,5,main]
Starting ChromeDriver 88.0.4324.96 (68dba2d8a0b149a1d3afac56fa74648032bcf46b-refs/branch-heads/4324@{#1784}) on port 16098
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
10:14:03.344 [Norconex Minimum Test Page/2] INFO Browser - Creating local "ChromeDriver" web driver.
10:14:03.344 [Norconex Minimum Test Page/1] ERROR Crawler - Problem in thread execution.
com.norconex.collector.core.CollectorException: Could not build web driver
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverBuilder.build(Browser.java:237) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverSupplier.get(Browser.java:181) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.impl.webdriver.WebDriverHolder.getDriver(WebDriverHolder.java:74) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.impl.webdriver.WebDriverHttpFetcher.fetcherThreadBegin(WebDriverHttpFetcher.java:242) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.AbstractHttpFetcher.accept(AbstractHttpFetcher.java:127) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.AbstractHttpFetcher.accept(AbstractHttpFetcher.java:76) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.commons.lang.event.EventManager.doFire(EventManager.java:136) ~[norconex-commons-lang-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at com.norconex.commons.lang.event.EventManager.fire(EventManager.java:117) ~[norconex-commons-lang-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at com.norconex.commons.lang.event.EventManager.fire(EventManager.java:111) ~[norconex-commons-lang-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at com.norconex.collector.core.crawler.Crawler$ProcessReferencesRunnable.run(Crawler.java:992) [norconex-collector-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_275]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_275]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_275]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_275]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_275]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_275]
    at org.apache.commons.lang3.reflect.ConstructorUtils.invokeExactConstructor(ConstructorUtils.java:182) ~[commons-lang3-3.11.jar:3.11]
    at org.apache.commons.lang3.reflect.ConstructorUtils.invokeExactConstructor(ConstructorUtils.java:149) ~[commons-lang3-3.11.jar:3.11]
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverBuilder.lambda$build$0(Browser.java:232) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.commons.lang.SystemUtil.callWithProperty(SystemUtil.java:118) ~[norconex-commons-lang-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverBuilder.build(Browser.java:222) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    ... 12 more
Caused by: org.openqa.selenium.WebDriverException: unknown error: Chrome failed to start: exited abnormally.
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Build info: version: 3.141.59, revision: e82be7d358, time: 2018-11-14T08:17:03
System info: host: bmc-dev, ip: 127.0.1.1, os.name: Linux, os.arch: amd64, os.version: 5.4.0-51-generic, java.version: 1.8.0_275
Driver info: driver.version: ChromeDriver
remote stacktrace: #0 0x559216d4c199

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_275]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_275]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_275]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_275]
    at org.openqa.selenium.remote.W3CHandshakeResponse.lambda$errorHandler$0(W3CHandshakeResponse.java:62) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.HandshakeResponse.lambda$getResponseFunction$0(HandshakeResponse.java:30) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.ProtocolHandshake.lambda$createSession$0(ProtocolHandshake.java:126) ~[selenium-remote-driver-3.141.59.jar:?]
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_275]
    at java.util.Spliterators$ArraySpliterator.tryAdvance(Spliterators.java:958) ~[?:1.8.0_275]
    at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126) ~[?:1.8.0_275]
    at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499) ~[?:1.8.0_275]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486) ~[?:1.8.0_275]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_275]
    at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152) ~[?:1.8.0_275]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_275]
    at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:531) ~[?:1.8.0_275]
    at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:128) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:74) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:136) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:83) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:552) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:213) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:131) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:181) ~[selenium-chrome-driver-3.141.59.jar:?]
    at org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:168) ~[selenium-chrome-driver-3.141.59.jar:?]
    at org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:157) ~[selenium-chrome-driver-3.141.59.jar:?]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_275]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_275]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_275]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_275]
    at org.apache.commons.lang3.reflect.ConstructorUtils.invokeExactConstructor(ConstructorUtils.java:182) ~[commons-lang3-3.11.jar:3.11]
    at org.apache.commons.lang3.reflect.ConstructorUtils.invokeExactConstructor(ConstructorUtils.java:149) ~[commons-lang3-3.11.jar:3.11]
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverBuilder.lambda$build$0(Browser.java:232) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.commons.lang.SystemUtil.callWithProperty(SystemUtil.java:118) ~[norconex-commons-lang-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverBuilder.build(Browser.java:222) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    ... 12 more

10:14:03.370 [Norconex Minimum Test Page/1] INFO CRAWLER_RUN_THREAD_END - Thread[Norconex Minimum Test Page/1,5,main]
10:14:03.371 [Norconex Minimum Test Page/1] INFO WebDriverHttpFetcher - Shutting down CHROME web driver.
Starting ChromeDriver 88.0.4324.96 (68dba2d8a0b149a1d3afac56fa74648032bcf46b-refs/branch-heads/4324@{#1784}) on port 18035
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
10:14:03.504 [Norconex Minimum Test Page/2] ERROR Crawler - Problem in thread execution.
com.norconex.collector.core.CollectorException: Could not build web driver
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverBuilder.build(Browser.java:237) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverSupplier.get(Browser.java:181) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.impl.webdriver.WebDriverHolder.getDriver(WebDriverHolder.java:74) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.impl.webdriver.WebDriverHttpFetcher.fetcherThreadBegin(WebDriverHttpFetcher.java:242) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.AbstractHttpFetcher.accept(AbstractHttpFetcher.java:127) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.AbstractHttpFetcher.accept(AbstractHttpFetcher.java:76) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.commons.lang.event.EventManager.doFire(EventManager.java:136) ~[norconex-commons-lang-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at com.norconex.commons.lang.event.EventManager.fire(EventManager.java:117) ~[norconex-commons-lang-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at com.norconex.commons.lang.event.EventManager.fire(EventManager.java:111) ~[norconex-commons-lang-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at com.norconex.collector.core.crawler.Crawler$ProcessReferencesRunnable.run(Crawler.java:992) [norconex-collector-core-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_275]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_275]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_275]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_275]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_275]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_275]
    at org.apache.commons.lang3.reflect.ConstructorUtils.invokeExactConstructor(ConstructorUtils.java:182) ~[commons-lang3-3.11.jar:3.11]
    at org.apache.commons.lang3.reflect.ConstructorUtils.invokeExactConstructor(ConstructorUtils.java:149) ~[commons-lang3-3.11.jar:3.11]
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverBuilder.lambda$build$0(Browser.java:232) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.commons.lang.SystemUtil.callWithProperty(SystemUtil.java:118) ~[norconex-commons-lang-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverBuilder.build(Browser.java:222) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    ... 12 more
Caused by: org.openqa.selenium.WebDriverException: unknown error: Chrome failed to start: exited abnormally.
  (unknown error: DevToolsActivePort file doesn't exist)
  (The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Build info: version: 3.141.59, revision: e82be7d358, time: 2018-11-14T08:17:03
System info: host: bmc-dev, ip: 127.0.1.1, os.name: Linux, os.arch: amd64, os.version: 5.4.0-51-generic, java.version: 1.8.0_275
Driver info: driver.version: ChromeDriver
remote stacktrace: #0 0x5604e949b199

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_275]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_275]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_275]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_275]
    at org.openqa.selenium.remote.W3CHandshakeResponse.lambda$errorHandler$0(W3CHandshakeResponse.java:62) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.HandshakeResponse.lambda$getResponseFunction$0(HandshakeResponse.java:30) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.ProtocolHandshake.lambda$createSession$0(ProtocolHandshake.java:126) ~[selenium-remote-driver-3.141.59.jar:?]
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_275]
    at java.util.Spliterators$ArraySpliterator.tryAdvance(Spliterators.java:958) ~[?:1.8.0_275]
    at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126) ~[?:1.8.0_275]
    at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499) ~[?:1.8.0_275]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486) ~[?:1.8.0_275]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_275]
    at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152) ~[?:1.8.0_275]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:1.8.0_275]
    at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:531) ~[?:1.8.0_275]
    at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:128) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:74) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:136) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:83) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:552) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:213) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:131) ~[selenium-remote-driver-3.141.59.jar:?]
    at org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:181) ~[selenium-chrome-driver-3.141.59.jar:?]
    at org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:168) ~[selenium-chrome-driver-3.141.59.jar:?]
    at org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:157) ~[selenium-chrome-driver-3.141.59.jar:?]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_275]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_275]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_275]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_275]
    at org.apache.commons.lang3.reflect.ConstructorUtils.invokeExactConstructor(ConstructorUtils.java:182) ~[commons-lang3-3.11.jar:3.11]
    at org.apache.commons.lang3.reflect.ConstructorUtils.invokeExactConstructor(ConstructorUtils.java:149) ~[commons-lang3-3.11.jar:3.11]
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverBuilder.lambda$build$0(Browser.java:232) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at com.norconex.commons.lang.SystemUtil.callWithProperty(SystemUtil.java:118) ~[norconex-commons-lang-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
    at com.norconex.collector.http.fetch.impl.webdriver.Browser$WebDriverBuilder.build(Browser.java:222) ~[norconex-collector-http-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    ... 12 more

`

essiembre commented 3 years ago

Thank you for your logs.

Are you running as root? If so, try with a regular user. Also, try with the full paths to chrome and the chrome driver.
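
As a sketch of the full-path suggestion, the two path settings from the first comment could be written as below. `/usr/bin/google-chrome` is the Chrome location reported in the error log above; `/usr/local/bin/chromedriver` is a hypothetical location, to be replaced with wherever chromedriver is actually installed:

```xml
<!-- Absolute paths, so resolution does not depend on the PATH of the user running the crawler. -->
<browserPath>/usr/bin/google-chrome</browserPath>
<!-- Hypothetical location: adjust to your installation. -->
<driverPath>/usr/local/bin/chromedriver</driverPath>
```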

This appears to be the culprit:

(unknown error: DevToolsActivePort file doesn't exist)

I did a bit of research and it appears to be a frequent problem with chrome on Linux. E.g.: https://stackoverflow.com/questions/50790733/unknown-error-devtoolsactiveport-file-doesnt-exist-error-while-executing-selen/50791503

I suggest you try a few of the fixes you find when researching that error online. If some involve passing options via the crawler XML configuration, you can do so with:

  <capabilities>
    <capability name="(capability name)">(capability value)</capability>
    <!-- multiple "capability" tags allowed -->
  </capabilities>

OkkeKlein commented 3 years ago

I tried everything I could think of. It is a bit hard to see what's happening, as there is not much logging. I am not even sure the capabilities were passed.

With Firefox as a working alternative I'm gonna put a pin in this one.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.