USCDataScience / sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
http://irds.usc.edu/sparkler/
Apache License 2.0

Unable to run jBrowser plugin #160

Closed by micheladennis 6 years ago

micheladennis commented 6 years ago

Issue Description

Trying to run the jBrowser plugin on a local computer.

Expected Behavior

Crawl the page and capture its JavaScript-rendered content.

Encountered Behavior

java.lang.NoClassDefFoundError: com/sun/webkit/network/CookieManager
    at com.machinepublishers.jbrowserdriver.JBrowserDriverServer.main(JBrowserDriverServer.java:74)
...
org.openqa.selenium.WebDriverException: Could not launch browser.
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: 'michelad-Lenovo-YOGA-3-Pro-1370', ip: '127.0.1.1', os.name: 'Linux', os.arch: 'amd64', os.version: '4.4.0-127-generic', java.version: '1.8.0_171'
Driver info: driver.version: JBrowserDriver
...
Caused by: java.lang.IllegalStateException: Could not launch browser.
...
2018-05-31 12:04:47 WARN ThrowableSerializationWrapper:174 [task-result-getter-1] - Task exception could not be deserialized
java.lang.ClassNotFoundException: org.openqa.selenium.WebDriverException
...
    org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
....

How to reproduce it

  1. Update sparkler-default.yaml
    • Remove the "#" in front of "#jbrowser" under "plugins.active:".
  2. Console:
    • Run the following command in the Sparkler home folder: mvn clean package
  3. Inject, then crawl a URL.

Environment and Version Information

Contributing

If you'd like to help us fix the issue by contributing some code, but would like guidance or help in doing so, please mention it!

micheladennis commented 6 years ago

Referencing: https://github.com/MachinePublishers/jBrowserDriver/issues/186

micheladennis commented 6 years ago

Problem solved by installing openjfx on the server:

sudo apt-get install openjfx

Seems to be an Ubuntu problem.
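For anyone hitting the same error, one way to confirm whether the JavaFX WebKit classes are visible to the JVM is to try loading the class named in the NoClassDefFoundError. The following is a minimal sketch, not part of Sparkler or jBrowserDriver. The class name JavaFxCheck is hypothetical, and the check is only meaningful if it is compiled and run with the same Java runtime that launches Sparkler.

```java
// Hypothetical diagnostic helper; not part of Sparkler or jBrowserDriver.
public class JavaFxCheck {

    // Class named in the NoClassDefFoundError from the report above.
    static final String WEBKIT_CLASS = "com.sun.webkit.network.CookieManager";

    /** Returns true if the named class can be located by this JVM's class loaders. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, JavaFxCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (isClassAvailable(WEBKIT_CLASS)) {
            System.out.println("Found " + WEBKIT_CLASS + ": JavaFX WebKit is on the classpath.");
        } else {
            System.out.println("Missing " + WEBKIT_CLASS + ": JavaFX (e.g. the openjfx package) is probably not installed for this JVM.");
        }
    }
}
```

Compile with javac JavaFxCheck.java and run with java JavaFxCheck. If it reports the class as missing, the JVM has no JavaFX runtime available, which matches the failure in the stack trace.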

chrismattmann commented 6 years ago

Thanks! Should we update our Docker image, @micheladennis? Is it there too?

Displee commented 4 years ago

I have not gotten this to work with Docker yet. I tried several approaches:

Approach 1: https://hub.docker.com/r/rburgst/java8-openjfx-docker/dockerfile
Approach 2: Java 8 with Alpine Linux, as described here: https://stackoverflow.com/a/49094932/4653927
Approach 3: https://github.com/rburgst/java8-openjfx-docker/blob/master/Dockerfile

None of them work. I keep getting the encountered behavior described by @micheladennis.