USCDataScience / sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
http://irds.usc.edu/sparkler/
Apache License 2.0

Escape metachars in solr queries #5

Closed: thammegowda closed this issue 8 years ago

thammegowda commented 8 years ago
java.lang.RuntimeException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/crawldb: org.apache.solr.search.SyntaxError: Cannot parse 'group:': Encountered "<EOF>" at line 1, column 6.
Was expecting one of:
    <BAREOPER> ...
    "(" ...
    "*" ...
    <QUOTED> ...
    <TERM> ...
    <PREFIXTERM> ...
    <WILDTERM> ...
    <REGEXPTERM> ...
    "[" ...
    "{" ...
    <LPARAMS> ...
    "filter(" ...
    <NUMBER> ...

    at edu.usc.irds.sparkler.util.SolrResultIterator.getNextBean(SolrResultIterator.scala:72)
    at edu.usc.irds.sparkler.util.SolrResultIterator.<init>(SolrResultIterator.scala:57)
    at edu.usc.irds.sparkler.CrawlDbRDD.compute(CrawlDbRDD.scala:55)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
thammegowda commented 8 years ago

Fixed in 5070e36107aa33e42ccb447eb673546abd4c0064
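
The diff for that commit is not shown in this thread. For readers who hit the same `SyntaxError`, here is a minimal sketch of the kind of escaping the issue title asks for. It is not the code from the commit. The class `SolrQueryEscape` and its methods `escape` and `fieldQuery` are hypothetical names chosen for illustration. The character set is hand-written to approximate what SolrJ's `ClientUtils.escapeQueryChars` handles. Emitting an empty quoted phrase for an empty value is an assumption about how to avoid a bare `group:` query, which is the string the parser rejected above.

```java
// Hypothetical helper, not taken from the Sparkler code base.
final class SolrQueryEscape {

    // Characters the Solr/Lucene query parser treats as syntax (assumed set,
    // modelled on SolrJ's ClientUtils.escapeQueryChars).
    private static final String META = "\\+-!():^[]\"{}~*?|&;/";

    // Prefix every metacharacter and whitespace character with a backslash.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (META.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Build "field:value". An empty value becomes an empty quoted phrase so
    // the query never ends right after the colon (assumption: the parser
    // accepts field:"").
    static String fieldQuery(String field, String value) {
        if (value == null || value.isEmpty()) {
            return field + ":\"\"";
        }
        return field + ":" + escape(value);
    }

    public static void main(String[] args) {
        System.out.println(fieldQuery("group", "example.com:8080")); // group:example.com\:8080
        System.out.println(fieldQuery("group", ""));                 // group:""
    }
}
```

When SolrJ is already on the classpath, calling `ClientUtils.escapeQueryChars` directly is likely preferable to maintaining a hand-written character list.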