lucidworks / spark-solr

Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.
Apache License 2.0

Retrieving schema details fails #41

Closed. hakanilter closed this issue 8 years ago.

hakanilter commented 8 years ago

Hello,

I noticed that the code gets an "HTTP 414 Request-URI Too Long" error when trying to retrieve schema details if the index has lots of fields. Is there a workaround for this issue?

2016-03-21 11:49:37 ERROR SolrQuerySupport:385 - Can't get field metadata from Solr using request 'http://myhost:8983/solr/collection/schema/fields?showDefaults=true&includeDynamic=true&fl=xxx,yyy,zzz,<MORE AND MORE FIELDS>,&wt=json] failed due to: HTTP/1.1 414 Request-URI Too Long: 
    at com.lucidworks.spark.util.SolrJsonSupport$.doJsonRequest(SolrJsonSupport.scala:90)
    at com.lucidworks.spark.util.SolrJsonSupport$.getJson(SolrJsonSupport.scala:71)
    at com.lucidworks.spark.util.SolrJsonSupport$.getJson(SolrJsonSupport.scala:34)
    at com.lucidworks.spark.util.SolrQuerySupport$.getFieldDefinitionsFromSchema(SolrQuerySupport.scala:365)
    at com.lucidworks.spark.util.SolrQuerySupport$.getFieldTypes(SolrQuerySupport.scala:259)
    at com.lucidworks.spark.util.SolrQuerySupport$.getFieldTypes(SolrQuerySupport.scala:253)
    at com.lucidworks.spark.util.SolrSchemaUtil$.getBaseSchema(SolrSchemaUtil.scala:22)
    at com.lucidworks.spark.SolrRelation.<init>(SolrRelation.scala:66)
    at com.lucidworks.spark.SolrRelation.<init>(SolrRelation.scala:37)
    at solr.DefaultSource.createRelation(DefaultSource.scala:12)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
    at example.SolrExample$.main(SolrExample.scala:16)
    at example.SolrExample.main(SolrExample.scala)
kiranchitturi commented 8 years ago

@hakanilter I see that this is a current bug on the master branch. I am working on fixing it soon, but as a workaround, could you explicitly specify the fields you want to retrieve from Solr?

For example, the code below retrieves the fields 'id' and 'text' from Solr:

val options = Map(
  "zkHost" -> "localhost:9983",
  "collection" -> "socialdata",
  "fields" -> "id, text")
val df = sqlContext.read.format("solr").options(options).load()
hakanilter commented 8 years ago

@kiranchitturi I didn't set the fields option, but I also tried setting it and nothing changed.

The problem occurs when the getFieldDefinitionsFromSchema method in the SolrQuerySupport class tries to fetch field details from Solr via a GET request. As far as I understand, the code should retrieve the field details with more than one request.

hakanilter commented 8 years ago

@kiranchitturi I've renamed the getFieldDefinitionsFromSchema method to getFieldDefinitionsFromSchemaPartially and added the following method to get past the error:

def getFieldDefinitionsFromSchema(solrUrl: String, fieldNames: Set[String]): Map[String, Any] = {
  // Request the field definitions in batches of 10 so each request URI stays short
  var details: Map[String, Any] = Map()
  fieldNames.grouped(10).foreach(fields => {
    details = details ++ getFieldDefinitionsFromSchemaPartially(solrUrl, fields)
  })
  details
}
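
(Equivalently, the same batching can be written without the mutable var; a minimal sketch, reusing the getFieldDefinitionsFromSchemaPartially helper above:)

def getFieldDefinitionsFromSchema(solrUrl: String, fieldNames: Set[String]): Map[String, Any] =
  // Fold each batch's partial schema response into one field-definition map
  fieldNames.grouped(10).foldLeft(Map.empty[String, Any]) { (acc, batch) =>
    acc ++ getFieldDefinitionsFromSchemaPartially(solrUrl, batch)
  }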

However, this time I got a different error:

/solr/product_collection_01_shard2_replica3: Can not search using both cursorMark and timeAllowed

kiranchitturi commented 8 years ago

I didn't set the fields option, but I also tried setting it and nothing changed.

Thanks for commenting on this. I just added a commit that restricts the schema call to the fields provided in the config.

Using the 'fields' config should be a workaround for the 'Request-URI Too Long' exception.

kiranchitturi commented 8 years ago

However, this time I got a different error: /solr/product_collection_01_shard2_replica3: Can not search using both cursorMark and timeAllowed

Can you show me the SQL query that generated this error?

hakanilter commented 8 years ago

Using the 'fields' config should be a workaround for the 'Request-URI Too Long' exception.

Thanks, it works well.

Can you show me the SQL query that generated this error?

The timeAllowed parameter is set by default in our search handlers. Here is my query:

sqlContext.sql("select count(*) from product").collect().foreach(row => println(row.getLong(0)))
kiranchitturi commented 8 years ago

However, this time I got a different error: /solr/product_collection_01_shard2_replica3: Can not search using both cursorMark and timeAllowed

Quick googling gave me these two links

val options = Map(
  "zkHost" -> "localhost:9983",
  "collection" -> "socialdata",
  "solr.timeAllowed" -> "0")

Could you build from the latest master, try the above, and see how it goes?

hakanilter commented 8 years ago

Could you build from the latest master, try the above, and see how it goes?

I tried this with the latest master; unfortunately, I got the same error.

kiranchitturi commented 8 years ago

Can you share the query and the logs from the Spark client and Solr?

hakanilter commented 8 years ago

I'm using Spark 1.6.0, by the way; please find my code and the logs below:

    val zkHosts = "zookeeper1:2181,zookeeper2:2181,zookeeper3:2181"
    val collection = "product_collection_01"
    val fields = "productId, title"

    val options = Map(
        "zkHost" -> zkHosts, 
        "collection" -> collection,        
        "fields" -> fields,
        "solr.timeAllowed" -> "0"
    ) 
    val docs = sqlContext.read.format("solr").options(options).load
    docs.registerTempTable("product")

    sqlContext.sql("select count(*) from product").collect().foreach(row => println(row.getLong(0)))

Logs:

2016-03-25 15:27:44 INFO  SparkContext:58 - Running Spark version 1.6.0
2016-03-25 15:27:45 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-03-25 15:27:45 WARN  Utils:70 - Your hostname, mycomputer resolves to a loopback address: 127.0.0.1; using 10.238.236.149 instead (on interface en0)
2016-03-25 15:27:45 WARN  Utils:70 - Set SPARK_LOCAL_IP if you need to bind to another address
2016-03-25 15:27:45 INFO  SecurityManager:58 - Changing view acls to: hakan
2016-03-25 15:27:45 INFO  SecurityManager:58 - Changing modify acls to: hakan
2016-03-25 15:27:45 INFO  SecurityManager:58 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hakan); users with modify permissions: Set(hakan)
2016-03-25 15:27:46 INFO  Utils:58 - Successfully started service 'sparkDriver' on port 65398.
2016-03-25 15:27:46 INFO  Slf4jLogger:80 - Slf4jLogger started
2016-03-25 15:27:46 INFO  Remoting:74 - Starting remoting
2016-03-25 15:27:46 INFO  Remoting:74 - Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.238.236.149:65399]
2016-03-25 15:27:46 INFO  Utils:58 - Successfully started service 'sparkDriverActorSystem' on port 65399.
2016-03-25 15:27:46 INFO  SparkEnv:58 - Registering MapOutputTracker
2016-03-25 15:27:47 INFO  SparkEnv:58 - Registering BlockManagerMaster
2016-03-25 15:27:47 INFO  DiskBlockManager:58 - Created local directory at /private/var/folders/tm/g1rplzrd2qg33f2_k9tcn01438z5jv/T/blockmgr-2b41dee8-046f-4782-94d0-661fbfd9f006
2016-03-25 15:27:47 INFO  MemoryStore:58 - MemoryStore started with capacity 2.4 GB
2016-03-25 15:27:47 INFO  SparkEnv:58 - Registering OutputCommitCoordinator
2016-03-25 15:27:47 INFO  Server:272 - jetty-8.y.z-SNAPSHOT
2016-03-25 15:27:47 INFO  AbstractConnector:338 - Started SelectChannelConnector@0.0.0.0:4040
2016-03-25 15:27:47 INFO  Utils:58 - Successfully started service 'SparkUI' on port 4040.
2016-03-25 15:27:47 INFO  SparkUI:58 - Started SparkUI at http://10.238.236.149:4040
2016-03-25 15:27:47 INFO  Executor:58 - Starting executor ID driver on host localhost
2016-03-25 15:27:47 INFO  Utils:58 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 65400.
2016-03-25 15:27:47 INFO  NettyBlockTransferService:58 - Server created on 65400
2016-03-25 15:27:47 INFO  BlockManagerMaster:58 - Trying to register BlockManager
2016-03-25 15:27:47 INFO  BlockManagerMasterEndpoint:58 - Registering block manager localhost:65400 with 2.4 GB RAM, BlockManagerId(driver, localhost, 65400)
2016-03-25 15:27:47 INFO  BlockManagerMaster:58 - Registered BlockManager
2016-03-25 15:27:48 INFO  SolrZkClient:211 - Using default ZkCredentialsProvider
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:host.name=localhost
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:java.version=1.7.0_75
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:java.vendor=Oracle Corporation
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.7.0_75.jdk/Contents/Home/jre
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:java.class.path=/Users/hakan/dev/workspace/git/spark-jobs/target/classes:/Users/hakan/dev/workspace/git/spark-jobs/target/test-classes:/Users/hakan/.m2/repository/org/apache/spark/spark-core_2.10/1.6.0/spark-core_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/avro/avro-mapred/1.7.7/avro-mapred-1.7.7-hadoop2.jar:/Users/hakan/.m2/repository/org/apache/avro/avro-ipc/1.7.7/avro-ipc-1.7.7.jar:/Users/hakan/.m2/repository/org/apache/avro/avro-ipc/1.7.7/avro-ipc-1.7.7-tests.jar:/Users/hakan/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.13/jackson-core-asl-1.9.13.jar:/Users/hakan/.m2/repository/com/twitter/chill_2.10/0.5.0/chill_2.10-0.5.0.jar:/Users/hakan/.m2/repository/com/esotericsoftware/kryo/kryo/2.21/kryo-2.21.jar:/Users/hakan/.m2/repository/com/esotericsoftware/reflectasm/reflectasm/1.07/reflectasm-1.07-shaded.jar:/Users/hakan/.m2/repository/com/esotericsoftware/minlog/minlog/1.2/minlog-1.2.jar:/Users/hakan/.m2/repository/org/objenesis/objenesis/1.2/objenesis-1.2.jar:/Users/hakan/.m2/repository/com/twitter/chill-java/0.5.0/chill-java-0.5.0.jar:/Users/hakan/.m2/repository/org/apache/xbean/xbean-asm5-shaded/4.4/xbean-asm5-shaded-4.4.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-launcher_2.10/1.6.0/spark-launcher_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-network-common_2.10/1.6.0/spark-network-common_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-network-shuffle_2.10/1.6.0/spark-network-shuffle_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/fusesource/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar:/Users/hakan/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.4.4/jackson-annotations-2.4.4.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-unsafe_2.10/1.6.0/spark-unsafe_2.10-1.6.0.jar:/Users/hakan/.m2/repository/net/java/dev/jets3t/jets3t/0.7.1/jets3t-0.7.1.jar:/Users/hakan/.m2/repository/org/apache/curator/curator-recipes/2.4.0/curator-recipes-2.4.0.jar:/Users/hakan/.m2/repository/org/apache/curator/curator-framework/2.4.0/curator-framework-2.4.0.jar:/Users/hakan/.m2/repository/com/google/guava/guava/14.0.1/guava-14.0.1.jar:/Users/hakan/.m2/repository/org/eclipse/jetty/orbit/javax.servlet/3.0.0.v201112011016/javax.servlet-3.0.0.v201112011016.jar:/Users/hakan/.m2/repository/org/apache/commons/commons-lang3/3.3.2/commons-lang3-3.3.2.jar:/Users/hakan/.m2/repository/org/apache/commons/commons-math3/3.4.1/commons-math3-3.4.1.jar:/Users/hakan/.m2/repository/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.jar:/Users/hakan/.m2/repository/org/slf4j/jul-to-slf4j/1.7.10/jul-to-slf4j-1.7.10.jar:/Users/hakan/.m2/repository/org/slf4j/jcl-over-slf4j/1.7.10/jcl-over-slf4j-1.7.10.jar:/Users/hakan/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/Users/hakan/.m2/repository/com/ning/compress-lzf/1.0.3/compress-lzf-1.0.3.jar:/Users/hakan/.m2/repository/net/jpountz/lz4/lz4/1.3.0/lz4-1.3.0.jar:/Users/hakan/.m2/repository/org/roaringbitmap/RoaringBitmap/0.5.11/RoaringBitmap-0.5.11.jar:/Users/hakan/.m2/repository/commons-net/commons-net/2.2/commons-net-2.2.jar:/Users/hakan/.m2/repository/com/typesafe/akka/akka-remote_2.10/2.3.11/akka-remote_2.10-2.3.11.jar:/Users/hakan/.m2/repository/com/typesafe/akka/akka-actor_2.10/2.3.11/akka-actor_2.10-2.3.11.jar:/Users/hakan/.m2/repository/com/typesafe/config/1.2.1/config-1.2.1.jar:/Users/hakan/.m2/repository/io/netty/netty/3.8.0.Final/netty-3.8.0.Final.jar:/Users/hakan/.m2/repository/com/google/protobuf/protobuf-java/2.5.0
/protobuf-java-2.5.0.jar:/Users/hakan/.m2/repository/org/uncommons/maths/uncommons-maths/1.2.2a/uncommons-maths-1.2.2a.jar:/Users/hakan/.m2/repository/com/typesafe/akka/akka-slf4j_2.10/2.3.11/akka-slf4j_2.10-2.3.11.jar:/Users/hakan/.m2/repository/org/json4s/json4s-jackson_2.10/3.2.10/json4s-jackson_2.10-3.2.10.jar:/Users/hakan/.m2/repository/org/json4s/json4s-core_2.10/3.2.10/json4s-core_2.10-3.2.10.jar:/Users/hakan/.m2/repository/org/json4s/json4s-ast_2.10/3.2.10/json4s-ast_2.10-3.2.10.jar:/Users/hakan/.m2/repository/org/scala-lang/scalap/2.10.0/scalap-2.10.0.jar:/Users/hakan/.m2/repository/org/scala-lang/scala-compiler/2.10.0/scala-compiler-2.10.0.jar:/Users/hakan/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar:/Users/hakan/.m2/repository/asm/asm/3.1/asm-3.1.jar:/Users/hakan/.m2/repository/com/sun/jersey/jersey-core/1.9/jersey-core-1.9.jar:/Users/hakan/.m2/repository/org/apache/mesos/mesos/0.21.1/mesos-0.21.1-shaded-protobuf.jar:/Users/hakan/.m2/repository/io/netty/netty-all/4.0.29.Final/netty-all-4.0.29.Final.jar:/Users/hakan/.m2/repository/com/clearspring/analytics/stream/2.7.0/stream-2.7.0.jar:/Users/hakan/.m2/repository/io/dropwizard/metrics/metrics-core/3.1.2/metrics-core-3.1.2.jar:/Users/hakan/.m2/repository/io/dropwizard/metrics/metrics-jvm/3.1.2/metrics-jvm-3.1.2.jar:/Users/hakan/.m2/repository/io/dropwizard/metrics/metrics-json/3.1.2/metrics-json-3.1.2.jar:/Users/hakan/.m2/repository/io/dropwizard/metrics/metrics-graphite/3.1.2/metrics-graphite-3.1.2.jar:/Users/hakan/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.4.4/jackson-databind-2.4.4.jar:/Users/hakan/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.4.4/jackson-core-2.4.4.jar:/Users/hakan/.m2/repository/com/fasterxml/jackson/module/jackson-module-scala_2.10/2.4.4/jackson-module-scala_2.10-2.4.4.jar:/Users/hakan/.m2/repository/org/scala-lang/scala-reflect/2.10.4/scala-reflect-2.10.4.jar:/Users/hakan/.m2/repository/com/thoughtworks/paranamer/paranamer/2.6/paranamer-2.6.jar:/Users/hakan/.m2/repository/org/apache/ivy/ivy/2.4.0/ivy-2.4.0.jar:/Users/hakan/.m2/repository/oro/oro/2.0.8/oro-2.0.8.jar:/Users/hakan/.m2/repository/org/tachyonproject/tachyon-client/0.8.2/tachyon-client-0.8.2.jar:/Users/hakan/.m2/repository/commons-io/commons-io/2.4/commons-io-2.4.jar:/Users/hakan/.m2/repository/org/tachyonproject/tachyon-underfs-hdfs/0.8.2/tachyon-underfs-hdfs-0.8.2.jar:/Users/hakan/.m2/repository/org/tachyonproject/tachyon-underfs-s3/0.8.2/tachyon-underfs-s3-0.8.2.jar:/Users/hakan/.m2/repository/org/tachyonproject/tachyon-underfs-local/0.8.2/tachyon-underfs-local-0.8.2.jar:/Users/hakan/.m2/repository/net/razorvine/pyrolite/4.9/pyrolite-4.9.jar:/Users/hakan/.m2/repository/net/sf/py4j/py4j/0.9/py4j-0.9.jar:/Users/hakan/.m2/repository/org/spark-project/spark/unused/1.0.0/unused-1.0.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-sql_2.10/1.6.0/spark-sql_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-catalyst_2.10/1.6.0/spark-catalyst_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/codehaus/janino/janino/2.7.8/janino-2.7.8.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-column/1.7.0/parquet-column-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-common/1.7.0/parquet-common-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-encoding/1.7.0/parquet-encoding-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-generator/1.7.0/parquet-generator-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-hadoop/1.7.0/pa
rquet-hadoop-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-format/2.3.0-incubating/parquet-format-2.3.0-incubating.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-jackson/1.7.0/parquet-jackson-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-hive_2.10/1.6.0/spark-hive_2.10-1.6.0.jar:/Users/hakan/.m2/repository/com/twitter/parquet-hadoop-bundle/1.6.0/parquet-hadoop-bundle-1.6.0.jar:/Users/hakan/.m2/repository/org/spark-project/hive/hive-exec/1.2.1.spark/hive-exec-1.2.1.spark.jar:/Users/hakan/.m2/repository/javolution/javolution/5.5.1/javolution-5.5.1.jar:/Users/hakan/.m2/repository/log4j/apache-log4j-extras/1.2.17/apache-log4j-extras-1.2.17.jar:/Users/hakan/.m2/repository/org/antlr/antlr-runtime/3.4/antlr-runtime-3.4.jar:/Users/hakan/.m2/repository/org/antlr/stringtemplate/3.2.1/stringtemplate-3.2.1.jar:/Users/hakan/.m2/repository/antlr/antlr/2.7.7/antlr-2.7.7.jar:/Users/hakan/.m2/repository/org/antlr/ST4/4.0.4/ST4-4.0.4.jar:/Users/hakan/.m2/repository/org/apache/commons/commons-compress/1.4.1/commons-compress-1.4.1.jar:/Users/hakan/.m2/repository/org/tukaani/xz/1.0/xz-1.0.jar:/Users/hakan/.m2/repository/org/codehaus/groovy/groovy-all/2.1.6/groovy-all-2.1.6.jar:/Users/hakan/.m2/repository/com/googlecode/javaewah/JavaEWAH/0.3.2/JavaEWAH-0.3.2.jar:/Users/hakan/.m2/repository/org/iq80/snappy/snappy/0.2/snappy-0.2.jar:/Users/hakan/.m2/repository/org/json/json/20090211/json-20090211.jar:/Users/hakan/.m2/repository/stax/stax-api/1.0.1/stax-api-1.0.1.jar:/Users/hakan/.m2/repository/net/sf/opencsv/opencsv/2.3/opencsv-2.3.jar:/Users/hakan/.m2/repository/jline/jline/2.12/jline-2.12.jar:/Users/hakan/.m2/repository/org/spark-project/hive/hive-metastore/1.2.1.spark/hive-metastore-1.2.1.spark.jar:/Users/hakan/.m2/repository/com/jolbox/bonecp/0.8.0.RELEASE/bonecp-0.8.0.RELEASE.jar:/Users/hakan/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar:/Users/hakan/.m2/repository/commons-logging/commons-logging/1.1.3/commons-logging-1.1.3.jar:/Users/hakan/.m2/repository/org/apache/derby/derby/10.10.2.0/derby-10.10.2.0.jar:/Users/hakan/.m2/repository/org/datanucleus/datanucleus-api-jdo/3.2.6/datanucleus-api-jdo-3.2.6.jar:/Users/hakan/.m2/repository/org/datanucleus/datanucleus-rdbms/3.2.9/datanucleus-rdbms-3.2.9.jar:/Users/hakan/.m2/repository/commons-pool/commons-pool/1.5.4/commons-pool-1.5.4.jar:/Users/hakan/.m2/repository/commons-dbcp/commons-dbcp/1.4/commons-dbcp-1.4.jar:/Users/hakan/.m2/repository/javax/jdo/jdo-api/3.0.1/jdo-api-3.0.1.jar:/Users/hakan/.m2/repository/javax/transaction/jta/1.1/jta-1.1.jar:/Users/hakan/.m2/repository/org/apache/avro/avro/1.7.7/avro-1.7.7.jar:/Users/hakan/.m2/repository/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar:/Users/hakan/.m2/repository/org/apache/calcite/calcite-avatica/1.2.0-incubating/calcite-avatica-1.2.0-incubating.jar:/Users/hakan/.m2/repository/org/apache/calcite/calcite-core/1.2.0-incubating/calcite-core-1.2.0-incubating.jar:/Users/hakan/.m2/repository/org/apache/calcite/calcite-linq4j/1.2.0-incubating/calcite-linq4j-1.2.0-incubating.jar:/Users/hakan/.m2/repository/net/hydromatic/eigenbase-properties/1.1.5/eigenbase-properties-1.1.5.jar:/Users/hakan/.m2/repository/org/codehaus/janino/commons-compiler/2.7.6/commons-compiler-2.7.6.jar:/Users/hakan/.m2/repository/org/apache/httpcomponents/httpclient/4.3.2/httpclient-4.3.2.jar:/Users/hakan/.m2/repository/org/apache/httpcomponents/httpcore/4.3.1/httpcore-4.3.1.jar:/Users/hakan/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.9.13/jackson-ma
pper-asl-1.9.13.jar:/Users/hakan/.m2/repository/commons-codec/commons-codec/1.10/commons-codec-1.10.jar:/Users/hakan/.m2/repository/joda-time/joda-time/2.9/joda-time-2.9.jar:/Users/hakan/.m2/repository/org/jodd/jodd-core/3.5.2/jodd-core-3.5.2.jar:/Users/hakan/.m2/repository/org/datanucleus/datanucleus-core/3.2.10/datanucleus-core-3.2.10.jar:/Users/hakan/.m2/repository/org/apache/thrift/libthrift/0.9.2/libthrift-0.9.2.jar:/Users/hakan/.m2/repository/org/apache/thrift/libfb303/0.9.2/libfb303-0.9.2.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-mllib_2.10/1.6.0/spark-mllib_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-streaming_2.10/1.6.0/spark-streaming_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-graphx_2.10/1.6.0/spark-graphx_2.10-1.6.0.jar:/Users/hakan/.m2/repository/com/github/fommil/netlib/core/1.1.2/core-1.1.2.jar:/Users/hakan/.m2/repository/net/sourceforge/f2j/arpack_combined_all/0.1/arpack_combined_all-0.1.jar:/Users/hakan/.m2/repository/org/scalanlp/breeze_2.10/0.11.2/breeze_2.10-0.11.2.jar:/Users/hakan/.m2/repository/org/scalanlp/breeze-macros_2.10/0.11.2/breeze-macros_2.10-0.11.2.jar:/Users/hakan/.m2/repository/org/scalamacros/quasiquotes_2.10/2.0.0-M8/quasiquotes_2.10-2.0.0-M8.jar:/Users/hakan/.m2/repository/com/github/rwl/jtransforms/2.4.0/jtransforms-2.4.0.jar:/Users/hakan/.m2/repository/org/spire-math/spire_2.10/0.7.4/spire_2.10-0.7.4.jar:/Users/hakan/.m2/repository/org/spire-math/spire-macros_2.10/0.7.4/spire-macros_2.10-0.7.4.jar:/Users/hakan/.m2/repository/org/jpmml/pmml-model/1.1.15/pmml-model-1.1.15.jar:/Users/hakan/.m2/repository/org/jpmml/pmml-agent/1.1.15/pmml-agent-1.1.15.jar:/Users/hakan/.m2/repository/org/jpmml/pmml-schema/1.1.15/pmml-schema-1.1.15.jar:/Users/hakan/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.7/jaxb-impl-2.2.7.jar:/Users/hakan/.m2/repository/com/sun/xml/bind/jaxb-core/2.2.7/jaxb-core-2.2.7.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-streaming-kafka_2.10/1.6.0/spark-streaming-kafka_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/kafka/kafka_2.10/0.8.2.1/kafka_2.10-0.8.2.1.jar:/Users/hakan/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar:/Users/hakan/.m2/repository/org/apache/kafka/kafka-clients/0.8.2.1/kafka-clients-0.8.2.1.jar:/Users/hakan/.m2/repository/org/apache/zookeeper/zookeeper/3.4.6/zookeeper-3.4.6.jar:/Users/hakan/.m2/repository/net/sf/jopt-simple/jopt-simple/3.2/jopt-simple-3.2.jar:/Users/hakan/.m2/repository/com/101tec/zkclient/0.3/zkclient-0.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-client/2.5.0-cdh5.3.3/hadoop-client-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-common/2.5.0-cdh5.3.3/hadoop-common-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/xmlenc/xmlenc/0.52/xmlenc-0.52.jar:/Users/hakan/.m2/repository/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar:/Users/hakan/.m2/repository/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.jar:/Users/hakan/.m2/repository/commons-digester/commons-digester/1.8/commons-digester-1.8.jar:/Users/hakan/.m2/repository/commons-beanutils/commons-beanutils/1.7.0/commons-beanutils-1.7.0.jar:/Users/hakan/.m2/repository/commons-beanutils/commons-beanutils-core/1.8.0/commons-beanutils-core-1.8.0.jar:/Users/hakan/.m2/repository/com/google/code/gson/gson/2.2.4/gson-2.2.4.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-auth/2.5.0-cdh5.3.3/hadoop-auth-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/directory/serv
er/apacheds-kerberos-codec/2.0.0-M15/apacheds-kerberos-codec-2.0.0-M15.jar:/Users/hakan/.m2/repository/org/apache/directory/server/apacheds-i18n/2.0.0-M15/apacheds-i18n-2.0.0-M15.jar:/Users/hakan/.m2/repository/org/apache/directory/api/api-asn1-api/1.0.0-M20/api-asn1-api-1.0.0-M20.jar:/Users/hakan/.m2/repository/org/apache/directory/api/api-util/1.0.0-M20/api-util-1.0.0-M20.jar:/Users/hakan/.m2/repository/org/apache/curator/curator-client/2.6.0/curator-client-2.6.0.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-hdfs/2.5.0-cdh5.3.3/hadoop-hdfs-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/mortbay/jetty/jetty-util/6.1.26.cloudera.4/jetty-util-6.1.26.cloudera.4.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-app/2.5.0-cdh5.3.3/hadoop-mapreduce-client-app-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-common/2.5.0-cdh5.3.3/hadoop-mapreduce-client-common-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-yarn-client/2.5.0-cdh5.3.3/hadoop-yarn-client-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/com/sun/jersey/jersey-client/1.9/jersey-client-1.9.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-yarn-server-common/2.5.0-cdh5.3.3/hadoop-yarn-server-common-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-shuffle/2.5.0-cdh5.3.3/hadoop-mapreduce-client-shuffle-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-yarn-api/2.5.0-cdh5.3.3/hadoop-yarn-api-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-core/2.5.0-cdh5.3.3/hadoop-mapreduce-client-core-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-yarn-common/2.5.0-cdh5.3.3/hadoop-yarn-common-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/javax/xml/bind/jaxb-api/2.2.2/jaxb-api-2.2.2.jar:/Users/hakan/.m2/repository/javax/xml/stream/stax-api/1.0-2/stax-api-1.0-2.jar:/Users/hakan/.m2/repository/javax/activation/activation/1.1/activation-1.1.jar:/Users/hakan/.m2/repository/javax/servlet/servlet-api/2.5/servlet-api-2.5.jar:/Users/hakan/.m2/repository/org/codehaus/jackson/jackson-jaxrs/1.8.8/jackson-jaxrs-1.8.8.jar:/Users/hakan/.m2/repository/org/codehaus/jackson/jackson-xc/1.8.8/jackson-xc-1.8.8.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-jobclient/2.5.0-cdh5.3.3/hadoop-mapreduce-client-jobclient-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-aws/2.5.0-cdh5.3.3/hadoop-aws-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-annotations/2.5.0-cdh5.3.3/hadoop-annotations-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/mysql/mysql-connector-java/5.1.34/mysql-connector-java-5.1.34.jar:/Users/hakan/.m2/repository/net/zemberek/zemberek-cekirdek/2.1.3.1/zemberek-cekirdek-2.1.3.1.jar:/Users/hakan/.m2/repository/net/zemberek/zemberek-tr/2.1.3/zemberek-tr-2.1.3.jar:/Users/hakan/.m2/repository/org/xerial/snappy/snappy-java/1.0.5/snappy-java-1.0.5.jar:/Users/hakan/.m2/repository/commons-lang/commons-lang/2.6/commons-lang-2.6.jar:/Users/hakan/.m2/repository/org/slf4j/slf4j-api/1.7.10/slf4j-api-1.7.10.jar:/Users/hakan/.m2/repository/org/slf4j/slf4j-log4j12/1.7.10/slf4j-log4j12-1.7.10.jar:/Users/hakan/.m2/repository/junit/junit/4.11/junit-4.11.jar:/Users/hakan/.m2/repository/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar:/Users/hakan/.m2/repository/org/specs2/specs2_2.10/1.13/specs2_
2.10-1.13.jar:/Users/hakan/.m2/repository/org/specs2/scalaz-core_2.10/7.0.0/scalaz-core_2.10-7.0.0.jar:/Users/hakan/.m2/repository/org/specs2/scalaz-concurrent_2.10/7.0.0/scalaz-concurrent_2.10-7.0.0.jar:/Users/hakan/.m2/repository/org/specs2/scalaz-effect_2.10/7.0.0/scalaz-effect_2.10-7.0.0.jar:/Users/hakan/.m2/repository/org/scalatest/scalatest_2.10/2.0.M6-SNAP8/scalatest_2.10-2.0.M6-SNAP8.jar:/Users/hakan/dev/workspace/github/spark-solr/target/spark-solr-2.0.0-SNAPSHOT-shaded.jar
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:java.library.path=/Users/hakan/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:java.io.tmpdir=/var/folders/tm/g1rplzrd2qg33f2_k9tcn01438z5jv/T/
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:java.compiler=<NA>
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:os.name=Mac OS X
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:os.arch=x86_64
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:os.version=10.10.5
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:user.name=hakan
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:user.home=/Users/hakan
2016-03-25 15:27:48 INFO  ZooKeeper:100 - Client environment:user.dir=/Users/hakan/dev/workspace/git/spark-jobs
2016-03-25 15:27:48 INFO  ZooKeeper:438 - Initiating client connection, connectString=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181 sessionTimeout=10000 watcher=org.apache.solr.common.cloud.SolrZkClient$3@75ea136c
2016-03-25 15:28:00 INFO  ConnectionManager:192 - Waiting for client to connect to ZooKeeper
2016-03-25 15:28:00 INFO  ClientCnxn:975 - Opening socket connection to server 10.35.75.17/10.35.75.17:2181. Will not attempt to authenticate using SASL (unknown error)
2016-03-25 15:28:00 INFO  ClientCnxn:852 - Socket connection established to 10.35.75.17/10.35.75.17:2181, initiating session
2016-03-25 15:28:00 INFO  ClientCnxn:1235 - Session establishment complete on server 10.35.75.17/10.35.75.17:2181, sessionid = 0x15349f1bec3003c, negotiated timeout = 10000
2016-03-25 15:28:00 INFO  ConnectionManager:102 - Watcher org.apache.solr.common.cloud.ConnectionManager@6072345f name:ZooKeeperConnection Watcher:zookeeper1:2181,zookeeper2:2181,zookeeper3:2181 got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
2016-03-25 15:28:00 INFO  ConnectionManager:210 - Client is connected to ZooKeeper
2016-03-25 15:28:00 INFO  SolrZkClient:227 - Using default ZkACLProvider
2016-03-25 15:28:00 INFO  ZkStateReader:279 - Updating cluster state from ZooKeeper... 
2016-03-25 15:28:05 INFO  SolrRelation:121 - Constructed SolrQuery: q=*%3A*&rows=1000&fl=productId%2Ctitle&timeallowed=0&collection=product_collection_01
2016-03-25 15:29:14 INFO  SolrRDD:95 - Found 3 partitions: [Lorg.apache.spark.Partition;@3a6fd169
2016-03-25 15:29:14 INFO  SparkContext:58 - Starting job: collect at SolrExample.scala:25
2016-03-25 15:29:14 INFO  DAGScheduler:58 - Registering RDD 5 (collect at SolrExample.scala:25)
2016-03-25 15:29:14 INFO  DAGScheduler:58 - Got job 0 (collect at SolrExample.scala:25) with 1 output partitions
2016-03-25 15:29:14 INFO  DAGScheduler:58 - Final stage: ResultStage 1 (collect at SolrExample.scala:25)
2016-03-25 15:29:14 INFO  DAGScheduler:58 - Parents of final stage: List(ShuffleMapStage 0)
2016-03-25 15:29:14 INFO  DAGScheduler:58 - Missing parents: List(ShuffleMapStage 0)
2016-03-25 15:29:14 INFO  DAGScheduler:58 - Submitting ShuffleMapStage 0 (MapPartitionsRDD[5] at collect at SolrExample.scala:25), which has no missing parents
2016-03-25 15:29:14 INFO  MemoryStore:58 - Block broadcast_0 stored as values in memory (estimated size 10.4 KB, free 10.4 KB)
2016-03-25 15:29:14 INFO  MemoryStore:58 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.4 KB, free 15.8 KB)
2016-03-25 15:29:14 INFO  BlockManagerInfo:58 - Added broadcast_0_piece0 in memory on localhost:65400 (size: 5.4 KB, free: 2.4 GB)
2016-03-25 15:29:14 INFO  SparkContext:58 - Created broadcast 0 from broadcast at DAGScheduler.scala:1006
2016-03-25 15:29:14 INFO  DAGScheduler:58 - Submitting 3 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[5] at collect at SolrExample.scala:25)
2016-03-25 15:29:14 INFO  TaskSchedulerImpl:58 - Adding task set 0.0 with 3 tasks
2016-03-25 15:29:14 INFO  TaskSetManager:58 - Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 4595 bytes)
2016-03-25 15:29:14 INFO  TaskSetManager:58 - Starting task 1.0 in stage 0.0 (TID 1, localhost, partition 1,PROCESS_LOCAL, 4847 bytes)
2016-03-25 15:29:14 INFO  TaskSetManager:58 - Starting task 2.0 in stage 0.0 (TID 2, localhost, partition 2,PROCESS_LOCAL, 4595 bytes)
2016-03-25 15:29:14 INFO  Executor:58 - Running task 1.0 in stage 0.0 (TID 1)
2016-03-25 15:29:14 INFO  Executor:58 - Running task 0.0 in stage 0.0 (TID 0)
2016-03-25 15:29:14 INFO  Executor:58 - Running task 2.0 in stage 0.0 (TID 2)
2016-03-25 15:29:14 INFO  SolrRDD:64 - Computing the partition 1 on host name localhost
2016-03-25 15:29:14 INFO  SolrRDD:64 - Computing the partition 2 on host name localhost
2016-03-25 15:29:14 INFO  SolrRDD:69 - Using the shard url http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2/ for getting partition data for split: ShardRDDPartition(1,*,SolrShard(shard2,List(SolrReplica(0) product_collection_01_shard2_replica2: url=http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2/, hostName=solr02.host.net:8983_solr, locations=solr02.host.net/10.35.74.133, SolrReplica(0) product_collection_01_shard2_replica1: url=http://solr05.host.net:8983/solr/product_collection_01_shard2_replica1/, hostName=solr05.host.net:8983_solr, locations=solr05.host.net/10.35.74.196, SolrReplica(0) product_collection_01_shard2_replica3: url=http://solr08.host.net:8983/solr/product_collection_01_shard2_replica3/, hostName=solr08.host.net:8983_solr, locations=solr08.host.net/10.35.74.208, SolrReplica(0) product_collection_01_shard2_replica4: url=http://solr11.host.net:8983/solr/product_collection_01_shard2_replica4/, hostName=solr11.host.net:8983_solr, locations=solr11.host.net/10.35.74.211, SolrReplica(0) product_collection_01_shard2_replica5: url=http://solr14.host.net:8983/solr/product_collection_01_shard2_replica5/, hostName=solr14.host.net:8983_solr, locations=solr14.host.net/10.35.75.14, SolrReplica(0) product_collection_01_shard2_replica6: url=http://solr16.host.net:8983/solr/product_collection_01_shard2_replica6/, hostName=solr16.host.net:8983_solr, locations=solr16.host.net/10.35.75.16)),q=*%3A*&rows=1000&fl=productId%2Ctitle&timeallowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc,SolrReplica(0) product_collection_01_shard2_replica2: url=http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2/, hostName=solr02.host.net:8983_solr, locations=solr02.host.net/10.35.74.133)
2016-03-25 15:29:14 INFO  SolrRDD:69 - Using the shard url http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4/ for getting partition data for split: ShardRDDPartition(2,*,SolrShard(shard3,List(SolrReplica(0) product_collection_01_shard3_replica2: url=http://solr03.host.net:8983/solr/product_collection_01_shard3_replica2/, hostName=solr03.host.net:8983_solr, locations=solr03.host.net/10.35.74.179, SolrReplica(0) product_collection_01_shard3_replica1: url=http://solr06.host.net:8983/solr/product_collection_01_shard3_replica1/, hostName=solr06.host.net:8983_solr, locations=solr06.host.net/10.35.74.197, SolrReplica(0) product_collection_01_shard3_replica3: url=http://solr09.host.net:8983/solr/product_collection_01_shard3_replica3/, hostName=solr09.host.net:8983_solr, locations=solr09.host.net/10.35.74.209, SolrReplica(0) product_collection_01_shard3_replica4: url=http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4/, hostName=solr12.host.net:8983_solr, locations=solr12.host.net/10.35.74.212, SolrReplica(0) product_collection_01_shard3_replica5: url=http://solr15.host.net:8983/solr/product_collection_01_shard3_replica5/, hostName=solr15.host.net:8983_solr, locations=solr15.host.net/10.35.75.15)),q=*%3A*&rows=1000&fl=productId%2Ctitle&timeallowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc,SolrReplica(0) product_collection_01_shard3_replica4: url=http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4/, hostName=solr12.host.net:8983_solr, locations=solr12.host.net/10.35.74.212)
2016-03-25 15:29:14 INFO  SolrRDD:64 - Computing the partition 0 on host name localhost
2016-03-25 15:29:14 INFO  SolrRDD:69 - Using the shard url http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4/ for getting partition data for split: ShardRDDPartition(0,*,SolrShard(shard1,List(SolrReplica(0) product_collection_01_shard1_replica1: url=http://solr01.host.net:8983/solr/product_collection_01_shard1_replica1/, hostName=solr01.host.net:8983_solr, locations=solr01.host.net/10.35.74.132, SolrReplica(0) product_collection_01_shard1_replica2: url=http://solr04.host.net:8983/solr/product_collection_01_shard1_replica2/, hostName=solr04.host.net:8983_solr, locations=solr04.host.net/10.35.74.195, SolrReplica(0) product_collection_01_shard1_replica3: url=http://solr07.host.net:8983/solr/product_collection_01_shard1_replica3/, hostName=solr07.host.net:8983_solr, locations=solr07.host.net/10.35.74.207, SolrReplica(0) product_collection_01_shard1_replica4: url=http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4/, hostName=solr10.host.net:8983_solr, locations=solr10.host.net/10.35.74.210, SolrReplica(0) product_collection_01_shard1_replica5: url=http://solr13.host.net:8983/solr/product_collection_01_shard1_replica5/, hostName=solr13.host.net:8983_solr, locations=solr13.host.net/10.35.75.13)),q=*%3A*&rows=1000&fl=productId%2Ctitle&timeallowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc,SolrReplica(0) product_collection_01_shard1_replica4: url=http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4/, hostName=solr10.host.net:8983_solr, locations=solr10.host.net/10.35.74.210)
2016-03-25 15:29:14 ERROR SolrQuerySupport:190 - Query [q=*%3A*&rows=1000&fl=productId%2Ctitle&timeallowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc&cursorMark=*] failed due to: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2: Can not search using both cursorMark and timeAllowed
2016-03-25 15:29:14 ERROR SolrQuerySupport:190 - Query [q=*%3A*&rows=1000&fl=productId%2Ctitle&timeallowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc&cursorMark=*] failed due to: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4: Can not search using both cursorMark and timeAllowed
2016-03-25 15:29:14 ERROR SolrQuerySupport:190 - Query [q=*%3A*&rows=1000&fl=productId%2Ctitle&timeallowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc&cursorMark=*] failed due to: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
2016-03-25 15:29:15 INFO  SolrRDD:80 - Fetched rows from shard http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2/ for partition 1
2016-03-25 15:29:15 INFO  SolrRDD:80 - Fetched rows from shard http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4/ for partition 0
2016-03-25 15:29:15 INFO  SolrRDD:80 - Fetched rows from shard http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4/ for partition 2
2016-03-25 15:29:15 ERROR Executor:95 - Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:74)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:217)
    at com.lucidworks.spark.util.SolrQuerySupport.querySolr(SolrQuerySupport.scala)
    at com.lucidworks.spark.query.StreamingResultsIterator.fetchNextPage(StreamingResultsIterator.java:99)
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:69)
    ... 20 more
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:234)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:226)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:152)
    at com.lucidworks.spark.util.SolrQuerySupport$.queryAndStreamResponsePost(SolrQuerySupport.scala:158)
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:184)
    ... 23 more
2016-03-25 15:29:15 ERROR Executor:95 - Exception in task 2.0 in stage 0.0 (TID 2)
java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:74)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:217)
    at com.lucidworks.spark.util.SolrQuerySupport.querySolr(SolrQuerySupport.scala)
    at com.lucidworks.spark.query.StreamingResultsIterator.fetchNextPage(StreamingResultsIterator.java:99)
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:69)
    ... 20 more
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4: Can not search using both cursorMark and timeAllowed
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:234)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:226)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:152)
    at com.lucidworks.spark.util.SolrQuerySupport$.queryAndStreamResponsePost(SolrQuerySupport.scala:158)
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:184)
    ... 23 more
2016-03-25 15:29:15 ERROR Executor:95 - Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:74)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:217)
    at com.lucidworks.spark.util.SolrQuerySupport.querySolr(SolrQuerySupport.scala)
    at com.lucidworks.spark.query.StreamingResultsIterator.fetchNextPage(StreamingResultsIterator.java:99)
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:69)
    ... 20 more
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2: Can not search using both cursorMark and timeAllowed
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:234)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:226)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:152)
    at com.lucidworks.spark.util.SolrQuerySupport$.queryAndStreamResponsePost(SolrQuerySupport.scala:158)
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:184)
    ... 23 more
2016-03-25 15:29:15 WARN  TaskSetManager:70 - Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:74)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:217)
    at com.lucidworks.spark.util.SolrQuerySupport.querySolr(SolrQuerySupport.scala)
    at com.lucidworks.spark.query.StreamingResultsIterator.fetchNextPage(StreamingResultsIterator.java:99)
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:69)
    ... 20 more
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:234)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:226)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:152)
    at com.lucidworks.spark.util.SolrQuerySupport$.queryAndStreamResponsePost(SolrQuerySupport.scala:158)
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:184)
    ... 23 more

2016-03-25 15:29:15 ERROR TaskSetManager:74 - Task 0 in stage 0.0 failed 1 times; aborting job
2016-03-25 15:29:15 INFO  TaskSchedulerImpl:58 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
2016-03-25 15:29:15 WARN  TaskSetManager:70 - Lost task 1.0 in stage 0.0 (TID 1, localhost): java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:74)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:217)
    at com.lucidworks.spark.util.SolrQuerySupport.querySolr(SolrQuerySupport.scala)
    at com.lucidworks.spark.query.StreamingResultsIterator.fetchNextPage(StreamingResultsIterator.java:99)
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:69)
    ... 20 more
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr02.host.net:8983/solr/product_collection_01_shard2_replica2: Can not search using both cursorMark and timeAllowed
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:234)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:226)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:152)
    at com.lucidworks.spark.util.SolrQuerySupport$.queryAndStreamResponsePost(SolrQuerySupport.scala:158)
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:184)
    ... 23 more

2016-03-25 15:29:15 INFO  TaskSchedulerImpl:58 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
2016-03-25 15:29:15 WARN  TaskSetManager:70 - Lost task 2.0 in stage 0.0 (TID 2, localhost): java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:74)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:217)
    at com.lucidworks.spark.util.SolrQuerySupport.querySolr(SolrQuerySupport.scala)
    at com.lucidworks.spark.query.StreamingResultsIterator.fetchNextPage(StreamingResultsIterator.java:99)
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:69)
    ... 20 more
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr12.host.net:8983/solr/product_collection_01_shard3_replica4: Can not search using both cursorMark and timeAllowed
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:234)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:226)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:152)
    at com.lucidworks.spark.util.SolrQuerySupport$.queryAndStreamResponsePost(SolrQuerySupport.scala:158)
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:184)
    ... 23 more

2016-03-25 15:29:15 INFO  TaskSchedulerImpl:58 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
2016-03-25 15:29:15 INFO  TaskSchedulerImpl:58 - Cancelling stage 0
2016-03-25 15:29:15 INFO  DAGScheduler:58 - ShuffleMapStage 0 (collect at SolrExample.scala:25) failed in 0.190 s
2016-03-25 15:29:15 INFO  DAGScheduler:58 - Job 0 failed: collect at SolrExample.scala:25, took 0.468102 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:74)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:217)
    at com.lucidworks.spark.util.SolrQuerySupport.querySolr(SolrQuerySupport.scala)
    at com.lucidworks.spark.query.StreamingResultsIterator.fetchNextPage(StreamingResultsIterator.java:99)
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:69)
    ... 20 more
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:234)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:226)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:152)
    at com.lucidworks.spark.util.SolrQuerySupport$.queryAndStreamResponsePost(SolrQuerySupport.scala:158)
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:184)
    ... 23 more

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
    at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:166)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1538)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1538)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
    at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2125)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1537)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$collect$1.apply(DataFrame.scala:1542)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$collect$1.apply(DataFrame.scala:1542)
    at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2138)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1542)
    at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:1519)
    at example.SolrExample$.main(SolrExample.scala:25)
    at example.SolrExample.main(SolrExample.scala)
Caused by: java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:74)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:217)
    at com.lucidworks.spark.util.SolrQuerySupport.querySolr(SolrQuerySupport.scala)
    at com.lucidworks.spark.query.StreamingResultsIterator.fetchNextPage(StreamingResultsIterator.java:99)
    at com.lucidworks.spark.query.StreamingResultsIterator.hasNext(StreamingResultsIterator.java:69)
    ... 20 more
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://solr10.host.net:8983/solr/product_collection_01_shard1_replica4: Can not search using both cursorMark and timeAllowed
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:234)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:226)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:152)
    at com.lucidworks.spark.util.SolrQuerySupport$.queryAndStreamResponsePost(SolrQuerySupport.scala:158)
    at com.lucidworks.spark.util.SolrQuerySupport$.querySolr(SolrQuerySupport.scala:184)
    ... 23 more
2016-03-25 15:29:15 INFO  SparkContext:58 - Invoking stop() from shutdown hook
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/static/sql,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/SQL/execution/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/SQL/execution,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/SQL/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/SQL,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/api,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/static,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/executors/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/executors,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/environment/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/environment,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/storage/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/storage,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
2016-03-25 15:29:15 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/jobs,null}
2016-03-25 15:29:15 INFO  SparkUI:58 - Stopped Spark web UI at http://10.238.236.149:4040
2016-03-25 15:29:15 INFO  MapOutputTrackerMasterEndpoint:58 - MapOutputTrackerMasterEndpoint stopped!
2016-03-25 15:29:15 INFO  MemoryStore:58 - MemoryStore cleared
2016-03-25 15:29:15 INFO  BlockManager:58 - BlockManager stopped
2016-03-25 15:29:15 INFO  BlockManagerMaster:58 - BlockManagerMaster stopped
2016-03-25 15:29:15 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:58 - OutputCommitCoordinator stopped!
2016-03-25 15:29:15 INFO  SparkContext:58 - Successfully stopped SparkContext
2016-03-25 15:29:15 INFO  ShutdownHookManager:58 - Shutdown hook called
2016-03-25 15:29:15 INFO  ShutdownHookManager:58 - Deleting directory /private/var/folders/tm/g1rplzrd2qg33f2_k9tcn01438z5jv/T/spark-16583983-4d72-4131-8b38-93c3235e53ea
kiranchitturi commented 8 years ago

@hakanilter

It looks like the keys in options are case-insensitive, so timeAllowed is being sent as timeallowed, which the Solr parser ignores.
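
A minimal sketch of that failure mode (not the actual spark-solr code; the value names here are made up): if the option keys are lower-cased before the query is built, the parameter name no longer matches what Solr's parser looks for.

    // Hypothetical illustration of the case-insensitivity bug.
    // Spark exposes data source options through a case-insensitive map,
    // which effectively lower-cases the keys before they reach the relation.
    val userOptions = Map("timeAllowed" -> "0")
    val normalized = userOptions.map { case (k, v) => k.toLowerCase -> v }

    normalized.get("timeAllowed") // None -- the original casing is gone
    normalized.get("timeallowed") // Some("0") -- this is what gets sent to Solr,
                                  // which matches parameter names case-sensitively
                                  // and silently ignores it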

I have added a fix (fa28e83d5b4156393e24d60690ed67211ee86f2d) for this.

Can you build from master, change your query, and try again?

    val options = Map(
      "zkHost" -> zkHosts,
      "collection" -> collection,
      "fields" -> fields,
      "solr.params" -> "timeAllowed=0"
    )
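
For what it's worth, Solr only enforces timeAllowed when the value is positive, so timeAllowed=0 should disable the time limit entirely, and with it the conflict with cursorMark.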
hakanilter commented 8 years ago

@kiranchitturi This commit solved the timeAllowed problem, but the query now takes a long time to run and this time fails with a different error:

2016-03-28 08:25:18 INFO  SparkContext:58 - Running Spark version 1.6.0
2016-03-28 08:25:19 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-03-28 08:25:19 WARN  Utils:70 - Your hostname, myhost resolves to a loopback address: 127.0.0.1; using 10.238.236.149 instead (on interface en0)
2016-03-28 08:25:19 WARN  Utils:70 - Set SPARK_LOCAL_IP if you need to bind to another address
2016-03-28 08:25:19 INFO  SecurityManager:58 - Changing view acls to: hakan
2016-03-28 08:25:19 INFO  SecurityManager:58 - Changing modify acls to: hakan
2016-03-28 08:25:19 INFO  SecurityManager:58 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hakan); users with modify permissions: Set(hakan)
2016-03-28 08:25:20 INFO  Utils:58 - Successfully started service 'sparkDriver' on port 50939.
2016-03-28 08:25:20 INFO  Slf4jLogger:80 - Slf4jLogger started
2016-03-28 08:25:20 INFO  Remoting:74 - Starting remoting
2016-03-28 08:25:20 INFO  Remoting:74 - Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.238.236.149:50940]
2016-03-28 08:25:20 INFO  Utils:58 - Successfully started service 'sparkDriverActorSystem' on port 50940.
2016-03-28 08:25:20 INFO  SparkEnv:58 - Registering MapOutputTracker
2016-03-28 08:25:20 INFO  SparkEnv:58 - Registering BlockManagerMaster
2016-03-28 08:25:20 INFO  DiskBlockManager:58 - Created local directory at /private/var/folders/tm/g1rplzrd2qg33f2_k9tcn01438z5jv/T/blockmgr-42f1a9e3-1585-4210-a5d3-90b714d0c563
2016-03-28 08:25:20 INFO  MemoryStore:58 - MemoryStore started with capacity 2.4 GB
2016-03-28 08:25:20 INFO  SparkEnv:58 - Registering OutputCommitCoordinator
2016-03-28 08:25:20 INFO  Server:272 - jetty-8.y.z-SNAPSHOT
2016-03-28 08:25:20 INFO  AbstractConnector:338 - Started SelectChannelConnector@0.0.0.0:4040
2016-03-28 08:25:20 INFO  Utils:58 - Successfully started service 'SparkUI' on port 4040.
2016-03-28 08:25:20 INFO  SparkUI:58 - Started SparkUI at http://10.238.236.149:4040
2016-03-28 08:25:21 INFO  Executor:58 - Starting executor ID driver on host localhost
2016-03-28 08:25:21 INFO  Utils:58 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 50941.
2016-03-28 08:25:21 INFO  NettyBlockTransferService:58 - Server created on 50941
2016-03-28 08:25:21 INFO  BlockManagerMaster:58 - Trying to register BlockManager
2016-03-28 08:25:21 INFO  BlockManagerMasterEndpoint:58 - Registering block manager localhost:50941 with 2.4 GB RAM, BlockManagerId(driver, localhost, 50941)
2016-03-28 08:25:21 INFO  BlockManagerMaster:58 - Registered BlockManager
2016-03-28 08:25:22 INFO  SolrZkClient:211 - Using default ZkCredentialsProvider
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:host.name=localhost
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:java.version=1.7.0_75
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:java.vendor=Oracle Corporation
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.7.0_75.jdk/Contents/Home/jre
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:java.class.path=/Users/hakan/dev/workspace/git/spark-jobs/target/classes:/Users/hakan/dev/workspace/git/spark-jobs/target/test-classes:/Users/hakan/.m2/repository/org/apache/spark/spark-core_2.10/1.6.0/spark-core_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/avro/avro-mapred/1.7.7/avro-mapred-1.7.7-hadoop2.jar:/Users/hakan/.m2/repository/org/apache/avro/avro-ipc/1.7.7/avro-ipc-1.7.7.jar:/Users/hakan/.m2/repository/org/apache/avro/avro-ipc/1.7.7/avro-ipc-1.7.7-tests.jar:/Users/hakan/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.13/jackson-core-asl-1.9.13.jar:/Users/hakan/.m2/repository/com/twitter/chill_2.10/0.5.0/chill_2.10-0.5.0.jar:/Users/hakan/.m2/repository/com/esotericsoftware/kryo/kryo/2.21/kryo-2.21.jar:/Users/hakan/.m2/repository/com/esotericsoftware/reflectasm/reflectasm/1.07/reflectasm-1.07-shaded.jar:/Users/hakan/.m2/repository/com/esotericsoftware/minlog/minlog/1.2/minlog-1.2.jar:/Users/hakan/.m2/repository/org/objenesis/objenesis/1.2/objenesis-1.2.jar:/Users/hakan/.m2/repository/com/twitter/chill-java/0.5.0/chill-java-0.5.0.jar:/Users/hakan/.m2/repository/org/apache/xbean/xbean-asm5-shaded/4.4/xbean-asm5-shaded-4.4.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-launcher_2.10/1.6.0/spark-launcher_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-network-common_2.10/1.6.0/spark-network-common_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-network-shuffle_2.10/1.6.0/spark-network-shuffle_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/fusesource/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar:/Users/hakan/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.4.4/jackson-annotations-2.4.4.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-unsafe_2.10/1.6.0/spark-unsafe_2.10-1.6.0.jar:/Users/hakan/.m2/repository/net/java/dev/jets3t/jets3t/0.7.1/jets3t-0.7.1.jar:/Users/hakan/.m2/repository/org/apache/curator/curator-recipes/2.4.0/curator-recipes-2.4.0.jar:/Users/hakan/.m2/repository/org/apache/curator/curator-framework/2.4.0/curator-framework-2.4.0.jar:/Users/hakan/.m2/repository/com/google/guava/guava/14.0.1/guava-14.0.1.jar:/Users/hakan/.m2/repository/org/eclipse/jetty/orbit/javax.servlet/3.0.0.v201112011016/javax.servlet-3.0.0.v201112011016.jar:/Users/hakan/.m2/repository/org/apache/commons/commons-lang3/3.3.2/commons-lang3-3.3.2.jar:/Users/hakan/.m2/repository/org/apache/commons/commons-math3/3.4.1/commons-math3-3.4.1.jar:/Users/hakan/.m2/repository/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.jar:/Users/hakan/.m2/repository/org/slf4j/jul-to-slf4j/1.7.10/jul-to-slf4j-1.7.10.jar:/Users/hakan/.m2/repository/org/slf4j/jcl-over-slf4j/1.7.10/jcl-over-slf4j-1.7.10.jar:/Users/hakan/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/Users/hakan/.m2/repository/com/ning/compress-lzf/1.0.3/compress-lzf-1.0.3.jar:/Users/hakan/.m2/repository/net/jpountz/lz4/lz4/1.3.0/lz4-1.3.0.jar:/Users/hakan/.m2/repository/org/roaringbitmap/RoaringBitmap/0.5.11/RoaringBitmap-0.5.11.jar:/Users/hakan/.m2/repository/commons-net/commons-net/2.2/commons-net-2.2.jar:/Users/hakan/.m2/repository/com/typesafe/akka/akka-remote_2.10/2.3.11/akka-remote_2.10-2.3.11.jar:/Users/hakan/.m2/repository/com/typesafe/akka/akka-actor_2.10/2.3.11/akka-actor_2.10-2.3.11.jar:/Users/hakan/.m2/repository/com/typesafe/config/1.2.1/config-1.2.1.jar:/Users/hakan/.m2/repository/io/netty/netty/3.8.0.Final/netty-3.8.0.Final.jar:/Users/hakan/.m2/repository/com/google/protobuf/protobuf-java/2.5.0
/protobuf-java-2.5.0.jar:/Users/hakan/.m2/repository/org/uncommons/maths/uncommons-maths/1.2.2a/uncommons-maths-1.2.2a.jar:/Users/hakan/.m2/repository/com/typesafe/akka/akka-slf4j_2.10/2.3.11/akka-slf4j_2.10-2.3.11.jar:/Users/hakan/.m2/repository/org/json4s/json4s-jackson_2.10/3.2.10/json4s-jackson_2.10-3.2.10.jar:/Users/hakan/.m2/repository/org/json4s/json4s-core_2.10/3.2.10/json4s-core_2.10-3.2.10.jar:/Users/hakan/.m2/repository/org/json4s/json4s-ast_2.10/3.2.10/json4s-ast_2.10-3.2.10.jar:/Users/hakan/.m2/repository/org/scala-lang/scalap/2.10.0/scalap-2.10.0.jar:/Users/hakan/.m2/repository/org/scala-lang/scala-compiler/2.10.0/scala-compiler-2.10.0.jar:/Users/hakan/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar:/Users/hakan/.m2/repository/asm/asm/3.1/asm-3.1.jar:/Users/hakan/.m2/repository/com/sun/jersey/jersey-core/1.9/jersey-core-1.9.jar:/Users/hakan/.m2/repository/org/apache/mesos/mesos/0.21.1/mesos-0.21.1-shaded-protobuf.jar:/Users/hakan/.m2/repository/io/netty/netty-all/4.0.29.Final/netty-all-4.0.29.Final.jar:/Users/hakan/.m2/repository/com/clearspring/analytics/stream/2.7.0/stream-2.7.0.jar:/Users/hakan/.m2/repository/io/dropwizard/metrics/metrics-core/3.1.2/metrics-core-3.1.2.jar:/Users/hakan/.m2/repository/io/dropwizard/metrics/metrics-jvm/3.1.2/metrics-jvm-3.1.2.jar:/Users/hakan/.m2/repository/io/dropwizard/metrics/metrics-json/3.1.2/metrics-json-3.1.2.jar:/Users/hakan/.m2/repository/io/dropwizard/metrics/metrics-graphite/3.1.2/metrics-graphite-3.1.2.jar:/Users/hakan/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.4.4/jackson-databind-2.4.4.jar:/Users/hakan/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.4.4/jackson-core-2.4.4.jar:/Users/hakan/.m2/repository/com/fasterxml/jackson/module/jackson-module-scala_2.10/2.4.4/jackson-module-scala_2.10-2.4.4.jar:/Users/hakan/.m2/repository/org/scala-lang/scala-reflect/2.10.4/scala-reflect-2.10.4.jar:/Users/hakan/.m2/repository/com/thoughtworks/paranamer/paranamer/2.6/paranamer-2.6.jar:/Users/hakan/.m2/repository/org/apache/ivy/ivy/2.4.0/ivy-2.4.0.jar:/Users/hakan/.m2/repository/oro/oro/2.0.8/oro-2.0.8.jar:/Users/hakan/.m2/repository/org/tachyonproject/tachyon-client/0.8.2/tachyon-client-0.8.2.jar:/Users/hakan/.m2/repository/commons-io/commons-io/2.4/commons-io-2.4.jar:/Users/hakan/.m2/repository/org/tachyonproject/tachyon-underfs-hdfs/0.8.2/tachyon-underfs-hdfs-0.8.2.jar:/Users/hakan/.m2/repository/org/tachyonproject/tachyon-underfs-s3/0.8.2/tachyon-underfs-s3-0.8.2.jar:/Users/hakan/.m2/repository/org/tachyonproject/tachyon-underfs-local/0.8.2/tachyon-underfs-local-0.8.2.jar:/Users/hakan/.m2/repository/net/razorvine/pyrolite/4.9/pyrolite-4.9.jar:/Users/hakan/.m2/repository/net/sf/py4j/py4j/0.9/py4j-0.9.jar:/Users/hakan/.m2/repository/org/spark-project/spark/unused/1.0.0/unused-1.0.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-sql_2.10/1.6.0/spark-sql_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-catalyst_2.10/1.6.0/spark-catalyst_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/codehaus/janino/janino/2.7.8/janino-2.7.8.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-column/1.7.0/parquet-column-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-common/1.7.0/parquet-common-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-encoding/1.7.0/parquet-encoding-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-generator/1.7.0/parquet-generator-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-hadoop/1.7.0/pa
rquet-hadoop-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-format/2.3.0-incubating/parquet-format-2.3.0-incubating.jar:/Users/hakan/.m2/repository/org/apache/parquet/parquet-jackson/1.7.0/parquet-jackson-1.7.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-hive_2.10/1.6.0/spark-hive_2.10-1.6.0.jar:/Users/hakan/.m2/repository/com/twitter/parquet-hadoop-bundle/1.6.0/parquet-hadoop-bundle-1.6.0.jar:/Users/hakan/.m2/repository/org/spark-project/hive/hive-exec/1.2.1.spark/hive-exec-1.2.1.spark.jar:/Users/hakan/.m2/repository/javolution/javolution/5.5.1/javolution-5.5.1.jar:/Users/hakan/.m2/repository/log4j/apache-log4j-extras/1.2.17/apache-log4j-extras-1.2.17.jar:/Users/hakan/.m2/repository/org/antlr/antlr-runtime/3.4/antlr-runtime-3.4.jar:/Users/hakan/.m2/repository/org/antlr/stringtemplate/3.2.1/stringtemplate-3.2.1.jar:/Users/hakan/.m2/repository/antlr/antlr/2.7.7/antlr-2.7.7.jar:/Users/hakan/.m2/repository/org/antlr/ST4/4.0.4/ST4-4.0.4.jar:/Users/hakan/.m2/repository/org/apache/commons/commons-compress/1.4.1/commons-compress-1.4.1.jar:/Users/hakan/.m2/repository/org/tukaani/xz/1.0/xz-1.0.jar:/Users/hakan/.m2/repository/org/codehaus/groovy/groovy-all/2.1.6/groovy-all-2.1.6.jar:/Users/hakan/.m2/repository/com/googlecode/javaewah/JavaEWAH/0.3.2/JavaEWAH-0.3.2.jar:/Users/hakan/.m2/repository/org/iq80/snappy/snappy/0.2/snappy-0.2.jar:/Users/hakan/.m2/repository/org/json/json/20090211/json-20090211.jar:/Users/hakan/.m2/repository/stax/stax-api/1.0.1/stax-api-1.0.1.jar:/Users/hakan/.m2/repository/net/sf/opencsv/opencsv/2.3/opencsv-2.3.jar:/Users/hakan/.m2/repository/jline/jline/2.12/jline-2.12.jar:/Users/hakan/.m2/repository/org/spark-project/hive/hive-metastore/1.2.1.spark/hive-metastore-1.2.1.spark.jar:/Users/hakan/.m2/repository/com/jolbox/bonecp/0.8.0.RELEASE/bonecp-0.8.0.RELEASE.jar:/Users/hakan/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar:/Users/hakan/.m2/repository/commons-logging/commons-logging/1.1.3/commons-logging-1.1.3.jar:/Users/hakan/.m2/repository/org/apache/derby/derby/10.10.2.0/derby-10.10.2.0.jar:/Users/hakan/.m2/repository/org/datanucleus/datanucleus-api-jdo/3.2.6/datanucleus-api-jdo-3.2.6.jar:/Users/hakan/.m2/repository/org/datanucleus/datanucleus-rdbms/3.2.9/datanucleus-rdbms-3.2.9.jar:/Users/hakan/.m2/repository/commons-pool/commons-pool/1.5.4/commons-pool-1.5.4.jar:/Users/hakan/.m2/repository/commons-dbcp/commons-dbcp/1.4/commons-dbcp-1.4.jar:/Users/hakan/.m2/repository/javax/jdo/jdo-api/3.0.1/jdo-api-3.0.1.jar:/Users/hakan/.m2/repository/javax/transaction/jta/1.1/jta-1.1.jar:/Users/hakan/.m2/repository/org/apache/avro/avro/1.7.7/avro-1.7.7.jar:/Users/hakan/.m2/repository/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar:/Users/hakan/.m2/repository/org/apache/calcite/calcite-avatica/1.2.0-incubating/calcite-avatica-1.2.0-incubating.jar:/Users/hakan/.m2/repository/org/apache/calcite/calcite-core/1.2.0-incubating/calcite-core-1.2.0-incubating.jar:/Users/hakan/.m2/repository/org/apache/calcite/calcite-linq4j/1.2.0-incubating/calcite-linq4j-1.2.0-incubating.jar:/Users/hakan/.m2/repository/net/hydromatic/eigenbase-properties/1.1.5/eigenbase-properties-1.1.5.jar:/Users/hakan/.m2/repository/org/codehaus/janino/commons-compiler/2.7.6/commons-compiler-2.7.6.jar:/Users/hakan/.m2/repository/org/apache/httpcomponents/httpclient/4.3.2/httpclient-4.3.2.jar:/Users/hakan/.m2/repository/org/apache/httpcomponents/httpcore/4.3.1/httpcore-4.3.1.jar:/Users/hakan/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.9.13/jackson-ma
pper-asl-1.9.13.jar:/Users/hakan/.m2/repository/commons-codec/commons-codec/1.10/commons-codec-1.10.jar:/Users/hakan/.m2/repository/joda-time/joda-time/2.9/joda-time-2.9.jar:/Users/hakan/.m2/repository/org/jodd/jodd-core/3.5.2/jodd-core-3.5.2.jar:/Users/hakan/.m2/repository/org/datanucleus/datanucleus-core/3.2.10/datanucleus-core-3.2.10.jar:/Users/hakan/.m2/repository/org/apache/thrift/libthrift/0.9.2/libthrift-0.9.2.jar:/Users/hakan/.m2/repository/org/apache/thrift/libfb303/0.9.2/libfb303-0.9.2.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-mllib_2.10/1.6.0/spark-mllib_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-streaming_2.10/1.6.0/spark-streaming_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-graphx_2.10/1.6.0/spark-graphx_2.10-1.6.0.jar:/Users/hakan/.m2/repository/com/github/fommil/netlib/core/1.1.2/core-1.1.2.jar:/Users/hakan/.m2/repository/net/sourceforge/f2j/arpack_combined_all/0.1/arpack_combined_all-0.1.jar:/Users/hakan/.m2/repository/org/scalanlp/breeze_2.10/0.11.2/breeze_2.10-0.11.2.jar:/Users/hakan/.m2/repository/org/scalanlp/breeze-macros_2.10/0.11.2/breeze-macros_2.10-0.11.2.jar:/Users/hakan/.m2/repository/org/scalamacros/quasiquotes_2.10/2.0.0-M8/quasiquotes_2.10-2.0.0-M8.jar:/Users/hakan/.m2/repository/com/github/rwl/jtransforms/2.4.0/jtransforms-2.4.0.jar:/Users/hakan/.m2/repository/org/spire-math/spire_2.10/0.7.4/spire_2.10-0.7.4.jar:/Users/hakan/.m2/repository/org/spire-math/spire-macros_2.10/0.7.4/spire-macros_2.10-0.7.4.jar:/Users/hakan/.m2/repository/org/jpmml/pmml-model/1.1.15/pmml-model-1.1.15.jar:/Users/hakan/.m2/repository/org/jpmml/pmml-agent/1.1.15/pmml-agent-1.1.15.jar:/Users/hakan/.m2/repository/org/jpmml/pmml-schema/1.1.15/pmml-schema-1.1.15.jar:/Users/hakan/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.7/jaxb-impl-2.2.7.jar:/Users/hakan/.m2/repository/com/sun/xml/bind/jaxb-core/2.2.7/jaxb-core-2.2.7.jar:/Users/hakan/.m2/repository/org/apache/spark/spark-streaming-kafka_2.10/1.6.0/spark-streaming-kafka_2.10-1.6.0.jar:/Users/hakan/.m2/repository/org/apache/kafka/kafka_2.10/0.8.2.1/kafka_2.10-0.8.2.1.jar:/Users/hakan/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar:/Users/hakan/.m2/repository/org/apache/kafka/kafka-clients/0.8.2.1/kafka-clients-0.8.2.1.jar:/Users/hakan/.m2/repository/org/apache/zookeeper/zookeeper/3.4.6/zookeeper-3.4.6.jar:/Users/hakan/.m2/repository/net/sf/jopt-simple/jopt-simple/3.2/jopt-simple-3.2.jar:/Users/hakan/.m2/repository/com/101tec/zkclient/0.3/zkclient-0.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-client/2.5.0-cdh5.3.3/hadoop-client-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-common/2.5.0-cdh5.3.3/hadoop-common-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/xmlenc/xmlenc/0.52/xmlenc-0.52.jar:/Users/hakan/.m2/repository/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar:/Users/hakan/.m2/repository/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.jar:/Users/hakan/.m2/repository/commons-digester/commons-digester/1.8/commons-digester-1.8.jar:/Users/hakan/.m2/repository/commons-beanutils/commons-beanutils/1.7.0/commons-beanutils-1.7.0.jar:/Users/hakan/.m2/repository/commons-beanutils/commons-beanutils-core/1.8.0/commons-beanutils-core-1.8.0.jar:/Users/hakan/.m2/repository/com/google/code/gson/gson/2.2.4/gson-2.2.4.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-auth/2.5.0-cdh5.3.3/hadoop-auth-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/directory/serv
er/apacheds-kerberos-codec/2.0.0-M15/apacheds-kerberos-codec-2.0.0-M15.jar:/Users/hakan/.m2/repository/org/apache/directory/server/apacheds-i18n/2.0.0-M15/apacheds-i18n-2.0.0-M15.jar:/Users/hakan/.m2/repository/org/apache/directory/api/api-asn1-api/1.0.0-M20/api-asn1-api-1.0.0-M20.jar:/Users/hakan/.m2/repository/org/apache/directory/api/api-util/1.0.0-M20/api-util-1.0.0-M20.jar:/Users/hakan/.m2/repository/org/apache/curator/curator-client/2.6.0/curator-client-2.6.0.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-hdfs/2.5.0-cdh5.3.3/hadoop-hdfs-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/mortbay/jetty/jetty-util/6.1.26.cloudera.4/jetty-util-6.1.26.cloudera.4.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-app/2.5.0-cdh5.3.3/hadoop-mapreduce-client-app-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-common/2.5.0-cdh5.3.3/hadoop-mapreduce-client-common-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-yarn-client/2.5.0-cdh5.3.3/hadoop-yarn-client-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/com/sun/jersey/jersey-client/1.9/jersey-client-1.9.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-yarn-server-common/2.5.0-cdh5.3.3/hadoop-yarn-server-common-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-shuffle/2.5.0-cdh5.3.3/hadoop-mapreduce-client-shuffle-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-yarn-api/2.5.0-cdh5.3.3/hadoop-yarn-api-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-core/2.5.0-cdh5.3.3/hadoop-mapreduce-client-core-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-yarn-common/2.5.0-cdh5.3.3/hadoop-yarn-common-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/javax/xml/bind/jaxb-api/2.2.2/jaxb-api-2.2.2.jar:/Users/hakan/.m2/repository/javax/xml/stream/stax-api/1.0-2/stax-api-1.0-2.jar:/Users/hakan/.m2/repository/javax/activation/activation/1.1/activation-1.1.jar:/Users/hakan/.m2/repository/javax/servlet/servlet-api/2.5/servlet-api-2.5.jar:/Users/hakan/.m2/repository/org/codehaus/jackson/jackson-jaxrs/1.8.8/jackson-jaxrs-1.8.8.jar:/Users/hakan/.m2/repository/org/codehaus/jackson/jackson-xc/1.8.8/jackson-xc-1.8.8.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-mapreduce-client-jobclient/2.5.0-cdh5.3.3/hadoop-mapreduce-client-jobclient-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-aws/2.5.0-cdh5.3.3/hadoop-aws-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar:/Users/hakan/.m2/repository/org/apache/hadoop/hadoop-annotations/2.5.0-cdh5.3.3/hadoop-annotations-2.5.0-cdh5.3.3.jar:/Users/hakan/.m2/repository/mysql/mysql-connector-java/5.1.34/mysql-connector-java-5.1.34.jar:/Users/hakan/.m2/repository/net/zemberek/zemberek-cekirdek/2.1.3.1/zemberek-cekirdek-2.1.3.1.jar:/Users/hakan/.m2/repository/net/zemberek/zemberek-tr/2.1.3/zemberek-tr-2.1.3.jar:/Users/hakan/.m2/repository/org/xerial/snappy/snappy-java/1.0.5/snappy-java-1.0.5.jar:/Users/hakan/.m2/repository/commons-lang/commons-lang/2.6/commons-lang-2.6.jar:/Users/hakan/.m2/repository/org/slf4j/slf4j-api/1.7.10/slf4j-api-1.7.10.jar:/Users/hakan/.m2/repository/org/slf4j/slf4j-log4j12/1.7.10/slf4j-log4j12-1.7.10.jar:/Users/hakan/.m2/repository/junit/junit/4.11/junit-4.11.jar:/Users/hakan/.m2/repository/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar:/Users/hakan/.m2/repository/org/specs2/specs2_2.10/1.13/specs2_
2.10-1.13.jar:/Users/hakan/.m2/repository/org/specs2/scalaz-core_2.10/7.0.0/scalaz-core_2.10-7.0.0.jar:/Users/hakan/.m2/repository/org/specs2/scalaz-concurrent_2.10/7.0.0/scalaz-concurrent_2.10-7.0.0.jar:/Users/hakan/.m2/repository/org/specs2/scalaz-effect_2.10/7.0.0/scalaz-effect_2.10-7.0.0.jar:/Users/hakan/.m2/repository/org/scalatest/scalatest_2.10/2.0.M6-SNAP8/scalatest_2.10-2.0.M6-SNAP8.jar:/Users/hakan/dev/workspace/github/spark-solr/target/spark-solr-2.0.0-SNAPSHOT-shaded.jar
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:java.library.path=/Users/hakan/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:java.io.tmpdir=/var/folders/tm/g1rplzrd2qg33f2_k9tcn01438z5jv/T/
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:java.compiler=<NA>
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:os.name=Mac OS X
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:os.arch=x86_64
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:os.version=10.10.5
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:user.name=hakan
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:user.home=/Users/hakan
2016-03-28 08:25:22 INFO  ZooKeeper:100 - Client environment:user.dir=/Users/hakan/dev/workspace/git/spark-jobs
2016-03-28 08:25:22 INFO  ZooKeeper:438 - Initiating client connection, connectString=zookeeper04.host:2181,zookeeper05.host:2181,zookeeper06.host:2181 sessionTimeout=10000 watcher=org.apache.solr.common.cloud.SolrZkClient$3@1bfd290e
2016-03-28 08:25:22 INFO  ConnectionManager:193 - Waiting for client to connect to ZooKeeper
2016-03-28 08:25:22 INFO  ClientCnxn:975 - Opening socket connection to server 10.35.75.17/10.35.75.17:2181. Will not attempt to authenticate using SASL (unknown error)
2016-03-28 08:25:22 INFO  ClientCnxn:852 - Socket connection established to 10.35.75.17/10.35.75.17:2181, initiating session
2016-03-28 08:25:22 INFO  ClientCnxn:1235 - Session establishment complete on server 10.35.75.17/10.35.75.17:2181, sessionid = 0x15349f1bec3003e, negotiated timeout = 10000
2016-03-28 08:25:22 INFO  ConnectionManager:103 - Watcher org.apache.solr.common.cloud.ConnectionManager@28b454ef name:ZooKeeperConnection Watcher:zookeeper04.host:2181,zookeeper05.host:2181,zookeeper06.host:2181 got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
2016-03-28 08:25:22 INFO  ConnectionManager:211 - Client is connected to ZooKeeper
2016-03-28 08:25:22 INFO  SolrZkClient:227 - Using default ZkACLProvider
2016-03-28 08:25:22 INFO  ZkStateReader:306 - Updating cluster state from ZooKeeper... 
2016-03-28 08:25:23 INFO  SolrRelation:121 - Constructed SolrQuery: q=*:*&rows=1000&fl=productId,title&timeAllowed=0&collection=product_collection_01
2016-03-28 08:25:23 INFO  SolrRDD:92 - Found 3 partitions: ShardRDDPartition(0,*,SolrShard(shard1,List(SolrReplica(0) product_collection_01_shard1_replica1: url=http://solr01.host:8983/solr/product_collection_01_shard1_replica1/, hostName=solr01.host:8983_solr, locations=solr01.host/10.35.74.132, SolrReplica(0) product_collection_01_shard1_replica2: url=http://solr04.host:8983/solr/product_collection_01_shard1_replica2/, hostName=solr04.host:8983_solr, locations=solr04.host/10.35.74.195, SolrReplica(0) product_collection_01_shard1_replica3: url=http://solr07.host:8983/solr/product_collection_01_shard1_replica3/, hostName=solr07.host:8983_solr, locations=solr07.host/10.35.74.207, SolrReplica(0) product_collection_01_shard1_replica4: url=http://solr10.host:8983/solr/product_collection_01_shard1_replica4/, hostName=solr10.host:8983_solr, locations=solr10.host/10.35.74.210, SolrReplica(0) product_collection_01_shard1_replica5: url=http://solr13.host:8983/solr/product_collection_01_shard1_replica5/, hostName=solr13.host:8983_solr, locations=solr13.host/10.35.75.13)),q=*:*&rows=1000&fl=productId,title&timeAllowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc,SolrReplica(0) product_collection_01_shard1_replica3: url=http://solr07.host:8983/solr/product_collection_01_shard1_replica3/, hostName=solr07.host:8983_solr, locations=solr07.host/10.35.74.207),ShardRDDPartition(1,*,SolrShard(shard2,List(SolrReplica(0) product_collection_01_shard2_replica2: url=http://solr02.host:8983/solr/product_collection_01_shard2_replica2/, hostName=solr02.host:8983_solr, locations=solr02.host/10.35.74.133, SolrReplica(0) product_collection_01_shard2_replica1: url=http://solr05.host:8983/solr/product_collection_01_shard2_replica1/, hostName=solr05.host:8983_solr, locations=solr05.host/10.35.74.196, SolrReplica(0) product_collection_01_shard2_replica3: url=http://solr08.host:8983/solr/product_collection_01_shard2_replica3/, hostName=solr08.host:8983_solr, locations=solr08.host/10.35.74.208, SolrReplica(0) product_collection_01_shard2_replica4: url=http://solr11.host:8983/solr/product_collection_01_shard2_replica4/, hostName=solr11.host:8983_solr, locations=solr11.host/10.35.74.211, SolrReplica(0) product_collection_01_shard2_replica5: url=http://solr14.host:8983/solr/product_collection_01_shard2_replica5/, hostName=solr14.host:8983_solr, locations=solr14.host/10.35.75.14, SolrReplica(0) product_collection_01_shard2_replica6: url=http://solr16.host:8983/solr/product_collection_01_shard2_replica6/, hostName=solr16.host:8983_solr, locations=solr16.host/10.35.75.16)),q=*:*&rows=1000&fl=productId,title&timeAllowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc,SolrReplica(0) product_collection_01_shard2_replica4: url=http://solr11.host:8983/solr/product_collection_01_shard2_replica4/, hostName=solr11.host:8983_solr, locations=solr11.host/10.35.74.211),ShardRDDPartition(2,*,SolrShard(shard3,List(SolrReplica(0) product_collection_01_shard3_replica2: url=http://solr03.host:8983/solr/product_collection_01_shard3_replica2/, hostName=solr03.host:8983_solr, locations=solr03.host/10.35.74.179, SolrReplica(0) product_collection_01_shard3_replica1: url=http://solr06.host:8983/solr/product_collection_01_shard3_replica1/, hostName=solr06.host:8983_solr, locations=solr06.host/10.35.74.197, SolrReplica(0) product_collection_01_shard3_replica3: url=http://solr09.host:8983/solr/product_collection_01_shard3_replica3/, hostName=solr09.host:8983_solr, 
locations=solr09.host/10.35.74.209, SolrReplica(0) product_collection_01_shard3_replica4: url=http://solr12.host:8983/solr/product_collection_01_shard3_replica4/, hostName=solr12.host:8983_solr, locations=solr12.host/10.35.74.212, SolrReplica(0) product_collection_01_shard3_replica5: url=http://solr15.host:8983/solr/product_collection_01_shard3_replica5/, hostName=solr15.host:8983_solr, locations=solr15.host/10.35.75.15)),q=*:*&rows=1000&fl=productId,title&timeAllowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc,SolrReplica(0) product_collection_01_shard3_replica5: url=http://solr15.host:8983/solr/product_collection_01_shard3_replica5/, hostName=solr15.host:8983_solr, locations=solr15.host/10.35.75.15)
2016-03-28 08:25:23 INFO  SparkContext:58 - Starting job: collect at SolrExample.scala:25
2016-03-28 08:25:23 INFO  DAGScheduler:58 - Registering RDD 5 (collect at SolrExample.scala:25)
2016-03-28 08:25:23 INFO  DAGScheduler:58 - Got job 0 (collect at SolrExample.scala:25) with 1 output partitions
2016-03-28 08:25:23 INFO  DAGScheduler:58 - Final stage: ResultStage 1 (collect at SolrExample.scala:25)
2016-03-28 08:25:23 INFO  DAGScheduler:58 - Parents of final stage: List(ShuffleMapStage 0)
2016-03-28 08:25:23 INFO  DAGScheduler:58 - Missing parents: List(ShuffleMapStage 0)
2016-03-28 08:25:23 INFO  DAGScheduler:58 - Submitting ShuffleMapStage 0 (MapPartitionsRDD[5] at collect at SolrExample.scala:25), which has no missing parents
2016-03-28 08:25:24 INFO  MemoryStore:58 - Block broadcast_0 stored as values in memory (estimated size 10.4 KB, free 10.4 KB)
2016-03-28 08:25:24 INFO  MemoryStore:58 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.4 KB, free 15.8 KB)
2016-03-28 08:25:24 INFO  BlockManagerInfo:58 - Added broadcast_0_piece0 in memory on localhost:50941 (size: 5.4 KB, free: 2.4 GB)
2016-03-28 08:25:24 INFO  SparkContext:58 - Created broadcast 0 from broadcast at DAGScheduler.scala:1006
2016-03-28 08:25:24 INFO  DAGScheduler:58 - Submitting 3 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[5] at collect at SolrExample.scala:25)
2016-03-28 08:25:24 INFO  TaskSchedulerImpl:58 - Adding task set 0.0 with 3 tasks
2016-03-28 08:25:24 INFO  TaskSetManager:58 - Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 4595 bytes)
2016-03-28 08:25:24 INFO  TaskSetManager:58 - Starting task 1.0 in stage 0.0 (TID 1, localhost, partition 1,PROCESS_LOCAL, 4847 bytes)
2016-03-28 08:25:24 INFO  TaskSetManager:58 - Starting task 2.0 in stage 0.0 (TID 2, localhost, partition 2,PROCESS_LOCAL, 4595 bytes)
2016-03-28 08:25:24 INFO  Executor:58 - Running task 0.0 in stage 0.0 (TID 0)
2016-03-28 08:25:24 INFO  Executor:58 - Running task 1.0 in stage 0.0 (TID 1)
2016-03-28 08:25:24 INFO  Executor:58 - Running task 2.0 in stage 0.0 (TID 2)
2016-03-28 08:25:24 INFO  SolrRDD:61 - Computing the partition 1 on host name localhost
2016-03-28 08:25:24 INFO  SolrRDD:61 - Computing the partition 2 on host name localhost
2016-03-28 08:25:24 INFO  SolrRDD:61 - Computing the partition 0 on host name localhost
2016-03-28 08:25:24 INFO  SolrRDD:66 - Using the shard url http://solr11.host:8983/solr/product_collection_01_shard2_replica4/ for getting partition data for split: ShardRDDPartition(1,*,SolrShard(shard2,List(SolrReplica(0) product_collection_01_shard2_replica2: url=http://solr02.host:8983/solr/product_collection_01_shard2_replica2/, hostName=solr02.host:8983_solr, locations=solr02.host/10.35.74.133, SolrReplica(0) product_collection_01_shard2_replica1: url=http://solr05.host:8983/solr/product_collection_01_shard2_replica1/, hostName=solr05.host:8983_solr, locations=solr05.host/10.35.74.196, SolrReplica(0) product_collection_01_shard2_replica3: url=http://solr08.host:8983/solr/product_collection_01_shard2_replica3/, hostName=solr08.host:8983_solr, locations=solr08.host/10.35.74.208, SolrReplica(0) product_collection_01_shard2_replica4: url=http://solr11.host:8983/solr/product_collection_01_shard2_replica4/, hostName=solr11.host:8983_solr, locations=solr11.host/10.35.74.211, SolrReplica(0) product_collection_01_shard2_replica5: url=http://solr14.host:8983/solr/product_collection_01_shard2_replica5/, hostName=solr14.host:8983_solr, locations=solr14.host/10.35.75.14, SolrReplica(0) product_collection_01_shard2_replica6: url=http://solr16.host:8983/solr/product_collection_01_shard2_replica6/, hostName=solr16.host:8983_solr, locations=solr16.host/10.35.75.16)),q=*:*&rows=1000&fl=productId,title&timeAllowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc,SolrReplica(0) product_collection_01_shard2_replica4: url=http://solr11.host:8983/solr/product_collection_01_shard2_replica4/, hostName=solr11.host:8983_solr, locations=solr11.host/10.35.74.211)
2016-03-28 08:25:24 INFO  SolrRDD:66 - Using the shard url http://solr15.host:8983/solr/product_collection_01_shard3_replica5/ for getting partition data for split: ShardRDDPartition(2,*,SolrShard(shard3,List(SolrReplica(0) product_collection_01_shard3_replica2: url=http://solr03.host:8983/solr/product_collection_01_shard3_replica2/, hostName=solr03.host:8983_solr, locations=solr03.host/10.35.74.179, SolrReplica(0) product_collection_01_shard3_replica1: url=http://solr06.host:8983/solr/product_collection_01_shard3_replica1/, hostName=solr06.host:8983_solr, locations=solr06.host/10.35.74.197, SolrReplica(0) product_collection_01_shard3_replica3: url=http://solr09.host:8983/solr/product_collection_01_shard3_replica3/, hostName=solr09.host:8983_solr, locations=solr09.host/10.35.74.209, SolrReplica(0) product_collection_01_shard3_replica4: url=http://solr12.host:8983/solr/product_collection_01_shard3_replica4/, hostName=solr12.host:8983_solr, locations=solr12.host/10.35.74.212, SolrReplica(0) product_collection_01_shard3_replica5: url=http://solr15.host:8983/solr/product_collection_01_shard3_replica5/, hostName=solr15.host:8983_solr, locations=solr15.host/10.35.75.15)),q=*:*&rows=1000&fl=productId,title&timeAllowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc,SolrReplica(0) product_collection_01_shard3_replica5: url=http://solr15.host:8983/solr/product_collection_01_shard3_replica5/, hostName=solr15.host:8983_solr, locations=solr15.host/10.35.75.15)
2016-03-28 08:25:24 INFO  SolrRDD:66 - Using the shard url http://solr07.host:8983/solr/product_collection_01_shard1_replica3/ for getting partition data for split: ShardRDDPartition(0,*,SolrShard(shard1,List(SolrReplica(0) product_collection_01_shard1_replica1: url=http://solr01.host:8983/solr/product_collection_01_shard1_replica1/, hostName=solr01.host:8983_solr, locations=solr01.host/10.35.74.132, SolrReplica(0) product_collection_01_shard1_replica2: url=http://solr04.host:8983/solr/product_collection_01_shard1_replica2/, hostName=solr04.host:8983_solr, locations=solr04.host/10.35.74.195, SolrReplica(0) product_collection_01_shard1_replica3: url=http://solr07.host:8983/solr/product_collection_01_shard1_replica3/, hostName=solr07.host:8983_solr, locations=solr07.host/10.35.74.207, SolrReplica(0) product_collection_01_shard1_replica4: url=http://solr10.host:8983/solr/product_collection_01_shard1_replica4/, hostName=solr10.host:8983_solr, locations=solr10.host/10.35.74.210, SolrReplica(0) product_collection_01_shard1_replica5: url=http://solr13.host:8983/solr/product_collection_01_shard1_replica5/, hostName=solr13.host:8983_solr, locations=solr13.host/10.35.75.13)),q=*:*&rows=1000&fl=productId,title&timeAllowed=0&collection=product_collection_01&distrib=false&start=0&sort=productId+asc,SolrReplica(0) product_collection_01_shard1_replica3: url=http://solr07.host:8983/solr/product_collection_01_shard1_replica3/, hostName=solr07.host:8983_solr, locations=solr07.host/10.35.74.207)
2016-03-28 08:25:24 INFO  GenerateMutableProjection:58 - Code generated in 145.847 ms
2016-03-28 08:25:24 INFO  GenerateUnsafeProjection:58 - Code generated in 12.897 ms
2016-03-28 08:25:24 INFO  GenerateMutableProjection:58 - Code generated in 12.8 ms
2016-03-28 08:25:24 INFO  GenerateUnsafeRowJoiner:58 - Code generated in 9.83 ms
2016-03-28 08:25:24 INFO  GenerateUnsafeProjection:58 - Code generated in 8.874 ms
2016-03-28 08:33:57 INFO  ZkStateReader:994 - A live node change: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live nodes size: 16)
2016-03-28 08:34:19 INFO  ZkStateReader:994 - A live node change: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live nodes size: 15)
2016-03-28 08:34:51 INFO  ZkStateReader:994 - A live node change: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live nodes size: 14)
2016-03-28 08:35:01 INFO  ZkStateReader:994 - A live node change: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live nodes size: 15)
2016-03-28 08:35:13 INFO  ZkStateReader:994 - A live node change: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live nodes size: 14)
2016-03-28 08:36:13 INFO  ZkStateReader:994 - A live node change: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live nodes size: 15)
2016-03-28 08:37:05 INFO  ZkStateReader:994 - A live node change: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live nodes size: 16)
2016-03-28 08:37:06 INFO  ZkStateReader:994 - A live node change: WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes, has occurred - updating... (live nodes size: 15)
2016-03-28 08:41:44 INFO  SolrRDD:77 - Fetched rows from shard http://solr07.host:8983/solr/product_collection_01_shard1_replica3/ for partition 0
2016-03-28 08:41:44 WARN  TaskMemoryManager:368 - leak 32.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@3cdfd535
2016-03-28 08:41:44 ERROR Executor:74 - Managed memory leak detected; size = 33816576 bytes, TID = 0
2016-03-28 08:41:44 ERROR Executor:95 - Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.RuntimeException: No SolrDocument in queue (waited 60 seconds) while processing cursorMark=AoFQivG3Bg==, read 3657552 of 3657625 so far. Most likely this means your query's sort criteria is not generating stable results for computing deep-paging cursors, has the index changed? If so, try using a filter criteria the bounds the results to non-changing data.
    at com.lucidworks.spark.query.StreamingResultsIterator.next(StreamingResultsIterator.java:137)
    at com.lucidworks.spark.query.StreamingResultsIterator.next(StreamingResultsIterator.java:23)
    at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-03-28 08:41:44 WARN  TaskSetManager:70 - Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.RuntimeException: No SolrDocument in queue (waited 60 seconds) while processing cursorMark=AoFQivG3Bg==, read 3657552 of 3657625 so far. Most likely this means your query's sort criteria is not generating stable results for computing deep-paging cursors, has the index changed? If so, try using a filter criteria the bounds the results to non-changing data.
    at com.lucidworks.spark.query.StreamingResultsIterator.next(StreamingResultsIterator.java:137)
    at com.lucidworks.spark.query.StreamingResultsIterator.next(StreamingResultsIterator.java:23)
    at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

2016-03-28 08:41:44 ERROR TaskSetManager:74 - Task 0 in stage 0.0 failed 1 times; aborting job
2016-03-28 08:41:44 INFO  TaskSchedulerImpl:58 - Cancelling stage 0
2016-03-28 08:41:44 INFO  Executor:58 - Executor is trying to kill task 1.0 in stage 0.0 (TID 1)
2016-03-28 08:41:44 INFO  TaskSchedulerImpl:58 - Stage 0 was cancelled
2016-03-28 08:41:44 INFO  Executor:58 - Executor is trying to kill task 2.0 in stage 0.0 (TID 2)
2016-03-28 08:41:44 INFO  DAGScheduler:58 - ShuffleMapStage 0 (collect at SolrExample.scala:25) failed in 980.122 s
2016-03-28 08:41:44 INFO  DAGScheduler:58 - Job 0 failed: collect at SolrExample.scala:25, took 980.421541 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.RuntimeException: No SolrDocument in queue (waited 60 seconds) while processing cursorMark=AoFQivG3Bg==, read 3657552 of 3657625 so far. Most likely this means your query's sort criteria is not generating stable results for computing deep-paging cursors, has the index changed? If so, try using a filter criteria the bounds the results to non-changing data.
    at com.lucidworks.spark.query.StreamingResultsIterator.next(StreamingResultsIterator.java:137)
    at com.lucidworks.spark.query.StreamingResultsIterator.next(StreamingResultsIterator.java:23)
    at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
    at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:166)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1538)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1538)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
    at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2125)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1537)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$collect$1.apply(DataFrame.scala:1542)
    at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$collect$1.apply(DataFrame.scala:1542)
    at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2138)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1542)
    at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:1519)
    at example.SolrExample$.main(SolrExample.scala:25)
    at example.SolrExample.main(SolrExample.scala)
Caused by: java.lang.RuntimeException: No SolrDocument in queue (waited 60 seconds) while processing cursorMark=AoFQivG3Bg==, read 3657552 of 3657625 so far. Most likely this means your query's sort criteria is not generating stable results for computing deep-paging cursors, has the index changed? If so, try using a filter criteria the bounds the results to non-changing data.
    at com.lucidworks.spark.query.StreamingResultsIterator.next(StreamingResultsIterator.java:137)
    at com.lucidworks.spark.query.StreamingResultsIterator.next(StreamingResultsIterator.java:23)
    at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
    at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-03-28 08:41:44 INFO  SparkContext:58 - Invoking stop() from shutdown hook
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/static/sql,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/SQL/execution/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/SQL/execution,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/SQL/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/SQL,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/api,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/static,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/executors/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/executors,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/environment/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/environment,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/storage/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/storage,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/stages,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
2016-03-28 08:41:44 INFO  ContextHandler:843 - stopped o.s.j.s.ServletContextHandler{/jobs,null}
2016-03-28 08:41:44 INFO  SparkUI:58 - Stopped Spark web UI at http://10.238.236.149:4040
2016-03-28 08:41:44 INFO  MapOutputTrackerMasterEndpoint:58 - MapOutputTrackerMasterEndpoint stopped!
2016-03-28 08:41:44 INFO  MemoryStore:58 - MemoryStore cleared
2016-03-28 08:41:44 INFO  BlockManager:58 - BlockManager stopped
2016-03-28 08:41:44 INFO  BlockManagerMaster:58 - BlockManagerMaster stopped
2016-03-28 08:41:44 INFO  OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:58 - OutputCommitCoordinator stopped!
2016-03-28 08:41:44 INFO  SparkContext:58 - Successfully stopped SparkContext
2016-03-28 08:41:44 INFO  ShutdownHookManager:58 - Shutdown hook called
2016-03-28 08:41:44 INFO  ShutdownHookManager:58 - Deleting directory /private/var/folders/tm/g1rplzrd2qg33f2_k9tcn01438z5jv/T/spark-1e1cf82f-5876-4b56-a3fd-302e11ff3e54
2016-03-28 08:41:44 INFO  RemoteActorRefProvider$RemotingTerminator:74 - Shutting down remote daemon.
kiranchitturi commented 8 years ago

@hakanilter Can you confirm something for me?

There is an error message in the above logs that says:

java.lang.RuntimeException: No SolrDocument in queue (waited 60 seconds) while processing cursorMark=AoFQivG3Bg==, read 3657552 of 3657625 so far. Most likely this means your query's sort criteria is not generating stable results for computing deep-paging cursors, has the index changed? If so, try using a filter criteria the bounds the results to non-changing data.

Can you confirm whether indexing was happening during this Spark query?
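
If it was, one workaround (which the error message itself suggests) is to bound the query to data that does not change while the job runs, using a filter query. A minimal sketch, assuming your schema has a date field such as `timestamp_tdt` (hypothetical, substitute your own) and that `solr.params` accepts ampersand-joined Solr parameters:

    val options = Map(
      "zkHost" -> zkHosts,
      "collection" -> collection,
      "fields" -> fields,
      // the fq pins the cursor to documents indexed before the job started,
      // so concurrent indexing cannot destabilize the deep-paging sort
      "solr.params" -> "timeAllowed=0&fq=timestamp_tdt:[* TO 2016-03-28T00:00:00Z]"
    )
    val df = sqlContext.read.format("solr").options(options).load()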

Also, it looks like the failure happened 16 minutes after the query started?

That is much longer than any usual Spark query. Could you give me some info on your Spark setup? How much memory is each worker allocated?

Since your collection is pretty big, it would be a good idea to use splits, rather than whole shards, to parallelize the queries. This will increase the number of partitions.

Example of how to use the splits config:

    val options = Map(
      "zkHost" -> zkHosts,
      "collection" -> collection,
      "fields" -> fields,
      "solr.params" -> "timeAllowed=0",
      "splits" -> "true"
    )

There is also another param, `splits_per_shard`, that controls how many splits to create per shard: https://github.com/lucidworks/spark-solr/blob/master/src/main/scala/com/lucidworks/spark/util/ConfigurationConstants.scala#L13. The default is 20.
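
For example, a sketch combining that param with the splits options above (the value 30 is only an illustration; the param key is the one used later in this thread):

    val options = Map(
      "zkHost" -> zkHosts,
      "collection" -> collection,
      "fields" -> fields,
      "solr.params" -> "timeAllowed=0",
      "splits" -> "true",
      // raise the default of 20 to create more, smaller partitions per shard
      "splits_per_shard" -> "30"
    )
    val df = sqlContext.read.format("solr").options(options).load()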

Even without splits, 16 minutes is very poor performance. Could you share some info about your Solr collection, such as the total number of docs?

kiranchitturi commented 8 years ago

Can you build from the latest master? I have added a fix that logs the number of rows fetched from each shard.

hakanilter commented 8 years ago

@kiranchitturi Everything works fine now. The last problem was my own mistake: I was trying to run the query on my local machine :) Thank you for your help.

kiranchitturi commented 8 years ago

Awesome! Glad to hear it is working now.

ranqiqiang commented 8 years ago

I have the same error!

No SolrDocument in queue (waited 60 seconds) while processing cursorMark=AoE/ATAwMDQ2MzU1NTVjMDE3NjAwMTU1YzA1MjRiZmYwMjhh, read 40809 of 40822 so far from http://10.25.68.151:8080/solr/search4_thin_instancedetail_shard1_replica1. Most likely this means your query's sort criteria is not generating stable results for computing deep-paging cursors, has the index changed? If so, try using a filter criteria the bounds the results to non-changing data.

My Spark conf:

    spark.cores.max=12 \
    spark.executor.memory=1G \
    spark.driver.memory=2G \
    spark.scheduler.mode=FAIR \
    spark.streaming.concurrentJobs=16 \
    spark.streaming.receiver.maxRate=5000 \

Solr params:

    .set("splits_per_shard", "30");
    .set("use_export_handler", "true");