aalkilani / spark-kafka-cassandra-applying-lambda-architecture

Chapter 5 - Advanced Streaming Operations: Evaluating Approximation Performance with Zeppelin: Demo #29

Closed mateos-alliaj closed 6 years ago

mateos-alliaj commented 6 years ago

Hi,

I added the dependency "com.twitter:algebird-core_2.11:0.11.0" to use HyperLogLog. After saving the new dependency and switching back to the notebook, I got an error when running this paragraph:

    import com.twitter.algebird.{HyperLogLogMonoid, HLL}
    import org.apache.spark.streaming.State

    // Declared as a 'case object' so that it is Serializable
    case object functions {
      // State function: merges the incoming HLL (if any) into the stored HLL
      // for this key and returns the updated unique-visitor estimate.
      def mapVisitorsStateFunc = (k: (String, Long), v: Option[HLL], state: State[HLL]) => {
        val currentVisitorHLL = state.getOption().getOrElse(new HyperLogLogMonoid(12).zero)
        val newVisitorHLL = v match {
          case Some(visitorHLL) => currentVisitorHLL + visitorHLL
          case None => currentVisitorHLL
        }
        state.update(newVisitorHLL)
        val output = newVisitorHLL.approximateSize.estimate
        output
      }
    }

    :28: error: object algebird is not a member of package com.twitter
           import com.twitter.algebird.{HyperLogLogMonoid, HLL}

Please help me to solve this problem. Thanks!

mateos-alliaj commented 6 years ago

Hi, I found the solution in the closed issue "Error when Create Kafka Receiver on Zeppelin #10": use "com.twitter:algebird-core_2.11:0.12.3" instead of "com.twitter:algebird-core_2.11:0.11.0".

Thank you for the solution Ahmad!
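
For anyone who hits the same error, here is a minimal sketch of one way to load the working artifact from inside the notebook. It assumes your Zeppelin version still ships the `%dep` interpreter (newer releases manage dependencies in the interpreter settings instead), and that the Spark interpreter is restarted first, since `%dep` only takes effect before the Spark context starts. This thread does not say which of the two routes the course setup uses, so treat this as an alternative, not the official fix.

```scala
%dep
// Assumption: run this paragraph right after restarting the Spark interpreter,
// before any %spark paragraph has executed.
z.reset()                                         // drop previously loaded artifacts
z.load("com.twitter:algebird-core_2.11:0.12.3")   // version reported as working above
```

After that, re-running the paragraph containing `import com.twitter.algebird.{HyperLogLogMonoid, HLL}` is a quick way to check that the package now resolves.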