danielkorzekwa / bayes-scala

Bayesian Networks in Scala

Graphical visualization of Categoricals using Spark Notebook #23

Open nightscape opened 9 years ago

nightscape commented 9 years ago

Hi Daniel,

I've finally managed to create a visualization for Bayesian Networks constructed from Categoricals. Check out the README of the Gist here: https://gist.github.com/nightscape/c2fcccac859b3ae34c99#file-readme-md

Could you check if it runs on your machine? If so we can think about how to maybe integrate this into bayes-scala :)

Best and thanks again for your help!
Martin

danielkorzekwa commented 9 years ago

Error on cell 11:

:17: error: not found: value dk
import dk.bayes.dsl.infer

In [6]:

:local-repo /tmp/snb/repo

res10: String = Repo changed to /tmp/snb/repo!

Out[6]: Repo changed to /tmp/snb/repo!

In [7]:

:remote-repo sonasnap % default % https://oss.sonatype.org/content/repositories/snapshots/

res11: String = Remote repo added: sonasnap % default % https://oss.sonatype.org/content/repositories/snapshots/!

Out[7]: Remote repo added: sonasnap % default % https://oss.sonatype.org/content/repositories/snapshots/!

In [10]:

:dp com.github.danielkorzekwa % bayes-scala_2.11 % 0.5-SNAPSHOT

warning: there were 2 feature warning(s); re-run with -feature for details
jars: Array[String] = [Ljava.lang.String;@4cdddd
res20: List[String] = List(/tmp/snb/repo/com/googlecode/efficient-java-matrix-library/ejml/0.20/ejml-0.20.jar, /tmp/snb/repo/com/github/fommil/netlib/netlib-native_ref-linux-armhf/1.1/netlib-native_ref-linux-armhf-1.1-natives.jar, /tmp/snb/repo/net/sf/opencsv/opencsv/2.3/opencsv-2.3.jar, /tmp/snb/repo/com/github/fommil/netlib/netlib-native_system-linux-x86_64/1.1/netlib-native_system-linux-x86_64-1.1-natives.jar, /tmp/snb/repo/com/github/fommil/netlib/netlib-native_ref-win-i686/1.1/netlib-native_ref-win-i686-1.1-natives.jar, /tmp/snb/repo/org/spire-math/spire_2.11/0.7.4/spire_2.11-0.7.4.jar, /tmp/snb/repo/org/scalanlp/breeze_2.11/0.11.2/breeze_2.11-0.11.2.jar, /tmp/snb/repo/org/scalanlp/breeze-macros_2.11/0.11.2/breeze-macros_2.11-0.11.2....

Out[10]:

In [11]:

import notebook.front.third.d3._
import notebook._, front._, widgets._
import notebook.JsonCodec._
import play.api.libs.json._
import play.api.libs.functional.syntax._
import dk.bayes.dsl.variable.Categorical
import dk.bayes.dsl.infer

// (display name, variable, labels of the variable's states)
type CategoricalWithInfo = (String, Categorical, Seq[String])

// Fetch the D3 rendering code from the Gist.
val loadedCode = {
  val source = scala.io.Source.fromURL("https://gist.githubusercontent.com/nightscape/c2fcccac859b3ae34c99/raw/d3_bayesian_network.js")
  val res = source.mkString
  source.close()
  res
}

case class ConditionalProbabilityTable(node: BnNode, parents: Seq[BnNode], probabilities: Seq[Double])
case class BnNode(name: String, states: Seq[String], currentState: Option[String] = None)
case class BnEdge(source: Int, target: Int)

// Converts named Categoricals into nodes (each with its CPT and its marginal) plus edges.
def categoricalsToNetwork(marginalizer: Categorical => Seq[Double], cptExtractor: Categorical => Seq[Double] = _.cpd)(categoricalsWithNames: Seq[CategoricalWithInfo]): (Seq[(BnNode, ConditionalProbabilityTable, ConditionalProbabilityTable)], Seq[BnEdge]) = {
  import breeze.linalg._
  import breeze.numerics._

  val nodes = categoricalsWithNames.map { case (name, categorical, states) =>
    // Map the observed state index, if any, to its label.
    val currentState = categorical.getValue().map(states.apply)
    BnNode(name, states, currentState)
  }
  val categoricals = categoricalsWithNames.map(_._2)
  val nodeMap = categoricals.zip(nodes).toMap

  val cpts = categoricals.zip(nodes).map { case (cat, node) =>
    val parents = cat.parents.map(nodeMap)
    val numCols = node.states.size
    val cpd = cptExtractor(cat)
    // Use the supplied marginalizer (previously unused; the caller below passes infer(cat).cpd).
    val inferredCpd = marginalizer(cat)
    val numRows = cpd.size / numCols
    val cptArray = cpd.toArray
    // Breeze matrices are column-major: build numCols x numRows, then transpose
    // so that each row holds the distribution for one parent configuration.
    val cpt = new DenseMatrix(numCols, numRows, cptArray).t
    (ConditionalProbabilityTable(node, parents, cpt.toArray),
     ConditionalProbabilityTable(node, Seq(), inferredCpd.toArray))
  }

  // One edge per (parent, child) pair, expressed as indices into `nodes`.
  val edges = nodeMap.flatMap { case (cat, node) =>
    val parents = cat.parents.map(nodeMap)
    parents.map(p => BnEdge(nodes.indexOf(p), nodes.indexOf(node)))
  }.toSeq

  (nodes.zip(cpts).map { case (n, (c, m)) => (n, c, m) }, edges)
}

// Play JSON writers for the network model.
object ConditionalProbabilityTable {
  implicit val conditionalProbabilityTableWrites: Writes[ConditionalProbabilityTable] = Json.writes[ConditionalProbabilityTable]
}

object BnNode {
  implicit val nodeWrites: Writes[BnNode] = Json.writes[BnNode]
}

object BnEdge {
  implicit val edgeWrites: Writes[BnEdge] = Json.writes[BnEdge]
}

// Serializes the network as {"nodes": [...], "edges": [...]}; each node carries its CPT and marginal.
def networkToJson(nodesWithCpts: Seq[(BnNode, ConditionalProbabilityTable, ConditionalProbabilityTable)], edges: Seq[BnEdge]): JsObject = {
  val nodeJs = nodesWithCpts.map { case (node, cpt, marginalized) =>
    Json.toJson(node).asInstanceOf[JsObject] + ("cpt", Json.toJson(cpt)) +
      ("marginalized", Json.toJson(marginalized))
  }
  Json.obj("nodes" -> Json.toJson(nodeJs), "edges" -> Json.toJson(edges))
}

val convertCategoricals: Seq[CategoricalWithInfo] => JsObject =
  (categoricalsToNetwork({ cat: Categorical => infer(cat).cpd }) _).andThen((networkToJson _).tupled)

// Only the Scala-to-JSON direction is needed; the reverse returns an empty Seq.
implicit val categoricalsCodec = new Codec[JsValue, Seq[CategoricalWithInfo]] {
  def encode(x: JsValue): Seq[CategoricalWithInfo] = Seq()
  def decode(x: Seq[CategoricalWithInfo]): JsValue = convertCategoricals(x)
}

// JavaScript run by the notebook playground: draw once, then redraw on every update.
val playgroundCode = s"""
function(dataPipe, e) {
  $loadedCode
  var bnGraph = BayesianNetworkGraph(e)
  bnGraph(this.dataInit[0])
  dataPipe.subscribe(function(d) {
    bnGraph(d[0])
  })
}
"""

()

:17: error: not found: value dk
import dk.bayes.dsl.infer
       ^
:16: error: not found: value dk
import dk.bayes.dsl.variable.Categorical
       ^

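For readers following the cell above, the reshaping step inside `categoricalsToNetwork` may be easier to see without Breeze. The sketch below is not part of the Gist or of bayes-scala. `cpdRows` is a hypothetical helper, and it assumes the flat CPD lists the node's own states fastest, which is what the `numCols = node.states.size` layout in the cell suggests. The numbers are made up for illustration.

```scala
// Hypothetical helper: split a flat CPD into one row per parent configuration.
// Assumes the node's own state index varies fastest in the flat sequence.
def cpdRows(cpd: Seq[Double], numStates: Int): Vector[Vector[Double]] = {
  require(numStates > 0 && cpd.size % numStates == 0,
    "CPD length must be a multiple of the number of states")
  cpd.grouped(numStates).map(_.toVector).toVector
}

// Illustrative binary node with a single binary parent (values are invented).
val rows = cpdRows(Seq(0.9, 0.1, 0.05, 0.95), numStates = 2)

// Each row is a distribution over the node's states, so it should sum to 1.
val rowsAreNormalized = rows.forall(r => math.abs(r.sum - 1.0) < 1e-9)
```

Under that assumption, `rows(0)` is the distribution given the parent's first state and `rows(1)` the distribution given its second state.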
nightscape commented 9 years ago

Ah, you're probably using the Scala 2.10 download of spark-notebook, right? Had the same problem and I think @andypetrella is fixing this as we speak :) In the meantime, you can use the Scala 2.11 download, that works for me.
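A quick way to check for this kind of mismatch from inside a notebook cell is to compare the interpreter's Scala binary version with the `_2.11` suffix of the dependency. This is only a sketch using the standard library; `matchesScalaVersion` is a hypothetical helper, not something spark-notebook provides.

```scala
// Full version of the Scala library the interpreter is running on.
val fullVersion = scala.util.Properties.versionNumberString

// Binary version, i.e. the first two components of the full version.
val binaryVersion = fullVersion.split('.').take(2).mkString(".")

// Hypothetical helper: does an artifact ID such as "bayes-scala_2.11"
// carry the suffix for the given Scala binary version?
def matchesScalaVersion(artifactId: String, binary: String): Boolean =
  artifactId.endsWith("_" + binary)

println(s"Running Scala $fullVersion; bayes-scala_2.11 matches: " +
  matchesScalaVersion("bayes-scala_2.11", binaryVersion))
```

If the check prints `false`, the artifact was built for a different Scala binary version than the one the notebook is running.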

andypetrella commented 9 years ago

@nightscape you're right man, @danielkorzekwa if you want you can also clone the current master branch and use it right away. I will probably release it (0.4.1) soon, but I want to be sure the people who noticed the problem are happy with the current fixes :-D

danielkorzekwa commented 9 years ago

Works for me, graphs are generated.