nmadhire closed 9 years ago
Organized the code a little bit to make it more general. Let me know if you see anything else that can be improved.
It looks much better after your changes. I'm willing to let this PR be merged.
As part of the next batch of work, it would be good to abstract how things are parsed away from ComputeStats. In this case countURI has to know about link.ids, which is inherently tied to the way jsonpedia is structured. It would be good if the parser exposed certain interfaces, so that ComputeStats does not need to know any of these internals.
For example, counting URIs could look similar to:
def countURI() {
  parser.getURIS().map(..function with count magic..)
}
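To make the suggestion above a bit more concrete, here is a minimal sketch of what such a parser interface might look like. Everything except the names countURI and getURIS is an assumption for illustration: the EntityParser trait, the InMemoryParser stand-in, and the ComputeStatsSketch object are hypothetical, and plain Scala collections are used in place of Spark so the snippet runs on its own.

```scala
// Hypothetical interface: hides how URIs are extracted from the source data.
trait EntityParser {
  def getURIS(): Seq[String]
}

// Hypothetical stand-in for a jsonpedia-backed parser. Each record is modelled
// here as just its list of link ids; the real record layout is not assumed.
final class InMemoryParser(records: Seq[Seq[String]]) extends EntityParser {
  // Flatten the per-record link ids into one sequence of URIs.
  def getURIS(): Seq[String] = records.flatten
}

object ComputeStatsSketch {
  // Counts occurrences of each URI without knowing where the URIs came from.
  def countURI(parser: EntityParser): Map[String, Int] =
    parser.getURIS().groupBy(identity).map { case (uri, hits) => uri -> hits.size }
}
```

With this shape, only the parser implementation needs to know about link.ids. If getURIS() returned a Spark RDD[String] instead of a Seq, the counting step could presumably become map(uri => (uri, 1)).reduceByKey(_ + _), with countURI still unaware of the jsonpedia layout.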
This is good to go now. I will change the ComputeStats logic in the next PR tomorrow.
Pull Request for reviewing Entity counts using Apache Spark and Scala.