twitter / cassovary

Cassovary is a simple big graph processing library for the JVM
http://twitter.com/cassovary
Apache License 2.0
1.05k stars 150 forks source link

Remove the unnecessary "() =>" in `def iteratorSeq` #137

Closed AnishShah closed 9 years ago

AnishShah commented 9 years ago

I made an abstract class as @szymonm said.

abstract class NodeStream extends Iterable[NodeIdEdgesMaxId] with Closeable {..}

Then, I used it in AdjacencyListGraphReader.scala & ListOfEdgesGraphReader.scala.

I implemened the close() method to close the file.

Is this correct? I haven't removed "() => " in def iteratorSeq. I will do it after reviews on this.

AnishShah commented 9 years ago

@pankajgupta Thanks, I will fix this in sometime.

AnishShah commented 9 years ago

@pankajgupta @szymonm - Please check the last commit. I have removed () => from GraphReader. On Line 65 & 74

iteratorSeq.map(iter => {() => iter.iterator}).toSeq

Here, iter.iterator is returning empty. I haven't removed () => from all the files. But this should still work, right? I don't know why it is returning empty.

AnishShah commented 9 years ago

@pankajgupta @szymonm

I think there's a bug in SharedArrayBasedDirectedGraph.scala because of which this is failing.

The problem is, we are reading the same iterator twice (first in readNumOfEdgesAndMaxId, and second in readOutEdges) .

For eg:

scala> val list = List(1, 2)
list: List[Int] = List(1, 2)

scala> val it = list.iterator
it: Iterator[Int] = non-empty iterator

scala> val it1 = it
it1: Iterator[Int] = non-empty iterator

scala> val it2 = it
it2: Iterator[Int] = non-empty iterator

scala> it1.next()
res2: Int = 1

scala> it1.next()
res3: Int = 2

scala> it1.next()
java.util.NoSuchElementException: next on empty iterator
  at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
  at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
  at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:47)
  ... 43 elided

scala> it2.next()
java.util.NoSuchElementException: next on empty iterator
  at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
  at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
  at scala.collection.LinearSeqLike$$anon$1.next(LinearSeqLike.scala:47)
  ... 43 elided

I haven't made any changes in SharedArrayBasedDirectedGraph.scala and when I added log.info() after readNumOfEdgesAndMaxId and it said iteratorFunc as empty iterator. Please correct me if I'm wrong. (I don't know how it was working properly till now.)

pankajgupta commented 9 years ago

Oh I see -- yes SharedArrayBasedDirectedGraph is reading it twice. Earlier the () => Iterator type gave a new iterator both times. So what we need to do is to change Seq[() => Iterator] to Seq[NodeStream] and replace the iteratorFunc() to iteratorFunc.iterator

AnishShah commented 9 years ago

There's another problem then.. If we change Seq[() => Iterator] to Seq[NodeStream], then in TestGraph.scala, we are creating graphs using NodeIdEdgesMaxId. But, we can't create using NodeStream because it is just a trait.

pankajgupta commented 9 years ago

Aah, we missed that earlier. Actually, seems to me that it will be better to delete NodeStream, go back to NodeIdEdgesMaxId and change Seq[() => Iterator] to Seq[Iterable[NodeIdEdgesMaxId]].

The close() should be handled by the files providing implementation of OneShardReader that actually use and read from Source -- i.e., in ListOfEdgesGraphReader and AdjacencyListGraphReader by closing the file they have opened.

@szymonm since you had suggested the NodeStream extend Closeable, is this ok by you? doesn't seem like NodeStream needs to be provided only by a file source.

AnishShah commented 9 years ago

I'm not able to understand why it is still giving empty iterator even though we are using Iterable[NodeIdEdgesMaxId] in SharedArrayBasedDirectedGraph.scala. Any help? @pankajgupta @szymonm

AnishShah commented 9 years ago

Thanks @pankajgupta!

AnishShah commented 9 years ago

I guess, Something is wrong with Travis.

$ sbt ++$TRAVIS_SCALA_VERSION test
Detected sbt version 0.13.7
Downloading sbt launcher for 0.13.7:
  From  http://typesafe.artifactoryonline.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/0.13.7/sbt-launch.jar
    To  /home/travis/.sbt/launchers/0.13.7/sbt-launch.jar
Download failed. Obtain the jar manually and place it at /home/travis/.sbt/launchers/0.13.7/sbt-launch.jar

The command "sbt ++$TRAVIS_SCALA_VERSION test" exited with 1.

Done. Your build exited with 1.
AnishShah commented 9 years ago

Sorry for closing & reopening PR twice. Something was wrong with Travis.

All benchmarks and examples works except RandomWalkJava. Though the error was not because of my changes, I have corrected it.

This should be ready for merge, I guess. Tell me if any changes are required.