twitter / cassovary

Cassovary is a simple big graph processing library for the JVM
http://twitter.com/cassovary
Apache License 2.0
1.05k stars 150 forks source link

Remove the unnecessary "() =>" in `def iteratorSeq` from com.twitter.cassovary.util.io.GraphReader #136

Closed pankajgupta closed 9 years ago

pankajgupta commented 9 years ago

It doesn't seem to me that this is necessary. The original idea I guess was to ensure that the files are read during graph construction and not upfront. But even if we just have Seq[Iterator[NodeIdEdgesMaxId]] only the iterators are constructed and not actually traversed. Hence it seems like the extra empty parentheses are un-necessary.

Would you agree @szymonm ?

szymonm commented 9 years ago

Maybe the intention was to open the file on call? Have no idea what else could that be.

It is also puzzling that we don't need to close the file.

Read 2nd answer from: http://stackoverflow.com/questions/4458864/whats-the-right-way-to-use-scala-io-source and it turns out that we should actually avoid scala's Source.fromFile()...

pankajgupta commented 9 years ago

Aha -- not closing the file is definitely a bug that we should fix.

I wouldn't go to the extent to completely avoid io.Source but yes should take care of exceptions

But this will need to be modified too in AdjacencyListGraphReader as this will be evaluated as soon as the class is created: val (edgesBySource, nodeMaxOutEdgeId) = readEdgesBySource()

szymonm commented 9 years ago

That is true, but () => Iterator is still weird. IMO Iterable looks better than this, if some initialization is needed.

I would like to see there sth like

def iteratorSeq: Iterable[NodeStream]

where

NodeStream extends Iterable[NodeIdEdgesMaxId] with Closeable {..}
pankajgupta commented 9 years ago

Agreed Iterable is good.

AnishShah commented 9 years ago

I'm working on this. I did as @szymonm said. Then, do I also have to change Seq[() => Iterator[NodeIdEdgesMaxId]] to Iterable[NodeStream] ?

AnishShah commented 9 years ago

Sorry. I mean, do I have to change Seq[() => Iterator[NodeIdEdgesMaxId]] to Iterable[NodeStream] in ArrayBasedDirectedGraph and SharedArrayBasedDirectedGraph ?

szymonm commented 9 years ago

Pls squash the commits, @AnishShah when this will be ready to merge.

AnishShah commented 9 years ago

@szymonm Done.