Open cfstout opened 9 years ago
Thanks for pointing this out. When I get a chance I'll look into this on our side.
Also, another strange issue with this area of code: we have an Index.db file that's 340 MB, which is causing an OOM error in this section of code. We're actually working on the 2.0 WIP branch, so it might be something to consider when adding that support. Basically, it seems like the issue is that the code creates large arrays of Longs that use up heap space. I don't know enough about SSTable particulars to know whether there is any way around this, though.
We have a working branch for 2.0.9 internally. The sstable format changed enough from 1.2 that it required us to change how we're parsing and reading the sstables. I'll make sure we have the latest committed here for others to use.
Please try the cassandra-2.0.x branch.
I have been using this code to create an MR job to run on AWS's Elastic MapReduce framework, and it seems that there might be a bug in the readIndex(final FileSystem fileSystem, final Path sstablePath) method. When we open the index using the nativeS3FileSystem, every call to inputStream.available() returns 0. I think the problem is due to the implementation of these inputStream objects, and not necessarily a problem with this repo's code itself. I have managed to work around the issue by moving the read code into a while(true) loop and breaking on an EOFException, which, though very hacky, seems to work. I'm not sure if there is a better solution to the problem, or if it's really an artifact of a bug upstream, but I thought I'd mention it here so others are aware.
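For anyone who hits the same thing, here is a minimal sketch of the workaround described above: read until EOFException rather than trusting available(). It is not this repo's actual readIndex code. The record layout (a plain sequence of longs) and the names ReadUntilEof, zeroAvailable, and readAllLongs are hypothetical, chosen only to make the example self-contained. The wrapper stream just imitates an input stream whose available() always reports 0.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class ReadUntilEof {

    // Hypothetical helper: wraps a stream so available() always reports 0,
    // imitating the behaviour described in the report above.
    static InputStream zeroAvailable(InputStream in) {
        return new FilterInputStream(in) {
            @Override
            public int available() {
                return 0;
            }
        };
    }

    // Reads longs until the stream ends. A loop such as
    // "while (in.available() > 0)" would exit immediately on a stream like
    // the one above, so end-of-data is detected via EOFException instead.
    // Assumption: the data is a plain sequence of 8-byte longs; a real
    // index file has a more involved record layout.
    static List<Long> readAllLongs(InputStream raw) throws IOException {
        List<Long> values = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(raw)) {
            while (true) {
                try {
                    values.add(in.readLong());
                } catch (EOFException eof) {
                    break; // no more complete records
                }
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(10L);
            out.writeLong(20L);
            out.writeLong(30L);
        }
        InputStream in = zeroAvailable(new ByteArrayInputStream(buffer.toByteArray()));
        System.out.println(readAllLongs(in)); // prints [10, 20, 30]
    }
}
```

One caveat with this approach: readLong() also throws EOFException when the stream ends partway through a value, so a truncated file looks the same as a clean end of data unless the caller checks for that separately.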