Closed shulhi closed 10 years ago
Wow, this is great to know, and thank you for the fix. I am a bit surprised anyone is using the Hadoop code, actually, so it's nice to know that it's still working after all the Hadoop API updates since we wrote it.
It is still working fine, although there are a few warnings about not using the latest API. I'll try to update it to the latest API when I have the time. Thanks again.
There are two bugs when running the Hadoop RI.
The first occurs when `--inputFormat` is either `TEXT` or `SPARSE_TEXT`. Somehow, the buffer didn't get flushed, even on `close()`, so I manually flush the buffer every time the write method is called. The buffer was also not flushed when writing the file header (done in `writeEmptyHeader()`).

The second: given input such as `you know nothing Jon Snow`, it writes the vector for every word except the last one, `Snow`. The missing word is not necessarily the last word in the sentence, since that depends on how the words were sorted during the mapper-reducer phase, but one of the words will definitely be missing when writing to file.

Anyway, thanks for the great package!
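For anyone reading along, here is a rough sketch of the flush-on-every-write idea behind the first fix. It is not the code from this PR or from the package: `FlushingVectorWriter`, `writeHeader`, and `FlushDemo` are hypothetical names made up for illustration, and the one-line text format is an assumption. It only shows that after an explicit `flush()` the header and each vector reach the underlying stream without relying on `close()`.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical writer (not the package's class): flushes after the header
// and after every vector so nothing is left sitting in the buffer.
class FlushingVectorWriter implements AutoCloseable {
    private final BufferedWriter out;

    FlushingVectorWriter(OutputStream stream) {
        this.out = new BufferedWriter(
            new OutputStreamWriter(stream, StandardCharsets.UTF_8));
    }

    // Writes an assumed "count length" header line, then flushes.
    void writeHeader(int numVectors, int vectorLength) throws IOException {
        out.write(numVectors + " " + vectorLength);
        out.write('\n');
        out.flush();
    }

    // Writes one "word v1 v2 ..." line, then flushes.
    void write(String word, double[] vector) throws IOException {
        StringBuilder line = new StringBuilder(word);
        for (double v : vector) {
            line.append(' ').append(v);
        }
        out.write(line.toString());
        out.write('\n');
        out.flush();
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}

class FlushDemo {
    // Returns what has reached the sink before close() runs.
    static String visibleBeforeClose() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (FlushingVectorWriter writer = new FlushingVectorWriter(sink)) {
            writer.writeHeader(1, 2);
            writer.write("Snow", new double[] {1.0, 0.5});
            // Evaluated before the writer is closed by try-with-resources.
            return new String(sink.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(visibleBeforeClose());
    }
}
```

Flushing on every write costs some throughput compared with letting the buffer fill, so this trades speed for the guarantee that no record is lost if `close()` does not flush as expected.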