stacycurl / pimpathon

Adds useful methods to scala & java classes.
Apache License 2.0
35 stars 9 forks source link

file.readBytes very inefficient #200

Closed fommil closed 9 years ago

fommil commented 9 years ago

rather a lot of objects created during this call. One would expect a more canonical Java approach, e.g. http://stackoverflow.com/questions/858980/file-to-byte-in-java

(but don't make this depend on Java 7)

stacycurl commented 9 years ago

My development philosophy is to initially write pimps as succinctly as possible because that helps drive out more pimps, but after that I'm completely ok with almost any implementation strategy to make them perform well.

Perhaps I should start benchmarking, like this: https://github.com/non/cats/commit/a88810ca4c47ad93aa409f815a1127f8a1a5745e

stacycurl commented 9 years ago

I've rewritten this in version 1.4.2, would you retest and tell me if it's ok ? To get 1.4.2 you'll need to add the bintray resolver, at some point (a month or so) I'll roll all 1.4.x changes into 1.5.0 and publish to sonatype as usual.

Do you still need 2.9 support, it's no big deal merging the changes but I'm having some trouble publishing, I think it's travis having to run 4 builds & publish 4 versions at once to bintray that giving me some grief, if I only need 2.10 & 2.11 then I might be able to use crossVersions or something and have only one build.

fommil commented 9 years ago

hmm, I can't use bintray at work unless there is a maven mirror.

fommil commented 9 years ago
raf.length().asInstanceOf[Int]

this might fail, there is an .intValue that doesn't involve casting.

fommil commented 9 years ago

This looks somewhat similar (albeit, backwards, to avoid creating vals)

 def readBytes(): Array[Byte] = new RandomAccessFile(file, "r").withFinally(_.close())(raf ⇒ {
      new Array[Byte](raf.length().asInstanceOf[Int]).tap(raf.read)
    })

to (the working for me)

    val ram = new RandomAccessFile(f, "r")
    val bytes = Array.ofDim[Byte](f.length.intValue())
    ram.read(bytes)
    new String(bytes, cs)

but your String methods always go through readLines which breaks the interpretation of unix/windows newline characters.

stacycurl commented 9 years ago

I interpreted that as: readBytes is fine but readString should use readBytes rather than readLines, is that so ?

Can you provide a failing test to lock in something that doesn't break the interpretation of unix/windows newline characters ?

On 1 March 2015 at 12:51, Sam Halliday notifications@github.com wrote:

This looks somewhat similar

def readBytes(): Array[Byte] = new RandomAccessFile(file, "r").withFinally(_.close())(raf ⇒ { new ArrayByte.tap(raf.read) })

to (the working for me)

val ram = new RandomAccessFile(f, "r")
val bytes = Array.ofDim[Byte](f.length.intValue())
ram.read(bytes)
new String(bytes, cs)

but your String methods always go through readLines which breaks the interpretation of unix/windows newline characters.

— Reply to this email directly or view it on GitHub https://github.com/stacycurl/pimpathon/issues/200#issuecomment-76595428.

Stacy

fommil commented 9 years ago

that's correct. I don't know how to create String instances with linefeeds in them... and git might actually convert them. Maybe create a raw Array[Byte] and write it out to file during the test?

fommil commented 9 years ago

hmm, there is "\r\n" that should produce a String with the windows encoding. That ought to work in the tests.

stacycurl commented 9 years ago

\r\n does force the new implementation, thanks.

Do you still need 2.9 support ?

fommil commented 9 years ago

If it's not too much trouble. We're releasing 1.0 at the end of the current sprint.

stacycurl commented 9 years ago

It's not much trouble.

On 2 March 2015 at 08:10, Sam Halliday notifications@github.com wrote:

If it's not too much trouble. We're releasing 1.0 at the end of the current sprint.

— Reply to this email directly or view it on GitHub https://github.com/stacycurl/pimpathon/issues/200#issuecomment-76672251.

Stacy

stacycurl commented 9 years ago

Sorry to badger you about this: Do you need both 2.9.2 & 2.9.3 ?

fommil commented 9 years ago

Yup, sorry

stacycurl commented 9 years ago

No worries.