satazor / js-spark-md5

Lightning fast normal and incremental md5 for javascript
Do What The F*ck You Want To Public License
2.46k stars 469 forks source link

Added a version that uses typed arrays #6

Closed stakach closed 11 years ago

stakach commented 11 years ago

readAsBinaryString is deprecated and IE10 doesn't have this function so I translated the code for use with typed arrays as I'm primarily using the code to hash files.

I left it as a separate file as it will only work in modern browsers - IE10, Chrome 9, Firefox 15, etc

satazor commented 11 years ago

Going to look at it in the next couple of days, thanks!

stakach commented 11 years ago

Cool. I optimised it for modern browsers as they are the only ones that can use typed arrays. Should be mostly compatible with your example code too, just change

fileReader.readAsBinaryString(blobSlice.call(file, start, end)); to fileReader.readAsArrayBuffer(blobSlice.call(file, start, end));

The other breaking change is the string functions hash the native UTF16 encoding vs UTF8.

Since the original version is no longer be viable for hashing files, maybe there should be two versions, one for files and one for strings. Just a thought.

satazor commented 11 years ago

@stakach I am surprised about the deprecation of readAsBinaryString. Still I think we should maintain the same file.

This could be solved by adding appendArrayBuffer and hashArrayBuffer (static method) and dealing with things inside (without slowing things down). Do you see any disadvantages of this approach?

I will work on it as soon as I get some free time. Meanwhile please keep using your fork.

stakach commented 11 years ago

I assume w3c got rid of binary strings as they were inefficient (using 2 bytes to represent a single byte)

My main hesitation for the single file was that I thought you may have had a use case for hashing UTF8 strings, where my only use case is hashing files which can only be done on modern browsers and typed arrays. Meaning less code, faster downloads and no support for anything less then IE10.

So if hashing UTF8 strings is a use case that you want to support, then 2 separate files is better - possibly with a different global variable name for anyone who wants to do both.

satazor commented 11 years ago

yes I do have use cases for utf8 strings, but those are the normal append() and hash() (static) functions.

stakach commented 11 years ago

So the main blocker preventing a single file is the charCodeAt() is now '[]' in the array buffer version.

satazor commented 11 years ago

Alright, thanks for letting me know about it! Will soon work on this.

satazor commented 11 years ago

@stakach thanks for the contribution!

I ended up adding another class for the array buffer version. Though I'm experiencing wrong opera results when using the array buffer (using binary strings are ok). This can easily be tested in the test/file_reader.html and test/file_reader_deprecated.html.

Do you have also this issue in your fork? Going to create a separate issue for this.