aplbrain / npyjs

Read numpy .npy files in JavaScript
https://aplbrain.github.io/npyjs/
Apache License 2.0
77 stars 21 forks source link

Various issues in code #40

Open bgutter opened 11 months ago

bgutter commented 11 months ago

Hello -- some things I noticed in the code which I think may be worth mentioning.

  1. Errors for headers larger than 255 bytes: In parse(), when reading the header length, a uint8 is being read from the DataView. It should actually be a uint16 in little-endian ordering. Existing code does not work for files with headers >255 bytes. Not a common situation but one that I've run into, especially when logging arrays of structures with named fields.

Fix would be replacing getUint8( offset ) with getUint16( offset, true ), where true indicates little-endian ordering.

  1. Errors when dtype description has more than one "(": When translating the header content from 'Python dict' format to 'JSON' format, there are three calls to String.replace() to substitute single quotes for double quotes, and, parentheses to square brackets. I'm guessing this is for py-tuple to js-array conversion. Two of the calls use regular expressions as the match argument, and one uses a string. Unfortunately, String.replace() does not behave consistently for these inputs. When the argument is a regular expression, String.replace() replaces all instances of the match argument. When the argument is a string, it replaces only the first. In this case, what happens is that ALL ")" characters are replaced with "]", but, only the FIRST "(" is converted to a "[". I suspect this is not intended.

  2. Errors for data segments not aligned to word boundaries: Various arrayConstructor function values expect data samples to be aligned to 4-byte word boundaries ("<i4", for example). When this is not the case, an error is thrown. If you want to read the data into arrays using these constructor functions, to accommodate situations where the first data address is not at a word boundary, the data should first be sliced into a new ArrayBuffer before being passed into the constructor.

j6k4m8 commented 11 months ago

Wow @bgutter thank you so much for the thoughtful comments and review — I noticed we had some dtype incompatibility but got sidetracked and wasn't able to address it before. Hope this hasn't been a showstopper for you!

Will triage and address these shortly!

BTW — if you happen to have shareable files that induced these errors, I'd love to add them to the test-suite! Or I am happy to debug locally with them and not push publicly...