Open lpatiny opened 3 years ago
I'm glad you find this repository helpful. I'll try to address your issue ASAP. You can watch the repo for new changes or star it.
Data this large might not be a good fit for a web application. If it is being used on the backend, I'm not sure this library is really a good choice for your project. I believe a library that works on streams would be a better choice. ArrayBuffer support alone might not fully solve the problem.
Big data works pretty well in the browser for us. We process TIFF images of 1.5 GB (electron microscopy) in JavaScript in the browser without problems.
Indeed, some libraries work on streams, but this one is faster, which is why I was interested in this improvement.
Okay. To make it work properly for big data, we'll have to process streams. It is achievable, but it complicates the code and impacts overall performance. I'm tagging it as a feature request.
We adapted the code to suit our needs and to parse a large ArrayBuffer or Uint8Array directly.
We had to change many things so that we could also parse a base64-encoded value (as a typed array) into a Float64Array (and we still have a little work left on this).
Anyway, the new parser is working, and on my Mac mini M1 I can parse a 1 GB file in 4.5 s, which is reasonable.
https://www.npmjs.com/package/arraybuffer-xml-parser
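To illustrate the base64 step mentioned above, here is a minimal sketch of decoding base64 text held as bytes (a Uint8Array of ASCII codes) into a Float64Array without building an intermediate string. It is not taken from arraybuffer-xml-parser; `base64BytesToFloat64` is a hypothetical name, and the sketch assumes standard base64 with no whitespace, an input length that is a multiple of 4, a decoded length that is a multiple of 8, and doubles stored in the platform's byte order (little-endian on most machines).

```javascript
// Lookup table from an ASCII code to its 6-bit base64 value.
// '=' (code 61) is left at 0, which is harmless for padding.
const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
const B64_LOOKUP = new Uint8Array(256);
for (let i = 0; i < B64_ALPHABET.length; i++) {
  B64_LOOKUP[B64_ALPHABET.charCodeAt(i)] = i;
}

// Hypothetical helper: base64 bytes -> Float64Array.
// Assumes no whitespace and a length that is a multiple of 4.
function base64BytesToFloat64(b64) {
  const len = b64.length;
  let padding = 0;
  if (len > 0 && b64[len - 1] === 61) padding++;
  if (len > 1 && b64[len - 2] === 61) padding++;

  const out = new Uint8Array((len / 4) * 3 - padding);
  let o = 0;
  for (let i = 0; i < len; i += 4) {
    // Pack four 6-bit values into one 24-bit group.
    const n =
      (B64_LOOKUP[b64[i]] << 18) |
      (B64_LOOKUP[b64[i + 1]] << 12) |
      (B64_LOOKUP[b64[i + 2]] << 6) |
      B64_LOOKUP[b64[i + 3]];
    // Write up to three bytes, skipping those that belong to padding.
    if (o < out.length) out[o++] = (n >> 16) & 255;
    if (o < out.length) out[o++] = (n >> 8) & 255;
    if (o < out.length) out[o++] = n & 255;
  }
  // Reinterpret the decoded bytes as doubles (platform byte order).
  return new Float64Array(out.buffer);
}

// Usage: round-trip two doubles through base64.
const sourceValues = new Float64Array([1.5, -2]);
const b64Text = btoa(String.fromCharCode(...new Uint8Array(sourceValues.buffer)));
const decodedValues = base64BytesToFloat64(new TextEncoder().encode(b64Text));
console.log(Array.from(decodedValues));
```

The point of working on bytes is that the base64 payload never has to become a JavaScript string, so the string size limit does not apply to it.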
@amitguptagwl As far as I'm concerned, you may close this issue.
That's nice to hear. I'm keeping this issue open so it can be incorporated in a future release.
In the project https://github.com/cheminfo/mzData we are using fast-xml-parser to parse scientific data (mass spectra).
These data may be quite big, and it works perfectly even with files of 400 MB.
However, we may have files of 1 GB or more, and there is currently a text size limit (from JavaScript in Chrome) of 512 MB.
I wonder if it would be possible to accept an ArrayBuffer directly and not only a text file. The current code works almost exclusively on the array of chars, so most of the code could be compatible with ArrayBuffer, but it would need to deal with multi-byte characters.
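As a rough illustration of the multi-byte concern, the sketch below decodes a UTF-8 buffer in fixed-size chunks using the standard `TextDecoder` in streaming mode, which holds back an incomplete multi-byte sequence until the next chunk arrives. This is only one possible approach, not how fast-xml-parser works; `decodeInChunks` and `chunkSize` are hypothetical names.

```javascript
// Hypothetical helper: yield decoded text for each chunk of a UTF-8 buffer,
// so that no single string has to hold the whole file.
function* decodeInChunks(bytes, chunkSize = 1 << 20) {
  const decoder = new TextDecoder("utf-8");
  for (let offset = 0; offset < bytes.length; offset += chunkSize) {
    // subarray creates a view; it does not copy the data.
    const chunk = bytes.subarray(offset, offset + chunkSize);
    // stream: true buffers a multi-byte character split across chunks.
    const text = decoder.decode(chunk, { stream: true });
    if (text.length > 0) yield text;
  }
  // Flush anything still buffered in the decoder.
  const tail = decoder.decode();
  if (tail.length > 0) yield tail;
}

// Usage: a tiny chunk size forces a split inside the 2-byte "é".
const xmlBytes = new TextEncoder().encode("<m>é€</m>");
const textPieces = [...decodeInChunks(xmlBytes, 4)];
console.log(textPieces.join(""));
```

A parser that reads the bytes directly avoids even this decoding step for markup, since XML delimiters such as `<` and `>` are single-byte ASCII in UTF-8; only text content and attribute values need decoding.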