Open Polydynamical opened 3 years ago
Can you send a link to that module here? I'm curious to see how it works. Maybe I can implement that in x.json2.
It's basically a JSON parser that reads from a stream and parses as it reads, instead of reading the whole file in and parsing it all at once.
Definitely something we need. Can use the io.Reader interface for buffered input.
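For anyone who wants to see the parse-as-you-read idea concretely, here is a minimal Python sketch using only the standard library. It is not how ijson works internally and it is not a proposal for the x.json2 API; the function name `iter_array_items` and the `chunk_size` parameter are made up for illustration. It assumes the input is one top-level JSON array, reads it in small chunks from any file-like object, and yields each element as soon as that element is complete, so memory use is bounded by the largest single element rather than by the whole file.

```python
import io
import json


def iter_array_items(stream, chunk_size=64):
    """Yield the elements of a top-level JSON array one at a time.

    Hypothetical helper for illustration only. `stream` is any text
    file-like object with a read(n) method. It is lenient about commas
    and is not a strict JSON validator.
    """
    decoder = json.JSONDecoder()
    buf = ""          # unparsed text read so far
    started = False   # True once the opening "[" has been consumed
    eof = False       # True once the stream is exhausted
    while True:
        buf = buf.lstrip()
        if buf and not started:
            if buf[0] != "[":
                raise ValueError("expected a top-level JSON array")
            started = True
            buf = buf[1:]
            continue
        if buf and buf[0] == ",":
            buf = buf[1:]  # skip the separator between elements
            continue
        if buf and buf[0] == "]":
            return  # end of the array
        if buf:
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                if eof:
                    raise  # incomplete or invalid element, no more input
            else:
                # Accept the value only if something follows it (or the
                # input has ended); otherwise a number such as 123 could
                # be cut short at a chunk boundary and parsed as 12.
                if end < len(buf) or eof:
                    yield value
                    buf = buf[end:]
                    continue
        if eof:
            raise ValueError("unexpected end of input")
        chunk = stream.read(chunk_size)
        if chunk == "":
            eof = True
        else:
            buf += chunk


# Usage: elements arrive one by one while the input is still being read.
for item in iter_array_items(io.StringIO('[1, {"a": [2, 3]}, "x", 45]'), chunk_size=4):
    print(item)
```

The sketch re-parses an incomplete element each time a new chunk arrives, which is fine for many small records but wasteful for very large ones. An event-based (SAX-style) parser avoids that by keeping parser state between chunks.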
@nedpals https://pypi.org/project/ijson/
@JalonSolov Yes, exactly. I use JSON files for NLP with Python, but Python is slow. I am planning to move that work to V, and this would be a great module to add.
Many parser and reader APIs in V should IMHO be "iterative" by default (I call them "lazy views" with zero-copy semantics), as discussed e.g. in https://github.com/vlang/v/issues/1732#issuecomment-527823969 and other places (e.g. HTTP response parsing, including cookies etc.).
Unfortunately this "zero copy" (or "lazy parsing/reading" or "fragment parsing/reading") paradigm is not well known, and people tend to dismiss it as more complicated (it's actually only slightly more complicated, so that argument is a straw man), but the benefits are enormous in practice - that's why e.g. the gumbo HTML5 parser became so popular.
Another real-life example how to do "lazy/streaming parsing" well and at scale: https://github.com/cloudflare/lol-html (from Cloudflare).
The RapidJSON page about SAX and iterative parsing gave me an idea of how I would implement this feature, but I'll try to make it as adaptable as possible to the current parser.
Once you get it working it should likely be the default. Although keeping the current parsing as an alternative could be handy for when the size is known... especially when it is known to be small.
In Python, there is a module "ijson" to parse giant JSON files. Is it possible to do the same with V?