In Python I've been working with some JSON that has a lot of kanji in it, and the parser fails when decoding its internal buffer to UTF-8 if a read happens to split a multi-byte kanji sequence in the wrong place. I've been using ijson because the JSON files can be quite large (2 GB) and the standard parser wants to read the whole file into memory, which caused swapping and made my machine crawl, but I may have to fall back to it if I can't figure out how to deal with this situation.
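For what it's worth, the failure can be reproduced in isolation, and the stdlib `codecs.getincrementaldecoder` is one way to handle chunk boundaries safely — it buffers an incomplete trailing byte sequence and emits the character once the rest arrives. This is just a minimal sketch of the decoding problem, not a fix specific to ijson's internals:

```python
import codecs

text = "漢字"                  # two kanji, three UTF-8 bytes each
data = text.encode("utf-8")   # six bytes total

# A naive decode of an arbitrary chunk fails when the chunk
# boundary lands inside a multi-byte sequence:
chunk = data[:4]              # splits the second kanji
try:
    chunk.decode("utf-8")
except UnicodeDecodeError:
    pass                      # this is the failure described above

# An incremental decoder holds the dangling bytes and completes
# the character when the next chunk is fed in:
dec = codecs.getincrementaldecoder("utf-8")()
out = dec.decode(data[:4]) + dec.decode(data[4:], final=True)
assert out == text
```

Wrapping the raw byte stream in `io.TextIOWrapper(f, encoding="utf-8")` achieves the same buffering at the file level, if the parser accepts a text-mode stream.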