A basic Lua binding to simdjson. The simdjson library is an incredibly fast JSON parser that uses SIMD instructions and fancy algorithms to parse JSON very quickly. It's been tested with LuaJIT 2.0/2.1 and Lua 5.1, 5.2, 5.3, and 5.4 on Linux, macOS, and Windows. It has a general parsing mode and a lazy mode that uses a JSON pointer.
Current simdjson version: 3.10.1
If all the requirements are met, lua-simdjson can be installed via luarocks with:
luarocks install lua-simdjson
Otherwise it can be installed manually by pulling the repo and running luarocks make.
There are two main ways to parse JSON in lua-simdjson:
- parse: this parses JSON and returns a Lua table with the parsed values
- open: this reads in the JSON and keeps it in simdjson's internal format. The values can then be accessed using a JSON pointer (examples below)

Both of these methods also have support to read files on disk with parseFile and openFile respectively. If handling JSON from disk, these methods should be used and are incredibly fast.
- lua-simdjson uses simdjson.null to represent null values from parsed JSON.
- It uses lua_pushnumber and lua_pushinteger for JSON floats and ints respectively, so your Lua version may handle that slightly differently.
- lua_pushinteger uses signed ints. A number from JSON larger than LUA_MAXINTEGER will be represented as a float/number.

The parse methods will return a normal Lua table that can be interacted with.
local simdjson = require("simdjson")
local response = simdjson.parse([[
{
    "Image": {
        "Width": 800,
        "Height": 600,
        "Title": "View from 15th Floor",
        "Thumbnail": {
            "Url": "http://www.example.com/image/481989943",
            "Height": 125,
            "Width": 100
        },
        "Animated": false,
        "IDs": [116, 943, 234, 38793]
    }
}
]])
print(response["Image"]["Width"])

-- OR to parse a file from disk
local fileResponse = simdjson.parseFile("jsonexamples/twitter.json")
print(fileResponse["statuses"][1]["id"])
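
The typing notes above can be illustrated with a short sketch. It is not taken from the lua-simdjson sources: the describe helper and the sample JSON are made up for illustration, math.type and math.maxinteger only exist on Lua 5.3 or newer, and the simdjson part runs only if the module is installed.

```lua
-- Sketch of the typing rules described above.
-- `describe` is a hypothetical helper, not part of lua-simdjson.
local has_simdjson, simdjson = pcall(require, "simdjson")

-- Classify a value the way a caller might after parsing.
-- `null_sentinel` is the value used for JSON null (simdjson.null in practice).
local function describe(value, null_sentinel)
  if null_sentinel ~= nil and value == null_sentinel then
    return "null"
  end
  if type(value) == "number" then
    -- math.type separates "integer" from "float" on Lua 5.3+;
    -- older versions only have a single number type.
    return math.type and math.type(value) or "number"
  end
  return type(value)
end

-- Lua's own integer/float split, independent of any JSON library.
if math.maxinteger then
  print(math.type(math.maxinteger))        -- integer
  print(math.type(math.maxinteger + 1.0))  -- float
end

if has_simdjson then
  -- Sample document invented for this sketch.
  local doc = simdjson.parse('{"missing": null, "count": 3, "ratio": 0.5}')
  print(describe(doc.missing, simdjson.null))
  print(describe(doc.count, simdjson.null))
  print(describe(doc.ratio, simdjson.null))
end
```

Comparing against simdjson.null explicitly matters because a plain nil check cannot tell a JSON null apart from a key that is absent.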
The open methods currently require the use of a JSON pointer, but are very quick. They are best used when you only need a part of a response. In the example below, it could be useful for just getting the Thumbnail object with :atPointer("/Image/Thumbnail"), which will then only create a Lua table with those specific values.
local simdjson = require("simdjson")
local response = simdjson.open([[
{
    "Image": {
        "Width": 800,
        "Height": 600,
        "Title": "View from 15th Floor",
        "Thumbnail": {
            "Url": "http://www.example.com/image/481989943",
            "Height": 125,
            "Width": 100
        },
        "Animated": false,
        "IDs": [116, 943, 234, 38793]
    }
}
]])
print(response:atPointer("/Image/Width"))

-- OR to parse a file from disk
local fileResponse = simdjson.openFile("jsonexamples/twitter.json")
print(fileResponse:atPointer("/statuses/0/id")) -- using a JSON pointer
Starting with version 0.2.0, the atPointer method is JSON pointer compliant. The previous pointer implementation is considered deprecated, but is still available with the at method.
The open and parse code blocks should print out the same values. It's worth noting that the JSON pointer indexes from 0, while the Lua table returned by parse indexes from 1.
This lazy style of using the simdjson data structure could also be used with array access in the future.
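
To make the 0-based versus 1-based difference concrete, here is a minimal pure-Lua sketch that resolves a JSON pointer against an ordinary Lua table, such as one returned by parse. The resolve_pointer function is hypothetical and is not how lua-simdjson implements atPointer; it only mirrors the RFC 6901 rules on the Lua side. Because Lua tables do not distinguish objects from arrays, it tries the token as a string key first and only then as a 0-based array index.

```lua
-- Resolve an RFC 6901 JSON pointer against a plain Lua table.
-- Hypothetical helper for illustration; not part of lua-simdjson.
local function resolve_pointer(root, pointer)
  if pointer == "" then
    return root -- the empty pointer refers to the whole document
  end
  if pointer:sub(1, 1) ~= "/" then
    error("a non-empty JSON pointer must start with '/'")
  end
  local node = root
  for token in pointer:gmatch("/([^/]*)") do
    if type(node) ~= "table" then
      return nil
    end
    -- Unescape in the order the RFC requires: ~1 first, then ~0.
    token = token:gsub("~1", "/"):gsub("~0", "~")
    local value = node[token]
    if value == nil and token:match("^%d+$") then
      -- JSON pointer index 0 is Lua index 1.
      value = node[tonumber(token) + 1]
    end
    node = value
  end
  return node
end

local doc = { statuses = { { id = 101 }, { id = 202 } } }
print(resolve_pointer(doc, "/statuses/0/id")) -- 101
print(doc.statuses[1].id)                     -- 101
```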
lua-simdjson raises an error whenever simdjson reports one while parsing. The error messages are very good at helping identify what has gone wrong.
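
If you would rather handle a parse failure than let it propagate, one option is to wrap the call in pcall. The sketch below assumes the errors surface as ordinary Lua errors; try_parse is a hypothetical helper name, and the exact message text depends on simdjson.

```lua
-- Call any parser function and turn a raised error into a return value.
-- `parser` would typically be simdjson.parse; try_parse itself is a
-- hypothetical helper, not part of lua-simdjson.
local function try_parse(parser, text)
  local ok, result = pcall(parser, text)
  if ok then
    return result, nil
  end
  return nil, tostring(result)
end

-- Only exercised when lua-simdjson is actually installed.
local has_simdjson, simdjson = pcall(require, "simdjson")
if has_simdjson then
  local value, err = try_parse(simdjson.parse, '{"unterminated": ')
  if err then
    print("parse failed: " .. err)
  else
    print("parsed a value of type " .. type(value))
  end
end
```

Check the second return value rather than the first, since a valid document can legitimately parse to false.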
I ran some benchmarks against lua-cjson, rapidjson, and dkjson. For each test, I loaded the JSON into memory, and then had the parsers go through each file 100 times and took the average time it took to parse to a Lua table. You can see all the results in the benchmark folder. I've included a sample output run via Lua (the LuaJIT graph looks very similar, also in the benchmark folder). The y-axis is logarithmic, so every half step down is twice as fast.
I also calculated the throughput for each of the files to show how it may affect real-world performance. You can also find a LuaJIT version in the benchmarks folder.
All tested files are in the jsonexamples folder.
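
The measurement described above (load the JSON into memory, parse it 100 times, average the time, derive a throughput) can be sketched roughly as follows. This is not the script in the benchmark folder: the benchmark function is a hypothetical helper, and os.clock reports CPU time at limited resolution, so treat the numbers as indicative only.

```lua
-- Time `parser(text)` over `iterations` runs.
-- Returns the average seconds per parse and the throughput in MiB/s.
-- Hypothetical helper; not the author's benchmark script.
local function benchmark(parser, text, iterations)
  iterations = iterations or 100
  local start = os.clock()
  for _ = 1, iterations do
    parser(text)
  end
  local average = (os.clock() - start) / iterations
  -- Guard against a zero reading when the clock resolution is too coarse.
  local throughput = average > 0
    and (#text / average) / (1024 * 1024)
    or math.huge
  return average, throughput
end

-- Only runs when lua-simdjson and the sample file are both available.
local has_simdjson, simdjson = pcall(require, "simdjson")
local file = io.open("jsonexamples/twitter.json", "rb")
if file then
  local text = file:read("*a") -- load the whole file into memory first
  file:close()
  if has_simdjson then
    local average, throughput = benchmark(simdjson.parse, text, 100)
    print(string.format("avg %.6f s, %.1f MiB/s", average, throughput))
  end
end
```

Passing the parser in as an argument lets the same loop time other decoders on identical input.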
lua-simdjson, like the simdjson library, performs better on more modern hardware. These benchmarks were run on a ninth-gen i7 processor. On an older processor, rapidjson may perform better.
I plan to keep it fairly in line with what the original simdjson library is capable of doing, which really means not adding too many additional options. The big thing that's missing so far is encoding a Lua table to JSON. I may add in an encoder at some point.