worksofliam / blog

Blog
28 stars 5 forks source link

Storing and searching JSON in Db2 for i #33

Open worksofliam opened 5 years ago

worksofliam commented 5 years ago

I have recently been looking into storing unstructured data (JSON) into Db2. This is not only good for storing JSON, but if you want to store information that can expand and not have to worry about adding more columns - instead you can just add items to the JSON object in the row.

We're going to take a look at the following Db2 for i functions:

Document table

Our document table is very simple. Two columns, one unique auto-incrementing ID and another blob for the document. We will be storing our JSON objects as BLOB in Db2. Using JSON_TO_BSON will preparse the JSON document and store it in the binary format - better for performance when selectinf data.

create table barry.docs (
    did integer not null GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1),
    dcontent blob,
    PRIMARY KEY (did)
);

Inserting documents

Inserting documents is nice and simple. Since we have an auto-incrementing column we only need to insert a row with the document.

insert into barry.docs (dcontent) values(JSON_TO_BSON('{
    "_id": "5d52dff137b5a4c6c178545d",
    "index": 0,
    "balance": "$1,617.84",
    "picture": "http://placehold.it/32x32",
    "name": "Haley Parrish",
    "company": "MAXEMIA",
    "email": "haleyparrish@maxemia.com",
    "phone": "+1 (865) 403-2141",
    "address": "954 Bond Street, Albany, Iowa, 9076",
    "favoriteFruit": "apple"
  }'));

Notice here we are using JSON_TO_BSON and passing in a string parameter which is the JSON document. This takes our JSON document we're passing in and converts it to a blob format. When we use JSON_VALUE, we can pass in either a string JSON object or the blob column, which is great.

Since you probably want to test with a lot of data, here is an SQL script which will insert 5000 JSON documents into a docs table.

Selecting rows

To select all rows and get each JSON object out, we can use BSON_TO_JSON to convert our blob back to a JSON document. Note that this is slow because it will run on each row.

select BSON_TO_JSON(dcontent) from barry.docs

If you want to select a row based on a value from the JSON document, you can use the JSON_VALUE function. It takes two parameters: the JSON blob/string and the path to the value you want.

select BSON_TO_JSON(dcontent) from barry.docs where JSON_VALUE(dcontent, '$.index') = 3;

JSON_VALUE can also let the statement provide what type it's going to return - this will actually enhance performance later on (big time) when selecting specific values from the document.

--slower and also returns our index as a string
select did, JSON_VALUE(dcontent, '$.index') from barry.docs where JSON_VALUE(dcontent, '$.favoriteFruit' returning char(6)) = 'banana';

--faster
select did, JSON_VALUE(dcontent, '$.index' returning integer) from barry.docs where JSON_VALUE(dcontent, '$.favoriteFruit' returning char(6)) = 'banana';
--slower
SELECT 
    did, 
    JSON_VALUE(dcontent, '$.name') AS NAME, 
    JSON_VALUE(dcontent, '$.company') AS company, 
    JSON_VALUE(dcontent, '$.email') AS email
    FROM barry.docs
    WHERE 
        JSON_VALUE(dcontent, '$.favoriteFruit' returning CHAR(5)) = 'apple';

--faster defining some types
SELECT 
    did, 
    JSON_VALUE(dcontent, '$.name' returning CHAR(20)) AS NAME, 
    JSON_VALUE(dcontent, '$.company' returning CHAR(20)) AS company, 
    JSON_VALUE(dcontent, '$.email' returning CHAR(50)) AS email
    FROM barry.docs
    WHERE 
        JSON_VALUE(dcontent, '$.favoriteFruit' returning CHAR(5)) = 'apple';

In another post, I will write about updating the data and also creating some manual caches to index specific values in our objects to make them faster to query.

mikeb909 commented 8 months ago

I'm interested in some examples of updating data. Would you have some links or examples you could share?