Hey @MusicConnectionMachine/group-2,

In your run, you just dumped the data as JSON, so @kordianbruck and I started bulk-loading it into the DB. We first loaded everything into a single table, `import.final`, with a single column, `data`. From there we wrote some teeny tiny SQL, exploiting Postgres' JSON capabilities 🎉.
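For reference, the load step itself can be as simple as a staging schema plus psql's `\copy` — a minimal sketch, assuming one JSON document per line; the file name `dump.json` and the `jsonb` column type are my assumptions, only the `import.final`/`data` names come from what we actually did:

```sql
-- Hypothetical staging setup (table/column names match the SQL below)
CREATE SCHEMA IF NOT EXISTS import;
CREATE TABLE import.final (data jsonb);

-- psql meta-command: bulk-load one JSON document per line
\copy import.final (data) FROM 'dump.json'
```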
Maybe you want to have this somewhere under version control…
```sql
-- Drop everything that references old data
TRUNCATE TABLE websites, contains CASCADE;

-- Add an id column (gen_random_uuid() needs the pgcrypto extension on PostgreSQL < 13)
ALTER TABLE import.final
  ADD COLUMN id TEXT NOT NULL DEFAULT gen_random_uuid();

-- Populate the websites table
INSERT INTO websites
SELECT CAST(id AS UUID),
       data->>'url'      AS url,
       data->>'bloburl'  AS blob_url,
       current_timestamp AS createdAt,
       current_timestamp AS updatedAt
FROM import.final;

-- Populate the contains table
WITH occurrences AS (
  SELECT id, data->>'occurrences' AS occ
  FROM import.final
), elems AS (
  SELECT id, elems
  FROM occurrences, json_array_elements(CAST(occ AS json)) AS elems
)
INSERT INTO contains
SELECT gen_random_uuid() AS id,
       json_build_object(
         'term',      e.elems->'term'->'value',
         'positions', e.elems->'positions'
       ) AS occurrences,
       current_timestamp AS createdAt,
       current_timestamp AS updatedAt,
       CAST(e.id AS UUID) AS websiteId,
       CAST(e.elems->'term'->>'entityId' AS UUID) AS entityId
FROM elems e;
```
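Judging from the casts in the last INSERT, each element of the dumped `occurrences` array presumably looks like `{"term": {"value": …, "entityId": "<uuid>"}, "positions": […]}` — that shape is inferred, not confirmed, but you can eyeball one element straight from the staging table:

```sql
-- Inspect the first occurrence of one row (shape inferred from the INSERT above)
SELECT CAST(data->>'occurrences' AS json)->0
FROM import.final
LIMIT 1;
```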