gbv / subjects-api

JSKOS Concept Occurrences Provider implementation
https://coli-conc.gbv.de/subjects/
MIT License

Include information about items and libraries #29

Open nichtich opened 2 years ago

nichtich commented 2 years ago

PostgreSQL is not only relevant for performance analysis (#26); it also makes it possible to extend the database, e.g. with information about libraries.

-- Technically a PPN is an integer with a check digit, so this check could be made stricter
CREATE DOMAIN ppn AS TEXT CHECK (VALUE ~* '^[0-9]+[0-9X]$');

CREATE TABLE IF NOT EXISTS Vocabulary (
  key text NOT NULL,
  jskos json NOT NULL DEFAULT '{}'::json,
  PRIMARY KEY (key),
  CONSTRAINT valid_key CHECK (key ~* '^[a-z]+$')
);

CREATE TABLE IF NOT EXISTS Title (
  ppn ppn NOT NULL,
  PRIMARY KEY (ppn)
);

-- For now this is the only table needed for occurrences-api
CREATE TABLE IF NOT EXISTS Subject (
  ppn ppn NOT NULL,
  voc text NOT NULL,
  notation text NOT NULL,
  FOREIGN KEY (voc) REFERENCES Vocabulary (key),
  FOREIGN KEY (ppn) REFERENCES Title (ppn)
);

CREATE TABLE IF NOT EXISTS Library (
  iln smallint NOT NULL,
  PRIMARY KEY (iln)
);

CREATE TABLE IF NOT EXISTS Item (
  epn int NOT NULL,
  ppn ppn NOT NULL,
  iln smallint NOT NULL,
  PRIMARY KEY (epn),
  FOREIGN KEY (ppn) REFERENCES Title (ppn),
  FOREIGN KEY (iln) REFERENCES Library (iln)
);

-- Allows inserting a row with a new ppn while keeping the foreign key on Title.ppn.
-- ON CONFLICT DO NOTHING skips PPNs that already exist, also under concurrent inserts.
CREATE OR REPLACE FUNCTION create_title_if_missing() RETURNS TRIGGER AS $$
BEGIN
  INSERT INTO Title (ppn) VALUES (NEW.ppn) ON CONFLICT (ppn) DO NOTHING;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS subject_unknown_ppn ON Subject;
CREATE TRIGGER subject_unknown_ppn
BEFORE INSERT OR UPDATE ON Subject
FOR EACH ROW EXECUTE PROCEDURE create_title_if_missing();
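
To illustrate what the trigger is meant to do, here is a minimal sketch. The vocabulary key, PPN and notation are made-up example values, not data from the actual database:

```sql
-- Example values only: 'bk', '123456789X' and '33.00' are assumptions.
INSERT INTO Vocabulary (key) VALUES ('bk');

-- No row for this PPN exists in Title yet. The BEFORE INSERT trigger
-- creates it, so the foreign key on Subject.ppn is satisfied.
INSERT INTO Subject (ppn, voc, notation)
VALUES ('123456789X', 'bk', '33.00');

-- The title row should now be present.
SELECT ppn FROM Title WHERE ppn = '123456789X';
```

Because foreign keys are checked after BEFORE triggers have run, the subject row can be inserted without creating the title first.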

Adding information about libraries would make it possible to limit queries to titles held by a specific library.
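
Such a restricted query could look roughly like the following sketch against the tables proposed above. The vocabulary key, notation and ILN are placeholder values:

```sql
-- Count titles indexed with a given notation that are held by one library.
-- 'bk', '33.00' and ILN 20 are example values, not taken from real data.
SELECT COUNT(DISTINCT s.ppn) AS occurrences
FROM Subject s
JOIN Item i ON i.ppn = s.ppn      -- a title is held if at least one item exists
WHERE s.voc = 'bk'
  AND s.notation = '33.00'
  AND i.iln = 20;
```

DISTINCT is needed because a library may hold several items (EPNs) of the same title.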

nichtich commented 2 years ago

Concept hierarchy and mapping inference could be added as well, to support queries for broad topic areas (e.g. everything in physics). This could use the PostgreSQL features WITH RECURSIVE and MATERIALIZED VIEW, or possibly Apache AGE later.
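
A rough sketch of how this might work, assuming a hypothetical table Broader that stores the hierarchy as (notation, broader notation) pairs. Neither this table nor the view name is part of the schema above, and 'bk' with top notation '33' is only an assumed example for a broad topic area:

```sql
-- Hypothetical hierarchy table: one row per direct broader relation.
CREATE TABLE IF NOT EXISTS Broader (
  voc text NOT NULL REFERENCES Vocabulary (key),
  notation text NOT NULL,
  broader text NOT NULL,
  PRIMARY KEY (voc, notation, broader)
);

-- Transitive closure of the hierarchy, precomputed once.
-- UNION (not UNION ALL) removes duplicates, so cycles cannot loop forever.
CREATE MATERIALIZED VIEW IF NOT EXISTS BroaderTransitive AS
WITH RECURSIVE closure(voc, notation, ancestor) AS (
  SELECT voc, notation, broader FROM Broader
  UNION
  SELECT c.voc, c.notation, b.broader
  FROM closure c
  JOIN Broader b ON b.voc = c.voc AND b.notation = c.ancestor
)
SELECT voc, notation, ancestor FROM closure;

-- All titles indexed with the top concept or any of its descendants.
SELECT DISTINCT s.ppn
FROM Subject s
WHERE s.voc = 'bk'
  AND (s.notation = '33'
       OR s.notation IN (SELECT notation FROM BroaderTransitive
                         WHERE voc = 'bk' AND ancestor = '33'));
```

After changes to Broader, the view would have to be updated with REFRESH MATERIALIZED VIEW BroaderTransitive.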

nichtich commented 1 year ago

A probably more flexible alternative is to use an RDF triple store as backend (#31).