INCATools / semantic-sql

SQL and SQLite builds of OWL ontologies
https://incatools.github.io/semantic-sql/
BSD 3-Clause "New" or "Revised" License
37 stars 3 forks source link

create predicate tables #3

Closed wdduncan closed 3 years ago

wdduncan commented 3 years ago

@cmungall We've discussed this before, but putting it in a ticket in case others want to weigh in.

For each predicate (i.e., object/data/annotation property) you may want to consider creating a separate table instead of having a predicate column. You can (of course) create views for each predicate, but AFIK sqlite does not support materialized views, and overuse of views can affect performance.

For instance, suppose you have a part_of table. A query to find everything that an entity is part of would look something like:

select
   s.subject, p.objects 
from 
  statements s
inner join 
  part_of p 
on 
  s.subject = p.subject

A query for your rdfslabel would be:

select distinct 
  s.subject, l.value, count(*)
from 
  statements s
inner join
  rdfslabel l
on
  s.subject = l.subject
group by
  s.subject, l.value
having
  count(*) > 1

These queries can (of course) be made into views.

Also, using predicate tables may allow for better handling of datatypes. That is, you may be able to specify if something is a sting, int, datetime, etc., depending on the table. For instance, if the part_of table always has a CURIE as the object, you can declare the datatype for the column as sting and check that the expression conforms. This could be quite helpful for datetime values, which I find myself often having to things like pull out the year or months parts, and order.

cmungall commented 3 years ago

create suggestion. It would be easy to autogenerate these from e.g. RO (in fact we did this wth OBD many years back..)

We might want to create separate views for existentials and aboxes - or just have a view over the edge table

views are great. performance wise they are identical to doing the more verbose expanded query, so there is no performance loss in using them. And they can always be materialized by a CREATE TABLE statement (yes, not a MV sensu stricto but good enough especially for warehouse type databases)

wdduncan commented 3 years ago

Cool :)

I assume by OBD you mean Ontology-Based Database. Is any of that code base reusable?

cmungall commented 3 years ago

See https://github.com/cmungall/semantic-sql#previous-work but I wouldn't spend too much time looking at it except as a curiosity. @balhoff still has nightmares about it

cmungall commented 3 years ago

this was the procedure for materializing views:

https://github.com/phenoscape/OBDAPI/blob/16f7bae75184bc438b21eb91a0cebef80438071f/sql/obd-matview-funcs.sql#L17-L52

balhoff commented 3 years ago

And the first reasoner I ever encountered! 😀

https://github.com/phenoscape/OBDAPI/blob/master/scripts/obd-reasoner.pl