prathyvsh / database-readings

A collection of resources to construct databases with an emphasis on performing abstract/combinatorial searches
8 stars 1 forks source link

The following is a collection of resources to construct databases. My initial line of research is in constructing a database that allows for abstraction and combinators in constructing the search clauses.

** Tools

*** [[https://github.com/alf-tool/alf][Alf]] Alf is a modern query language, rooted in relational algebra, and implemented as a Ruby Domain Specific Language.

** [[https://github.com/datalogui/datalog][Datalog UI]] An implementation of Datalog with a focus on managing UIs & UI state.

Following is a collection of resources for understanding the construction of Datalog:

** [[https://www2.cs.sfu.ca/CourseCentral/721/jim/DatalogPaper.pdf][What You Always Wanted to Know About Datalog (And Never Dared to Ask)]] Looks like a neat paper on the internals of Datalog.

** [[https://www.arrdem.com/2018/05/17/shelving_building_a_datalog/][Building a Datalog for Fun! and Profit?]]

Speaker notes on constructing a Datalog in Clojure. [[Video is also available.][https://youtu.be/U4ckAGtv_OI]]

** [[https://dodisturb.me/posts/2018-12-25-The-Essence-of-Datalog.html][The Essence of Datalog]]

A blogpost on constructing a simple Datalog engine in Haskell to understand its semantics.

** [[http://aosabook.org/en/500L/an-archaeology-inspired-database.html][An Archaeology Inspired Database]] A blogpost on building a database with Datalog in Clojure in less than 500 lines of code. h/t [[https://twitter.com/toxi][Karsten Schmidt]]

** [[https://www3.cs.stonybrook.edu/~liu/papers/Rules-TOPLAS09.pdf][From Datalog Rules to Efficient Programs with Time and Space Guarantees]] An article on transforming Datalog queries for maximum efficiency in time and space dimensions.

h/t [[https://twitter.com/willb][William Benton]]

** [[https://github.com/frankmcsherry/blog/blob/master/posts/2018-05-19.md][A relatively simple Datalog engine in Rust]]

** Books

[[http://webdam.inria.fr/Alice/][Foundations of Databases / Alice Book]] [[https://amzn.to/34XH0ve][Foundations of deductive databases and logic programming]]

** To Investigate

*** History of Datalog

I kind of know that Datalog have Horn Clauses as their primitives but need to investigate exactly how they figure in terms of representation and generality in logic. What kind of logic are they capable of expressing with them? What kind of lattice/structure is their underlying model? What does this preclude / exclude? How does Blake Canonical Form link with Horn Clauses? Is minimization of boolean algebra somehow related to unification?

** [[http://web.cs.iastate.edu/~hridesh/teaching/362/07/01/papers/p50-liskov.pdf][Programming with Abstract Data Types (1974)]] Barbara Liskov, Stephen Zilles

** [[https://www.cs.utexas.edu/~wcook/Drafts/2009/essay.pdf][On Understanding Data Abstraction, Revisited (2009)]] William Cook

Slides: http://web.cecs.pdx.edu/~black/OOP/slides/Cook%20on%20DataAbstraction.pdf

** [[https://www.ijcai.org/Proceedings/16/Papers/619.pdf][Tabling as a Library with Delimited Control]] Use of continuations for enabling tabling support in Prolog

** Complexity and Datalog https://iccl.inf.tu-dresden.de/w/images/3/33/DBT2016-Lecture-10.pdf https://www.scottaaronson.com/democritus/lec19.html

** Material on Data Integration from Stanford http://logic.stanford.edu/dataintegration/

** [[https://www.sti-innsbruck.at/sites/default/files/thesis/christoph-fuchs-thesis-final-09-2008.pdf][Extension of a Datalog Reasonerwith Top-Down Evaluation]]

*** Doop https://plast-lab.github.io/doop-pldi15-tutorial/

*** [[https://souffle-lang.github.io/][Soufflé]]

Dedalus Papers: [[https://dsf.berkeley.edu/papers/datalog2011-dedalus.pdf][Dedalus: Datalog in Time and Space]]

*** [[http://bloom-lang.net/][Bloom]]

Talks by Peter Alvaro: [[https://www.youtube.com/watch?v=R2Aa4PivG0g][I See What You Mean]] [[https://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012/Bloom-Disorderly-Programming-for-a-Distributed-World][Bloom: Disorderly Programming for a Distributed World]]

*** Datafun

**** [[https://www.youtube.com/watch?v=gC295d3V9gE][Datafun: a functional query language]]

** Interesting use cases

*** Type checker

**** [[https://users.soe.ucsc.edu/~cormac/papers/ppdp05.pdf][Automatic Type Inference via Partial Evaluation]]

**** [[https://petevilter.me/post/datalog-typechecking/][Datalog Typechecking]]

**** [[https://github.com/HarvardPL/formulog][Formulog]] [[http://www.weaselhat.com/2020/08/07/formulog-ml-datalog-smt/][Blogpost]]

*** Programming Synthesis

**** [[http://pages.cs.wisc.edu/~aws/papers/cp17.pdf][Constraint-Based Synthesis of Datalog Programs]]

**** [[http://pages.cs.wisc.edu/~aws/papers/fse18b.pdf][Syntax-Guided Synthesis of Datalog Programs]]

*** Disassembly

[[https://www.usenix.org/system/files/sec20fall_flores-montoya_prepub_0.pdf][Datalog Disassembly]] [[https://github.com/GrammaTech/ddisasm][Repo]]

*** Application in Neural Networks

**** [[https://arxiv.org/abs/2006.16723][Neural Datalog Through Time: Informed Temporal Modeling via Logical Specification]]

Linked data, triple store, and the RDF movement in the web space is well worth understanding to know how it evolved and failed to garner the traction to become mainstream. I feel there’s some good work done in this field.

** [[http://linkeddatabook.com/editions/1.0/][Linked Data Book]] Book on the ethos of linked data. h/t [[https://twitter.com/toxi][Karsten Schmidt]]

** [[https://software-carpentry.org/blog/2011/03/tuple-spaces-or-good-ideas-dont-always-win.html][Tuple Spaces (or, Good Ideas Don't Always Win)]]

** [[https://livingroomresearch.tumblr.com/post/171022587132/living-room-in-context][Living Room in Context]]

** [[https://www.cs.vu.nl/~eliens/media/@archive/refs/summBob.pdf][Logic Programming Languages for the Internet]]

** [[https://people.eecs.berkeley.edu/~sylvia/papers/hotos_2013.pdf][Large-Scale Computation Not at the Cost of Expressiveness]]