Open jon-whit opened 1 year ago
Hi, it doesn't support distributed facts out of the box but it's possible.
I can think of 2 approaches so far:
We can treat a Prolog interpreter as a Raft state machine and each query as a log entry.
We can insert and delete facts in a dynamic predicate with assertz/1 and retract/1 respectively. If we update the predicate only via Raft, all the nodes will stay in sync.
?- assertz(likes(yutaka, sushi)).
true.
?- assertz(likes(yutaka, pizza)).
true.
?- likes(Who, What).
What = sushi,
Who = yutaka;
What = pizza,
Who = yutaka;
?- retract(likes(_, pizza)).
true.
?- likes(Who, What).
What = sushi,
Who = yutaka;
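To make the state-machine idea concrete, here is a minimal sketch in plain Go that does not use ichiban/prolog or any Raft library. The Entry, Node, and Replay names are hypothetical, and facts are restricted to ground binary facts for brevity. It only illustrates that replicas applying the same committed log in the same order end up with the same facts.

```go
package main

import "fmt"

// Op is the kind of update carried by a log entry.
type Op int

const (
	Assert  Op = iota // plays the role of assertz/1
	Retract           // plays the role of retract/1
)

// Fact is a ground binary fact such as likes(yutaka, sushi).
type Fact struct {
	Functor string
	Args    [2]string
}

// Entry is a hypothetical committed log entry; a real Raft library defines its own.
type Entry struct {
	Op   Op
	Fact Fact
}

// Node stands in for one Prolog interpreter and keeps facts in insertion order.
type Node struct {
	Facts []Fact
}

// Apply applies one committed entry. It must be deterministic so replicas agree.
func (n *Node) Apply(e Entry) {
	switch e.Op {
	case Assert:
		n.Facts = append(n.Facts, e.Fact)
	case Retract:
		// Like retract/1, remove only the first matching fact.
		for i, f := range n.Facts {
			if f == e.Fact {
				n.Facts = append(n.Facts[:i], n.Facts[i+1:]...)
				return
			}
		}
	}
}

// Replay builds a node's state by applying the log from the beginning.
func Replay(entries []Entry) *Node {
	n := &Node{}
	for _, e := range entries {
		n.Apply(e)
	}
	return n
}

func main() {
	entries := []Entry{
		{Assert, Fact{"likes", [2]string{"yutaka", "sushi"}}},
		{Assert, Fact{"likes", [2]string{"yutaka", "pizza"}}},
		{Retract, Fact{"likes", [2]string{"yutaka", "pizza"}}},
	}
	// Two replicas that see the same log hold the same facts.
	a, b := Replay(entries), Replay(entries)
	fmt.Println(a.Facts, b.Facts)
}
```

In a real deployment the Apply step would presumably run the assertz/1 or retract/1 goal against the local interpreter, and reads would stay local.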
We can write custom predicates to wrap a distributed data store.
likes(Who, What) :- db_get("likes/2", [Who, What]).
assert_likes(Who, What) :- db_put("likes/2", [Who, What]).
p := prolog.New(nil, nil)
p.Register2(engine.NewAtom("db_put"), func(vm *engine.VM, key, value engine.Term, cont engine.Cont, env *engine.Env) *engine.Promise {
	// Writes to the distributed data store.
	return cont(env) // Placeholder: succeed once the write is done.
})
p.Register2(engine.NewAtom("db_get"), func(vm *engine.VM, key, value engine.Term, cont engine.Cont, env *engine.Env) *engine.Promise {
	// Reads from the distributed data store.
	return cont(env) // Placeholder: a real implementation would unify value with what it read.
})
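As a rough sketch of what db_put and db_get could delegate to, the following plain Go program models the data store behind an interface. The Store interface, MemStore, and the bound helper are hypothetical names, and it does not touch the ichiban/prolog API. A nil entry in the pattern plays the role of an unbound Prolog variable.

```go
package main

import "fmt"

// Store is a hypothetical interface over the backing data store.
type Store interface {
	Put(key string, tuple []string)
	// Get returns the tuples under key that match pattern; a nil element
	// plays the role of an unbound variable.
	Get(key string, pattern []*string) [][]string
}

// MemStore is an in-memory stand-in for a real distributed store.
type MemStore struct {
	tuples map[string][][]string
}

func NewMemStore() *MemStore {
	return &MemStore{tuples: map[string][][]string{}}
}

func (s *MemStore) Put(key string, tuple []string) {
	s.tuples[key] = append(s.tuples[key], tuple)
}

func (s *MemStore) Get(key string, pattern []*string) [][]string {
	var out [][]string
	for _, t := range s.tuples[key] {
		if matches(t, pattern) {
			out = append(out, t)
		}
	}
	return out
}

// matches reports whether every bound position in pattern equals the tuple's value.
func matches(tuple []string, pattern []*string) bool {
	if len(tuple) != len(pattern) {
		return false
	}
	for i, p := range pattern {
		if p != nil && *p != tuple[i] {
			return false
		}
	}
	return true
}

// bound wraps a constant so it can appear in a pattern.
func bound(s string) *string { return &s }

func main() {
	var db Store = NewMemStore()
	db.Put("likes/2", []string{"yutaka", "sushi"})
	db.Put("likes/2", []string{"yutaka", "pizza"})

	// Corresponds to ?- likes(Who, pizza).
	for _, t := range db.Get("likes/2", []*string{nil, bound("pizza")}) {
		fmt.Println("Who =", t[0])
	}
}
```

A db_get predicate built on this would still need to turn each returned tuple into a solution, which this sketch leaves out.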
@ichiban I'll have to explore those options a little more and familiarize myself with the predicate design in general. It seems that both of these options would end up going over the network to fetch a predicate's term and value for each and every predicate term involved in the Prolog query. That could be extremely chatty for even a moderately sized query.
I'm wondering if the same is possible using local storage through an embedded store such as badgerdb, bbolt, or SQLite. Then the database could be replicated through some other means. For example, for SQLite one could use rqlite for more regional replication. This way you avoid the network hop to evaluate each and every term and keep the lookup local. With NVMe storage and a solid key/value database implementation on top of it, these queries could be very fast, persistent, and still replicated in a reasonably consistent manner.
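One way to picture the local lookup is to encode each fact as a single ordered key and answer a partially bound query with a prefix scan. The sketch below uses a sorted slice in place of a real store, and the key layout, factKey, and LocalKV are all assumptions for illustration. It also assumes arguments never contain a NUL byte.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sep separates key components; arguments are assumed not to contain it.
const sep = "\x00"

// factKey encodes a predicate indicator and its arguments as one ordered key.
func factKey(pred string, args ...string) string {
	return strings.Join(append([]string{pred}, args...), sep)
}

// LocalKV keeps keys sorted, standing in for an ordered on-disk store.
type LocalKV struct {
	keys []string
}

// Put inserts key at its sorted position and ignores duplicates.
func (kv *LocalKV) Put(key string) {
	i := sort.SearchStrings(kv.keys, key)
	if i < len(kv.keys) && kv.keys[i] == key {
		return
	}
	kv.keys = append(kv.keys, "")
	copy(kv.keys[i+1:], kv.keys[i:])
	kv.keys[i] = key
}

// Scan returns every key that starts with prefix, in key order.
func (kv *LocalKV) Scan(prefix string) []string {
	var out []string
	for i := sort.SearchStrings(kv.keys, prefix); i < len(kv.keys) && strings.HasPrefix(kv.keys[i], prefix); i++ {
		out = append(out, kv.keys[i])
	}
	return out
}

func main() {
	kv := &LocalKV{}
	kv.Put(factKey("likes/2", "yutaka", "sushi"))
	kv.Put(factKey("likes/2", "yutaka", "pizza"))
	kv.Put(factKey("likes/2", "jon", "pizza"))

	// Corresponds to ?- likes(yutaka, What). with the first argument bound.
	for _, k := range kv.Scan(factKey("likes/2", "yutaka") + sep) {
		parts := strings.Split(k, sep)
		fmt.Println("What =", parts[2])
	}
}
```

Note that this layout stores facts as a set in key order, so it loses duplicate facts and clause order, and it only helps when the leading arguments are bound.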
ichiban/prolog stores dynamic predicates in memory, so the former reads facts from the snapshot in local memory.
The latter can be either remote or local depending on which data store you choose. Since you wanted it persistent and distributed, I had etcd and rqlite in mind. If the data store is distributed, reads will be served from your local machine, as you mentioned.
Either way, you can avoid reads over the network.
@ichiban what am I doing wrong here?
package main

import (
	"log"
	"strings"

	"github.com/ichiban/prolog"
	"github.com/ichiban/prolog/engine"
	_ "github.com/mattn/go-sqlite3"
)

func main() {
	reader := strings.NewReader(`
assert_likes(Who, What) :- db_put("likes/2", [Who, What]).`)
	p := prolog.New(reader, nil)
	p.Register2(engine.NewAtom("db_put"), func(vm *engine.VM, key, value engine.Term, cont engine.Cont, env *engine.Env) *engine.Promise {
		// Writes to the distributed data store.
		log.Printf("(key term) %v, (value term) %v\n", key, value)
		return cont(env)
	})

	err := p.Exec(`assert_likes("jon", "pizza").`)
	if err != nil {
		log.Fatalf("assert_likes failed with error: %v", err)
	}
}
I don't see the print statement from the custom predicate.
@jon-whit Sorry for the late reply.
Because of some not-so-great design decisions I made on the interface and naming, your program neither loaded the Prolog text nor executed a query.
To load a Prolog text, you can use Exec(), and to execute a query against it, you can use Query(), QuerySolution(), or their variants that take a context.Context.
package main

import (
	"log"

	"github.com/ichiban/prolog"
	"github.com/ichiban/prolog/engine"
)

func main() {
	// Passing a Prolog text as user_input doesn't load it automatically.
	p := prolog.New(nil, nil)
	p.Register2(engine.NewAtom("db_put"), func(vm *engine.VM, key, value engine.Term, cont engine.Cont, env *engine.Env) *engine.Promise {
		// An idiom to convert an atom to a Go string.
		var table string
		switch k := env.Resolve(key).(type) {
		case engine.Variable:
			return engine.Error(engine.InstantiationError(env))
		case engine.Atom:
			table = k.String()
		default:
			return engine.Error(engine.TypeError(engine.NewAtom("atom"), k, env))
		}

		var vals []any
		iter := engine.ListIterator{List: value, Env: env}
		for iter.Next() {
			// Convert atomic terms to nil, string, int64, or float64.
			switch v := env.Resolve(iter.Current()).(type) {
			case engine.Variable:
				vals = append(vals, nil)
			case engine.Atom:
				vals = append(vals, v.String())
			case engine.Integer:
				vals = append(vals, int64(v))
			case engine.Float:
				vals = append(vals, float64(v))
			default: // i.e. engine.Compound
				return engine.Error(engine.TypeError(engine.NewAtom("atomic"), v, env))
			}
		}
		if err := iter.Err(); err != nil {
			return engine.Error(err)
		}

		// Writes to the distributed data store.
		// e.g. db.Exec(`INSERT INTO likes(who, what) VALUES (?, ?)`, vals...)
		log.Printf("(key term) %v, (value term) %v\n", table, vals)
		return cont(env)
	})

	// We can use Exec() to feed a Prolog text.
	// Exec() was a poor name: it loads the Prolog text rather than executing it.
	// I tried to imitate database/sql so that it looks as approachable as SQL.
	if err := p.Exec(`assert_likes(Who, What) :- db_put('likes/2', [Who, What]).`); err != nil {
		log.Fatalf("exec failed with error: %v", err)
	}

	// Now we can Query*() to actually execute it.
	// In this case, we can use QuerySolution() since we are interested in a single solution.
	if err := p.QuerySolution(`assert_likes('jon', 'pizza').`).Err(); err != nil {
		log.Fatalf("assert_likes failed with error: %v", err)
	}
}
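The commented-out db.Exec call above hard-codes the table and columns. As one possible way to derive the statement from the key instead, the sketch below parses a predicate indicator such as likes/2 and builds a parameterized INSERT. The insertSQL helper is hypothetical. It assumes the key arrives as the plain string likes/2 and that a table named after the functor exists with columns in argument order.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// identRE restricts table names to plain identifiers, because a table name
// cannot be passed as a query parameter and must not carry SQL.
var identRE = regexp.MustCompile(`^[a-z][a-zA-Z0-9_]*$`)

// insertSQL builds a parameterized INSERT for an indicator like "likes/2".
// Assumption: the table is named after the functor and has one column per argument.
func insertSQL(indicator string, vals []any) (string, error) {
	name, arityStr, ok := strings.Cut(indicator, "/")
	if !ok {
		return "", fmt.Errorf("malformed predicate indicator %q", indicator)
	}
	arity, err := strconv.Atoi(arityStr)
	if err != nil || arity < 1 {
		return "", fmt.Errorf("invalid arity in %q", indicator)
	}
	if arity != len(vals) {
		return "", fmt.Errorf("%s expects %d values, got %d", indicator, arity, len(vals))
	}
	if !identRE.MatchString(name) {
		return "", fmt.Errorf("unsafe table name %q", name)
	}
	placeholders := strings.TrimSuffix(strings.Repeat("?, ", arity), ", ")
	return fmt.Sprintf("INSERT INTO %s VALUES (%s)", name, placeholders), nil
}

func main() {
	vals := []any{"jon", "pizza"}
	stmt, err := insertSQL("likes/2", vals)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The statement and vals would then go to db.Exec(stmt, vals...).
	fmt.Println(stmt) // prints: INSERT INTO likes VALUES (?, ?)
}
```

Validating the table name matters here because only the values, not the identifier, are protected by the ? placeholders.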
I am also interested in storing all facts in a database (SQL / NoSQL). My first use case would be a stateless web service where I would store the facts in Redis so that I can run multiple instances of the service on the same fact base. In my case I don't want to deal with assert*, so it must / should be transparent. The rules would be static and loaded from a script.
A tip on where I could start in the code would be nice; then I would take a look at it if necessary.
I'm interested in using ichiban/prolog for a project, but I need to be able to persist and distribute Prolog facts. My use case requires a multi-region deployment topology for high availability, but my application requirements are a good fit for a Prolog-derived query model.
Would something like this be possible for this project? If so, where would I start? I'd like to contribute if so.