typedb / typedb-driver-python

TypeDB Driver for Python
https://typedb.com
Apache License 2.0

Read a complete .gql file | insert behaviour #41

kristall closed 5 years ago

kristall commented 5 years ago

Problem to Solve

If you have a .gql file that contains both definitions and inserts, passing its whole content to a single query() call raises an exception:

from grakn.client import GraknClient

with open("/path/grakn/grakn-core-all-linux-1.5.0/tests/foo.gql") as f:
    definitions_and_inserts = f.read()

with GraknClient(uri="localhost:48555") as client:
    with client.session(keyspace="test") as session:
        with session.transaction().write() as write_transaction:
            # The whole file (definitions and inserts together) goes into one query() call
            write_transaction.query(definitions_and_inserts)
            write_transaction.commit()

But from the console this works:

./grakn console --keyspace test --file tests/foo.gql

Current Workaround

To avoid this, the original .gql file needs to be split into two parts, one with the definitions and one with the inserts, like this:

with open("/path/grakn/grakn-core-all-linux-1.5.0/tests/foo_d.gql") as f:
    definitions = f.read()

with open("/path/grakn/grakn-core-all-linux-1.5.0/tests/foo_i.gql") as f:
    inserts = f.read()

with GraknClient(uri="localhost:48555") as client:
    with client.session(keyspace="test") as session:
        with session.transaction().write() as write_transaction:
            write_transaction.query(definitions)
            # All inserts are still passed as one string here
            write_transaction.query(inserts)
            write_transaction.commit()

Now the definitions are committable, but unfortunately the insert part still fails with exceptions.

So one needs to do the following:

with open("/path/grakn/grakn-core-all-linux-1.5.0/tests/foo_i.gql") as f:
    # One query per blank-line-separated chunk; skip empty chunks and comment-only chunks
    inserts = [e for e in f.read().split("\n\n") if e.strip() and not e.lstrip().startswith("#")]
    # Attention: this relies on blank lines between queries, see below

with GraknClient(uri="localhost:48555") as client:
    with client.session(keyspace="test") as session:
        with session.transaction().write() as write_transaction:
            # definitions is read from foo_d.gql as in the previous snippet
            write_transaction.query(definitions)
            for insert in inserts:
                write_transaction.query(insert)
            write_transaction.commit()

But since some of the inserts in foo_i.gql span multiple lines (match\n..\n..insert\n..\n), one needs to put a blank line between queries so that a simple split at \n\n works with little logic (skip chunks that are empty and chunks that are just a comment, which also cause exceptions). This is painful because single-line inserts then also have to be separated by blank lines.

Example:

insert $n isa NNP, has Lemma "Geldstrafe";

insert $z isa NZP, has Lemma "Jahre", has Eigenschaft "bis zu fünf (<=5)";

match 
    $NO isa NEP, has Lemma "Wer";
    $DO isa NEP, has Lemma "anderer";
    $AO isa NNP, has Lemma "Sache";
    $PO isa NNP, has Lemma "in der Absicht";
insert
    $VP (NO: $NO, DO: $DO, AO: $AO, PO: $PO) isa VP;
    $VP has Lemma "wegnehmen";

match 
    $A1 isa NEP, has Lemma "Wer";
    $A2 isa NEP, has Lemma "Dritter";
insert
    $Alt (Alt_1: $A1, Alt_2: $A2) isa Alternative;
    $Alt has Lemma "oder";

(The blank line between the first two inserts is the painful part, especially if you have a lot of them.)
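
One way to avoid the blank lines would be to split on the leading keyword of each line instead of on \n\n. The following is only a rough sketch of that idea, not a Graql parser: split_queries, STARTERS and MATCH_CLAUSES are made-up names, the keyword list is an assumption, and a quoted string that contains a word such as "get" after a match on the same line would confuse it.

```python
# Heuristic sketch only: keywords assumed to open a new Graql query at the start of a line.
STARTERS = {"define", "undefine", "match", "insert", "compute"}
# Clauses that finish a preceding match instead of opening a new query.
MATCH_CLAUSES = {"insert", "delete", "get"}


def split_queries(text):
    """Split Graql source into query strings without needing blank lines."""
    queries, current = [], []
    pending_match = False  # True while a match still waits for its insert/delete/get
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        tokens = [t.rstrip(";") for t in line.split()]
        keyword = tokens[0]
        if pending_match and keyword in MATCH_CLAUSES:
            # This clause belongs to the match that is currently open.
            pending_match = False
        elif keyword in STARTERS:
            # A new query begins: flush the one collected so far.
            if current:
                queries.append("\n".join(current))
                current = []
            pending_match = keyword == "match" and not any(
                t in MATCH_CLAUSES for t in tokens[1:]
            )
        current.append(line)
    if current:
        queries.append("\n".join(current))
    return queries
```

Each returned string could then be passed to write_transaction.query() in the loop shown earlier.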

Proposed Solution

Since my guess is that you don't want to change the behaviour of write_transaction.query(), it would be great to have a write_transaction.from_file() that takes the path to the original foo.gql and reads it without exceptions, like:

with GraknClient(uri="localhost:48555") as client:
    with client.session(keyspace="test") as session:
        with session.transaction().write() as write_transaction:
            write_transaction.from_file("/path/grakn/grakn-core-all-linux-1.5.0/tests/foo.gql")
            write_transaction.commit()
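
Until something like that exists, the same effect could be approximated with a small free function on the user side. This is a hypothetical sketch, not part of the client: query_from_file and its split parameter are invented names, and the only thing assumed about the transaction object is the query(str) method used above. By default it splits at blank lines exactly like the workaround.

```python
def query_from_file(transaction, path, split=None):
    """Hypothetical helper: run every query found in a .gql file, one by one.

    `transaction` is anything with a query(str) method. `split` is an optional
    function turning the file content into a list of query strings.
    Returns the number of queries sent.
    """
    if split is None:
        # Default mirrors the workaround: chunks separated by blank lines,
        # skipping empty chunks and chunks that start with a comment.
        def split(text):
            return [
                chunk.strip()
                for chunk in text.split("\n\n")
                if chunk.strip() and not chunk.lstrip().startswith("#")
            ]

    with open(path, encoding="utf-8") as f:
        queries = split(f.read())

    for query in queries:
        transaction.query(query)
    return len(queries)
```

Committing is left to the caller, so the function can be used inside the usual with-block followed by write_transaction.commit().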

flyingsilverfin commented 5 years ago

Thanks for the issue and clear explanation :) This is something we definitely want to be able to do in the future, but as of right now it's on our longer term roadmap rather than short term.

The reason you can do this in console is that it uses Java under the hood, which can in turn utilise the full language definition for Graql, including grammar parsing. The method that parses a series of queries is called Graql.parseList(String) and returns our native GraqlQuery types that can be fed directly into client-java.

As it stands, we don't have a full grammar parser in languages other than Java, though it is something we want to do, and would enable exactly the behavior you're looking for.

The shortest way to do it in Python might actually be to call console -f from Python! That way you can load all the static data in one go, and reserve the client for interactive database operations. How feasible is this for you? If you don't have the client on the same machine as the server, you may be able to write a tiny Java program that encapsulates this functionality and also takes a URI.
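
A minimal sketch of calling the console from Python could look like this. It reuses the --keyspace and --file flags from the console command in the original post. The function names and the grakn_home parameter are made up for illustration, and it assumes the grakn launcher script sits directly in that directory.

```python
import os
import subprocess


def console_load_command(grakn_home, keyspace, gql_path):
    """Build the console command; grakn_home is assumed to contain the grakn script."""
    return [
        os.path.join(grakn_home, "grakn"),
        "console",
        "--keyspace", keyspace,
        "--file", gql_path,
    ]


def load_with_console(grakn_home, keyspace, gql_path):
    """Run the console loader and raise CalledProcessError if it exits non-zero."""
    command = console_load_command(grakn_home, keyspace, gql_path)
    # cwd is set so that a relative gql_path resolves against the Grakn directory.
    return subprocess.run(command, cwd=grakn_home, check=True)
```

For the example above this would be load_with_console("/path/grakn/grakn-core-all-linux-1.5.0", "test", "tests/foo.gql").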

kristall commented 5 years ago

@flyingsilverfin Ah, you are right. Since I use an IPython (or Jupyter) notebook anyway, I can just access the shell behind it and call the console from there; that will most likely work. Still, having a function in the Python client should be the long-term solution, so shall we leave this issue open or close it?

flyingsilverfin commented 5 years ago

Let's shut this one and track it in Graql's repository directly. Thanks!