vaticle / typedb-driver

TypeDB Drivers for Rust, Python, Java, Node.js, C, C++, and C#.
https://typedb.com
Apache License 2.0

"gRPC message exceeds maximum size 4194304" and another, less helpful error when submitting long queries #334

Open alexjpwalker opened 3 years ago

alexjpwalker commented 3 years ago

Description

A long TypeQL query causes an error: "gRPC message exceeds maximum size 4194304". If the query gets even longer, the error message becomes less helpful: "Unable to connect to TypeDB server."
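For reference, 4194304 bytes is 4 MiB (4 * 1024 * 1024). The snippet below is a rough sketch of how a client could estimate whether a query string is likely to hit that limit before sending it. The function names are hypothetical and not part of any TypeDB driver, and the real gRPC message also carries protocol overhead, so this only gives a lower bound on the message size.

```python
# 4194304 bytes is 4 MiB, the limit quoted in the error message.
GRPC_MAX_MESSAGE_BYTES = 4 * 1024 * 1024


def query_size_bytes(query: str) -> int:
    """Return the UTF-8 encoded size of a query string in bytes."""
    return len(query.encode("utf-8"))


def likely_exceeds_limit(query: str, limit: int = GRPC_MAX_MESSAGE_BYTES) -> bool:
    """Rough check only: the real gRPC message is larger than the query itself."""
    return query_size_bytes(query) >= limit
```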

Environment

  1. OS (where TypeDB server runs): Mac OS 11
  2. TypeDB version (and platform): TypeDB 2.0

Proposed Solution

We have a couple of options here:

  1. We could simply reconfigure gRPC. Chances are the message size limit is configurable.

  2. A more involved, but perhaps better, solution would be to change the Protocol, making TypeQL query requests streamed. This would introduce a new paradigm into the Clients: client-side streaming. It wouldn't require any re-architecting, because we already use a bidirectional stream. In practice, all a Client needs to do is break the query string into chunks and send them, and we could simply add an eof boolean flag to the Protocol for each Query request. The server then only has to put the chunks back together and execute the query when eof is seen.
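The chunking idea in option 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: `QueryChunk`, `split_query`, `reassemble`, and the 1 MiB default chunk size are all hypothetical and do not exist in the TypeDB Protocol or drivers. Chunks are cut on encoded bytes and decoded only after reassembly, so a multi-byte character split across two chunks is still restored correctly.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class QueryChunk:
    """Hypothetical message shape: a slice of the query plus an eof flag."""
    payload: bytes
    eof: bool


def split_query(query: str, chunk_size: int = 1024 * 1024) -> Iterator[QueryChunk]:
    """Client side: break the encoded query into chunks, flagging the last one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    data = query.encode("utf-8")
    # Offset of the final chunk; an empty query still yields one chunk with eof set.
    last_start = max(len(data) - 1, 0) // chunk_size * chunk_size
    for start in range(0, last_start + 1, chunk_size):
        yield QueryChunk(data[start:start + chunk_size], eof=(start == last_start))


def reassemble(chunks: Iterable[QueryChunk]) -> str:
    """Server side: buffer chunks until eof is seen, then decode the full query."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk.payload)
        if chunk.eof:
            return buffer.decode("utf-8")
    raise ValueError("stream ended before eof was seen")
```

With this shape, `reassemble(split_query(query))` returns the original query string, and each individual chunk stays well below the 4194304-byte message limit.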

Additional Information

This issue was reported by a couple of community members, most recently in the following scenario:

Running a TypeQL Insert query with 23,000 lines that looks like the following:

insert

$fdac46cf-2ffe-4a70-831d-586f3e1adb11 isa lassCorpJetEntry, has registration "02-01863", has aircrafttype "Gulfstream V (C-37A)", has constructionnumber "670", has notes "ex 02-1863";
$e7fecdbc-9a93-4d36-a7ab-0f180f1e0403 isa lassCorpJetEntry, has registration "2-", has aircrafttype "Hawker 800XP", has constructionnumber "258390", has notes "ex N736MB, N850HS, N800FD";
$231c2ef3-6914-4b90-94cc-5a43a79714ac isa lassCorpJetEntry, has registration "2-CAMP", has aircrafttype "Eclipse EA500", has constructionnumber "000011", has notes "ex N96PD, N80NE, N777VE";

FrankUrbach commented 3 years ago

Is this really needed? The boundary you hit with this limit is 4 MB. Is a query of 4 million characters not enough? If that is the case, I feel something in the schema or in the modeling is wrong. Is it really worth doing this? Just my 2 cents.

lolski commented 3 years ago

I share your opinion @FrankUrbach - 4MB is plenty of space for a query string.

However, I would still keep the issue open. Sending a long query and getting an "Unable to connect to TypeDB Server" is still not intuitive.

FrankUrbach commented 3 years ago

The message from gRPC should be passed through. The gRPC client in Julia throws an error when this size is exceeded. Tanmay Mohapatra implemented a check on the size of a single message, because most servers enforce such a limit. So the gRPC client gives feedback about it.

alexjpwalker commented 3 years ago

Yeah I'm also starting to feel like the error message "Unable to connect" isn't serving its intended purpose anymore. We introduced it because it was a more user-friendly error than "Error 13 UNAVAILABLE: No connection established". In that scenario the simpler error message was better. But in many scenarios, it would be more useful to see the original gRPC error, perhaps exposing it as the exception cause.
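A minimal sketch of what "exposing it as the exception cause" could look like in Python, using `raise ... from`. Everything here is a stand-in for illustration: `TypeDBClientError`, `TransportError`, `send`, and `run_query` are hypothetical names, and the transport failure is simulated rather than coming from a real gRPC call.

```python
class TypeDBClientError(Exception):
    """Hypothetical stand-in for the driver's user-facing exception."""


class TransportError(Exception):
    """Hypothetical stand-in for the error raised by the gRPC layer."""


def send(query: str, limit: int = 4194304) -> None:
    """Simulated transport call that fails the way this issue describes."""
    size = len(query.encode("utf-8"))
    if size > limit:
        raise TransportError(f"gRPC message exceeds maximum size {limit}: {size}")


def run_query(query: str) -> None:
    """Raise the friendly error while keeping the original error as its cause."""
    try:
        send(query)
    except TransportError as err:
        # `raise ... from err` sets __cause__, so the traceback shows both errors.
        raise TypeDBClientError("Unable to connect to TypeDB server.") from err
```

A caller that catches `TypeDBClientError` can then inspect `error.__cause__` to see the underlying size-limit message, and an uncaught error prints both in the traceback.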

FrankUrbach commented 3 years ago

Exposing the gRPC error as the cause is the way we do it in Julia. That gives you more control over things that happen outside the TypeDB server or the clients themselves.

arjayvillavicencio commented 2 years ago

My issue is that I have a data.tql with a size of 4.4MB and I am having problems inserting it into my graphdb.

1st Error: data.tql size is 4.4MB - “[CLI04] Client Error: Unable to connect to TypeDB server.”

2nd Error (this one is only a test; I still intend to use the original 4.4MB data.tql): data.tql size is 2.3MB (I cut the original file size in half) - “gRPC message exceeds maximum size 4194304: 4834138”

How can I increase the gRPC limit in TypeDB Console?

alexjpwalker commented 2 years ago

The gRPC message size limit is not (currently) configurable. The preferred approach is to shorten the query by breaking it into pieces, e.g. by using individual insert statements instead of a single mega-insert.
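One way to break a mega-insert into pieces is sketched below. It assumes each statement is independent, i.e. no variable is shared between statements, as in the example query in the issue description. The helper name `batch_inserts` and the 1 MiB default budget are hypothetical, and a single statement larger than the budget is still emitted as its own (oversized) query.

```python
from typing import Iterable, Iterator, List


def batch_inserts(statements: Iterable[str], max_bytes: int = 1024 * 1024) -> Iterator[str]:
    """Group statements into insert queries that each stay within max_bytes."""
    prefix = "insert\n"
    batch: List[str] = []
    size = len(prefix)
    for statement in statements:
        # Normalise so that every statement ends with exactly one ";".
        line = statement.rstrip().rstrip(";") + ";\n"
        line_size = len(line.encode("utf-8"))
        # Start a new query when adding this statement would cross the budget.
        if batch and size + line_size > max_bytes:
            yield prefix + "".join(batch)
            batch, size = [], len(prefix)
        batch.append(line)
        size += line_size
    if batch:
        yield prefix + "".join(batch)
```

Each yielded string is a complete insert query that can be sent on its own, so no single request approaches the 4194304-byte limit.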

izmalk commented 11 months ago

Another reproducer (originally meant to insert 1 million entities, but it crashes even at 100,000):

from typedb.client import TypeDB, SessionType, TransactionType

print("Insert 1 million")

print("Connecting to the server")
with TypeDB.core_client("localhost:1729") as client:  # Connect to TypeDB server
    print("Connecting to the `1kk` database")
    with client.session("1kk", SessionType.DATA) as session:  # Access data in the `1kk` database as session
        print("\nRequest #1: Insert 1kk employees")
        with session.transaction(TransactionType.WRITE) as transaction:  # Open transaction to write
            query = "insert\n"
            for i in range(1, 100_001):  # 100,000 entities
                s = str(i)
                query += "$e" + s + " isa employee, has full-name 'bob" + s + "', has email 'bob" + s + "@vaticle.com';\n"

            response = transaction.query().insert(query)  # Executing query
            transaction.commit()

print("Closing app")