weaviate / typescript-client

Official Weaviate TypeScript Client
https://www.npmjs.com/package/weaviate-client
BSD 3-Clause "New" or "Revised" License

gRPC: Receiving RST_STREAM on insertMany #154

Open riepspellz opened 6 days ago

riepspellz commented 6 days ago

Hello,

I'm currently migrating the client from v2 to v3, and when I try to batch insert a large number of documents (~1000), I get this error:

WeaviateBatchError: Batch objects insert failed with message: /weaviate.v1.Weaviate/BatchObjects INTERNAL: Received RST_STREAM with code 0 (Call ended without gRPC status)
      at /Users/me/app/node_modules/weaviate-client/dist/node/esm/grpc/batcher.js:42:17
      at processTicksAndRejections (native:1:1)

Batch objects insert failed with message: /weaviate.v1.Weaviate/BatchObjects INTERNAL: Received RST_STREAM with code 0 (Call ended without gRPC status)
37 |             metadata: this.metadata,
38 |             signal,
39 |           }
40 |         )
41 |         .catch((err) => {
42 |           throw new WeaviateBatchError(err.message);
                     ^
error: WeaviateBatchError: Batch objects insert failed with message: /weaviate.v1.Weaviate/BatchObjects INTERNAL: Received RST_STREAM with code 0 (Call ended without gRPC status)
      at /Users/me/app/node_modules/weaviate-client/dist/node/esm/grpc/batcher.js:42:17
      at processTicksAndRejections (native:1:1)

This error doesn't happen every time. It tends to occur more often when I try to insert larger documents, but the same insert can either be successful or lead to a crash.
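Since the failure seems to correlate with payload size, one possible workaround (not a confirmed fix) is to send the documents in smaller batches and retry a batch that fails. The sketch below is a minimal illustration under assumptions: `chunk`, `insertInChunks`, and `InsertFn` are hypothetical names, the batch size, attempt count, and delay are arbitrary, and the actual insert call is passed in as a function so the example does not depend on the client library.

```typescript
// Hypothetical stand-in for a call such as insertMany on a collection;
// injected so this sketch runs without the client library.
type InsertFn<T> = (batch: T[]) => Promise<unknown>;

// Split items into consecutive batches of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Insert batch by batch, retrying a failed batch with exponential backoff.
// Rethrows the last error once `maxAttempts` is exhausted.
// Returns the number of items whose batch was accepted.
async function insertInChunks<T>(
  items: T[],
  insert: InsertFn<T>,
  batchSize = 100, // assumed value; tune to document size
  maxAttempts = 3, // assumed value
  baseDelayMs = 200, // assumed value
): Promise<number> {
  let inserted = 0;
  for (const batch of chunk(items, batchSize)) {
    for (let attempt = 1; ; attempt++) {
      try {
        await insert(batch);
        inserted += batch.length;
        break;
      } catch (err) {
        if (attempt >= maxAttempts) throw err;
        await sleep(baseDelayMs * Math.pow(2, attempt - 1));
      }
    }
  }
  return inserted;
}
```

With the v3 client, `insert` could be something like `(batch) => collection.data.insertMany(batch)`, where `collection` is assumed to be an already-created collection handle. Retrying re-sends the whole batch, so whether this is safe depends on how the objects are identified; that is worth checking before relying on it.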

I'm running locally, with Weaviate 1.25.3 in a Docker container and weaviate-client v3.0.8 in an ESM Express app powered by Bun, and I bring my own vectors.

Here's my configuration:

const WeaviateClient = await weaviate.connectToCustom({
    // HTTP
    httpHost: config.weaviateURI,
    httpPort: 5000,
    httpSecure: false,

    // GRPC
    grpcHost: config.weaviateURI,
    grpcPort: 50051,
    grpcSecure: false,
});

Here are the logs I get from the Docker container after a failed request (LOG_LEVEL is set to "debug"):

{"action":"restapi_request","level":"debug","method":"GET","msg":"received HTTP request","time":"2024-06-25T13:34:49Z","url":{"Scheme":"","Opaque":"","User":null,"Host":"","Path":"/v1/schema","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}}
{"level":"debug","msg":"server.query","time":"2024-06-25T13:34:49Z","type":2}
{"action":"lsm_init_disk_segment_build_bloom_filter_primary","class":"my_collection","index":"my_collection","level":"debug","msg":"building bloom filter took 208.625µs\n","path":"/var/lib/weaviate/my_collection/pN5G1xK1OOMa/lsm/property_vector/segment-1719322474400489347.db","shard":"pN5G1xK1OOMa","time":"2024-06-25T13:35:00Z","took":208625}
{"action":"lsm_memtable_flush_complete","class":"my_collection","index":"my_collection","level":"debug","msg":"flush and switch took 1.420272459s\n","path":"/var/lib/weaviate/my_collection/pN5G1xK1OOMa/lsm/property_vector","shard":"pN5G1xK1OOMa","time":"2024-06-25T13:35:00Z","took":1420272459}

riepspellz commented 5 days ago

Also, for a reason I couldn't figure out, insertMany sometimes doesn't work at all: I never get a response from Weaviate, and the call hangs until the socket hangs up.

I couldn't reproduce it reliably, but I think it happens after I create a schema for the first time and the crash described above occurs: the next insertMany then hangs. When that happens, a single insert fixes it, and insertMany works again afterward.
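To make the hanging case easier to detect and log, the call can be wrapped in a timeout so the app fails fast instead of waiting for the socket to hang up. This is only a sketch: `withTimeout` is a hypothetical helper, not part of weaviate-client, and it only stops waiting; it does not cancel the underlying request.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
// The underlying operation keeps running; only the wait is abandoned.
function withTimeout<T>(
  work: Promise<T>,
  ms: number,
  label = "operation",
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
  });
  // Whichever settles first wins; the timer is cleared on both paths.
  return Promise.race([work, timeout]).then(
    (value) => {
      clearTimeout(timer);
      return value;
    },
    (err) => {
      clearTimeout(timer);
      throw err;
    },
  );
}
```

For example, `await withTimeout(collection.data.insertMany(objects), 30000, "insertMany")` would reject after 30 seconds, assuming `collection` and `objects` already exist in the calling code. The caller could then log the timeout and try the single insert described above before retrying.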