ListQueues can lead to a gRPC error when the number of queues is large
java -jar ./target/urlfrontier-client*.jar ListQueues
io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: gRPC message exceeds maximum size 4194304: 10945786
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:262)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:243)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:156)
at crawlercommons.urlfrontier.URLFrontierGrpc$URLFrontierBlockingStub.listQueues(URLFrontierGrpc.java:604)
at crawlercommons.urlfrontier.client.ListQueues.run(ListQueues.java:52)
at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
at picocli.CommandLine.execute(CommandLine.java:2078)
at crawlercommons.urlfrontier.client.Client.main(Client.java:40)
Jun 23, 2021 12:58:07 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Limiting the size of the returned list, e.g. via
java -jar ./target/urlfrontier-client*.jar ListQueues -n 1000
avoids the exception. However, results appear to be returned in a consistent order, and -n only caps the number of items taken from the start of the list, so a client has no easy way to obtain the tail of the list.
We should paginate the results and return richer output that includes the total number of queues, the start and end offsets of the returned page, etc.
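
To make the proposal concrete, here is a minimal sketch of offset-based pagination over a consistently ordered list of queue keys. It is not URLFrontier code: QueuePage, QueuePager, the page() method and its start/size parameters are all hypothetical names chosen for illustration, and the real change would also need matching fields in the gRPC request and response messages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical result type: one page of queue keys plus the metadata
// suggested above (total number of queues, start and end offsets).
final class QueuePage {
    final int total;          // total number of queues on the server
    final int start;          // offset of the first returned key (inclusive)
    final int end;            // offset just past the last returned key (exclusive)
    final List<String> keys;  // the keys in [start, end)

    QueuePage(int total, int start, int end, List<String> keys) {
        this.total = total;
        this.start = start;
        this.end = end;
        this.keys = keys;
    }
}

// Hypothetical helper, assuming the keys are held in a stable order.
final class QueuePager {
    static QueuePage page(List<String> allKeys, int start, int size) {
        if (start < 0 || size < 0) {
            throw new IllegalArgumentException("start and size must be >= 0");
        }
        int total = allKeys.size();
        // Clamp both bounds so an offset past the end yields an empty page.
        int from = Math.min(start, total);
        int to = (int) Math.min((long) from + size, total);
        return new QueuePage(total, from, to, new ArrayList<>(allKeys.subList(from, to)));
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            keys.add("queue-" + i);
        }
        // A client walks the whole list by feeding each page's end offset
        // back in as the next start offset.
        int start = 0;
        QueuePage p;
        do {
            p = page(keys, start, 1000);
            System.out.println(p.start + "-" + p.end + " of " + p.total);
            start = p.end;
        } while (p.end < p.total);
    }
}
```

With a bounded page size each response stays well under the 4194304-byte limit reported in the trace, and the tail of the list becomes reachable by requesting a start offset near the total.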