da-tanabe opened this issue 3 years ago (Open)
Definitely in favor of making this a streaming endpoint/adding a streaming endpoint to preserve backwards compatibility.
Adding a streaming endpoint is definitely an option, and would match the event-streaming nature of our system. An alternative to consider is to expose views onto the state of the system using the resource oriented design of Google's Cloud APIs. Listing collections is paged there as per these design guidelines: https://google.aip.dev/132
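To make the AIP-132 option concrete, a paged variant of this endpoint might look roughly like the following. This is purely an illustrative sketch following the AIP field conventions (`page_size`, `page_token`, `next_page_token`); none of these fields exist in the actual proto:

```protobuf
// Hypothetical AIP-132-style pagination for ListKnownParties (sketch only).
message ListKnownPartiesRequest {
  // Maximum number of parties to return; the server may return fewer.
  int32 page_size = 1;
  // Opaque token from a previous response; empty requests the first page.
  string page_token = 2;
}

message ListKnownPartiesResponse {
  repeated PartyDetails party_details = 1;
  // Empty when there are no further pages.
  string next_page_token = 2;
}
```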
@da-tanabe: thanks for reporting this. What is the priority on a fix for this?
(FYI: @gerolf-da @mziolekda )
We already do “pagination” on the ACS endpoint by using streaming there. Using something else for the parties endpoints doesn’t seem great from a user pov.
@meiersi-da it's a simple thing to work around (we just keep cranking up the limit), but it's definitely not ideal.
I agree with @cocreature that having multiple patterns in play wouldn't be great. I'd be in favor of extending the patterns of the current APIs for now, just so the immediate issue is dealt with in a stable way, and moving the larger discussion under the context of new administrative APIs.
I'll also note that the same problems would occur on the current package endpoints, though I think those are going to be much harder to hit in practice.
OK. Extending the current pattern makes sense.
@bame-da do you consider this a bug or a feature request? i.e., who owns prioritization for this?
I did a bit of digging here and, if working around this in the Scala client, I believe the right place to patch is here: https://github.com/digital-asset/daml/blob/11e5dd37e730156b97748d67914b5e3f14a48625/ledger/ledger-api-client/src/main/scala/com/digitalasset/ledger/client/services/admin/PartyManagementClient.scala#L54
If you supply the stub constructor (a subclass of https://grpc.github.io/grpc-java/javadoc/io/grpc/stub/AbstractStub.html#AbstractStub-io.grpc.Channel-) with a managed channel whose maximum inbound message size has been raised via https://grpc.github.io/grpc-java/javadoc/io/grpc/ManagedChannelBuilder.html#maxInboundMessageSize-int-, it's possible to work around this by increasing the arbitrary self-imposed limit (4194304 bytes) to a larger one; this is a strategy we've taken with other non-Scala components that access the ledger. But I agree with @da-tanabe that a better strategic fix here would be to model the retrieval operation as a stream somehow.
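For anyone hitting this before a proper fix lands, here is a minimal sketch of that workaround in grpc-java. Hedged assumptions: grpc-java is on the classpath, host/port and the 64 MiB figure are placeholders, and plaintext transport is used only to keep the sketch short:

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// Sketch: build a channel whose inbound message size limit is raised, then
// pass it to the PartyManagementService stub constructor mentioned above.
public final class LargeResponseChannel {
  public static ManagedChannel build(String host, int port) {
    return ManagedChannelBuilder.forAddress(host, port)
        // Default inbound limit is 4194304 bytes (4 MiB); raise to 64 MiB.
        // Pick a value based on how many parties you realistically expect.
        .maxInboundMessageSize(64 * 1024 * 1024)
        .usePlaintext() // assumption: no TLS in this sketch
        .build();
  }
}
```

The resulting channel can then be handed to the generated stub; all responses up to the new limit will then decode instead of failing.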
@da-tanabe : would a solution that added pagination support following Google's API styleguide work for you https://cloud.google.com/apis/design/design_patterns#list_pagination?
@meiersi-da that would be fine, albeit inconsistent with the behavior of `ActiveContractSetService`, `CommandCompletionService`, `TransactionService`, and `TransactionTreeService`. I have a slight preference for keeping these endpoints consistent with each other, simply because I don't see a compelling reason to treat them differently. The design pattern above supports static fetching, but is less suited to keeping interested subscribers up to date over time.
I believe there were also discussions on using `offset` in party and package streaming endpoints, though I can't seem to find a link; I'd be in favor of that to allow clients to more cheaply keep a notion of the universe of parties (and packages) up to date, especially to give these endpoints room to evolve different types of events.
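To make the offset idea concrete, an offset-carrying party stream could look something like the sketch below. Every name here is invented for illustration; nothing like this exists in the current API:

```protobuf
// Hypothetical: stream party changes from a given ledger offset onward.
message PartyEventsRequest {
  // Resume from this offset; empty means from the beginning of the ledger.
  string begin_offset = 1;
}

message PartyEventsResponse {
  // Offset at which this event occurred; clients persist it to resume later.
  string offset = 1;
  // Room to evolve different event types over time, as mentioned above.
  oneof event {
    PartyDetails party_added = 2;
    PartyDetails party_updated = 3;
  }
}
```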
Thanks for the detailed feedback @da-tanabe . You make a good case for adding streaming versions of these endpoints.
As I wrote in #12738, I don't think that breaking the API / streaming is the first step to solve this. Taking the `ActiveContractService` as an example, we do support a filter there.
My suggestion was something along the following lines:
```protobuf
message ListKnownPartiesRequest {
  PartyFilter filter = 1;
  int32 limit = 2; // optional field
}
```
Here, the `filter` would be equivalent to the `TransactionFilter` that we use in the `ActiveContractService`.
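One possible shape for such a `PartyFilter`, purely as a strawman (these fields are invented for illustration; the discussion has not settled on any):

```protobuf
// Strawman PartyFilter; all fields are illustrative only.
message PartyFilter {
  // Exact party identifiers to include; empty means no restriction.
  repeated string parties = 1;
  // Match parties whose identifier starts with this prefix,
  // e.g. a "<name>::" convention as used later in this thread.
  string identifier_prefix = 2;
  // Match on display name.
  string display_name = 3;
}
```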
Second, I would suggest adding an optional limit. In some cases you really do want to fetch everything. In most cases, though, you expect the data set you read to be limited, and you want to indicate this with an explicit limit rather than wait until a giant piece of data has been read from a database and sent over the wire to you.
Also, I would say that filter & streaming are not mutually exclusive features. We can fix the immediate problem by adding a non-breaking filter and then we can later on make it even more powerful by adding an additional streaming API.
Hi @daravep, thanks for the suggestions. I understand the semantics of the `limit` field that you're proposing, and believe it has low implementation and maintenance cost. How would a pragmatic `PartyFilter` work that would solve your use cases, and what are those use cases?
I have a hunch that there is a pragmatic change that we can make to this RPC so that it is no longer completely broken for larger sets of parties.
(FYI: @gerolf-da @resmela-da @mziolekda )
@meiersi-da the concrete case I am currently struggling with in our scale test is the following: on startup, the application checks whether its party already exists; if not, it allocates it. For this, I have to query the display names, or just query by convention for `<name>::`.
So the whole thing currently runs without any degradation on the processing side, but the problem shows up when I restart the run (we can restart our performance runs, which is good for testing the restart capability of the system at scale).
However, I do think it is generally a good idea, on any function invocation, to be able to hint at the expected data set size so that you do not get an overflow or otherwise waste computing resources.
> @meiersi-da the concrete case I am currently struggling with in our scale test is the following: on startup, the application checks whether its party already exists; if not, it allocates it. For this, I have to query the display names, or just query by convention for `<name>::`.
For this check, why are you not using the `GetParties` request, given that you know which party you are looking for?
> So the whole thing currently runs without any degradation on the processing side, but the problem shows up when I restart the run (we can restart our performance runs, which is good for testing the restart capability of the system at scale).
Good call. Would `GetParties` solve your concrete problem?
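For context, the point-lookup shape of `GetParties` is roughly the following (reproduced from memory of `party_management_service.proto`; check the proto linked below for the authoritative definition):

```protobuf
message GetPartiesRequest {
  // The parties to look up; unknown parties are simply absent
  // from the response rather than causing an error.
  repeated string parties = 1;
}

message GetPartiesResponse {
  repeated PartyDetails party_details = 1;
}
```

Because the request names the parties explicitly, the response size is bounded by what the client asked for, which sidesteps the message-size problem entirely.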
> However, I do think it is generally a good idea, on any function invocation, to be able to hint at the expected data set size so that you do not get an overflow or otherwise waste computing resources.
Yes, I agree with that. It's more of a question of when and what.
@meiersi-da I don't know the party up front. When I allocate the party, I allocate it with a hint, but then I get a party identifier back. The identifier contains the key fingerprint of the node, which is not accessible via the Ledger API. Of course, I could add persistence to the app, store that party identifier, and never hit the list-parties endpoint. However, instead of working around issues, I was looking for a proper way to fix the issue.
Thanks for the explanation. In that case, I'd recommend using the `UserManagementService` to store the stable-identifier-to-party mapping. I'm thinking about the following steps:

1. Create a user along the lines of `User { userId: "your-fixed-id" primary_party: "<pid>" }` when allocating the party.
2. On startup, call `GetUser { userId: "your-fixed-id" }` to get hold of the primary party for the user of interest.
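Sketched against the `UserManagementService` protos, the two steps would look roughly like the fragment below (field names from memory; verify against `user_management_service.proto` before relying on them):

```protobuf
// Step 1: persist the mapping when allocating the party.
// rpc CreateUser, with a request body along the lines of (textproto):
//   user { id: "your-fixed-id" primary_party: "<pid>" }
//
// Step 2: on startup, recover the party without listing all parties.
// rpc GetUser, with (textproto):
//   user_id: "your-fixed-id"
```

This keeps the stable identifier on the ledger side, so the app needs no local persistence and never has to enumerate the full party set.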
A large number of parties onboarded onto any ledger will eventually cause clients that list the universe of parties to receive this lovely error (with the max size of course highly dependent on the number of parties):
The right solution is probably to change the `ListKnownParties` call to return a `stream` of `ListKnownPartiesResponse` instead of a single object, and to chunk up the responses: https://github.com/digital-asset/daml/blob/11e5dd37e730156b97748d67914b5e3f14a48625/ledger-api/grpc-definitions/com/daml/ledger/api/v1/admin/party_management_service.proto#L38-L44
With the current API design, this endpoint will only ever return more data over time, and raising this limit globally and forever doesn't seem like a good idea.
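Concretely, only the rpc signature would need to change; a sketch, keeping the request/response messages exactly as in the proto linked above:

```protobuf
service PartyManagementService {
  // Sketch: same request/response messages as today, but the parties are
  // chunked across a server stream instead of one potentially huge message.
  rpc ListKnownParties (ListKnownPartiesRequest)
      returns (stream ListKnownPartiesResponse);
}
```

Note that switching a unary method to server-streaming breaks existing generated clients, which is why adding this as a separate endpoint (as discussed at the top of the thread) preserves backwards compatibility.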