I was just on a call with GS. They are leery about using Daml Script. One reason is they encountered the following situation:
- They run their Daml / Canton system in the cloud.
- They were using Daml Script and, I assume, running it locally or on a server in the cloud.
- The connection from the host running Daml Script to the participant node (PN) broke.
- The broken connection left them in an unrecoverable situation.
A network (socket) connection breaking is a very normal thing. Does Daml Script have any retry mechanism for this? Are there any other gaps in making that connection robust?
A related situation was a JWT expiring in the middle of a script execution. What would happen in this case?
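Whatever Daml Script does when the token expires, a caller could reduce the risk by checking the token's remaining lifetime before starting a long-running script. Below is a minimal Java sketch of that check. It assumes the token is a standard three-part JWT with a numeric `exp` claim. `TokenExpiry`, `expiresAt` and `needsRefresh` are hypothetical names, not part of Daml Script, and the signature is deliberately not verified.

```java
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.time.Instant;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Daml Script API): reads the "exp" claim of a JWT
// so a caller can decide whether to fetch a fresh token before running a script.
final class TokenExpiry {
    // Matches a numeric "exp" claim in the decoded JSON payload.
    private static final Pattern EXP = Pattern.compile("\"exp\"\\s*:\\s*(\\d+)");

    // Returns the expiry instant from the payload. The signature is NOT verified.
    static Instant expiresAt(String jwt) {
        String[] parts = jwt.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("not a JWT");
        }
        byte[] decoded = Base64.getUrlDecoder().decode(parts[1]);
        String payload = new String(decoded, StandardCharsets.UTF_8);
        Matcher m = EXP.matcher(payload);
        if (!m.find()) {
            throw new IllegalArgumentException("no exp claim");
        }
        return Instant.ofEpochSecond(Long.parseLong(m.group(1)));
    }

    // True when the token expires within `margin` of `now` (or has already expired).
    static boolean needsRefresh(String jwt, Instant now, Duration margin) {
        return !now.plus(margin).isBefore(expiresAt(jwt));
    }
}
```

The margin would need to cover the expected run time of the script, since this check only helps before the script starts, not once it is running.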
Netty retry findings
It appears that daml-script's underlying Netty channel, which it uses for gRPC, has no retry logic configured.
The channel builder LedgerClientChannelConfiguration.builderFor sets up TLS and message sizes, but does not set up retries, for example:
    .enableRetry()
    .maxRetryAttempts(10)
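As far as I understand grpc-java, the two builder calls only switch retries on and cap the attempt count. The retry policy itself (which status codes to retry, and with what backoff) comes from a gRPC service config. Below is a minimal Java sketch of such a config, built as a plain map. `RetryServiceConfig` and `build` are hypothetical names, and the backoff values are illustrative assumptions rather than recommendations.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of daml-script): builds a gRPC service config
// map carrying a retry policy that applies to all methods on the channel.
final class RetryServiceConfig {
    static Map<String, Object> build(int maxAttempts) {
        Map<String, Object> retryPolicy = Map.of(
            // grpc-java expects JSON numbers in the service config as Double.
            "maxAttempts", (double) maxAttempts,
            "initialBackoff", "0.5s",
            "maxBackoff", "30s",
            "backoffMultiplier", 2.0,
            // UNAVAILABLE is the status typically seen when the connection drops.
            "retryableStatusCodes", List.of("UNAVAILABLE"));
        Map<String, Object> methodConfig = Map.of(
            // An empty name entry makes this the default for all methods.
            "name", List.of(Map.of()),
            "retryPolicy", retryPolicy);
        return Map.of("methodConfig", List.of(methodConfig));
    }

    public static void main(String[] args) {
        // With grpc-java on the classpath, the map would be passed to the builder:
        //   builder.defaultServiceConfig(RetryServiceConfig.build(5))
        //          .enableRetry()
        //          .maxRetryAttempts(5);
        System.out.println(build(5));
    }
}
```

Note that the gRPC retry design treats `maxAttempts` values above 5 as 5, so asking for 10 attempts would probably not have the intended effect.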
There is also information here on retry directly in gRPC.
I've seen a sentiment online that maxRetryAttempts isn't enough.
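If channel-level retries turn out not to be enough, one fallback is an application-level retry around each ledger call. Below is a minimal Java sketch of retry with exponential backoff. `Retry.withBackoff` is a hypothetical helper, not something daml-script provides. The caller supplies the test for what counts as a transient failure, because that depends on the client library in use.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper (not part of daml-script): retries a call with
// exponential backoff while the failure is classified as transient.
final class Retry {
    private static final long MAX_DELAY_MS = 30_000L;

    static <T> T withBackoff(Supplier<T> call, int maxAttempts, long initialDelayMs,
                             Predicate<RuntimeException> isTransient) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                // Give up on the last attempt or on a non-transient failure.
                if (attempt >= maxAttempts || !isTransient.test(e)) {
                    throw e;
                }
                sleep(delay);
                // Double the delay each time, up to a fixed ceiling.
                delay = Math.min(delay * 2, MAX_DELAY_MS);
            }
        }
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while backing off", ie);
        }
    }
}
```

A wrapper like this is only safe around calls that can be repeated without side effects, or that are deduplicated on the server side. Blindly retrying a command submission could otherwise submit it twice.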
From Curtis, a description of the problem: see the Netty retry findings above.