filecoin-project / specs-actors

DEPRECATED Specification of builtin actors, in the form of executable code.

[Deal Making Issue] Single bad deal during publish batch deals task ruins all batched deals #1466

Open stuberman opened 3 years ago

stuberman commented 3 years ago

Basic Information Any bad deals should not cause all other deals during batch publish to fail. Simply skip and error the bad deal(s).

Describe the problem

Received four online deals which transferred successfully to the miner, but during batch publish one deal failed and all four deals were errored out.

Version

lotus-miner version 1.10.0+mainnet+git.764fa9dae

Setup

To Reproduce

Steps to reproduce the behavior:

  1. Accept online deals and use the publish batch feature: lotus-miner storage-deals pending-publish --publish-now
  2. See error

Deal status

Jun 28 03:47:52 true bafyreie33qicnkju72vz27fe4lybkxuh3xuqstschi3xq22i2zdrizxihy 0 StorageDealError f3rmy6c7zoefq3bb4thhvmv35ihe4t35hyaivoxdrzzmtt5zx6hrdp4ltgzccndggynjkc3joxxok3quv5mg2a 32GiB 0 FIL 1458616 12D3KooWQrktQtwVbMBoeoN6te6i26s1gQPVfTYVgFs1U4ePmpys-12D3KooWQvdhZSSArWTSBiyjVZa3mAmzoLHXV2QS9nFoaGLv3fL7-1624609546043369505 error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)

Jun 28 08:02:58 true bafyreif4s5htqde4j2rixmzmsk6bn3mseuwd62wyrru2y4v42lkj4orxtu 0 StorageDealError f3rmy6c7zoefq3bb4thhvmv35ihe4t35hyaivoxdrzzmtt5zx6hrdp4ltgzccndggynjkc3joxxok3quv5mg2a 32GiB 0 FIL 1458106 12D3KooWQrktQtwVbMBoeoN6te6i26s1gQPVfTYVgFs1U4ePmpys-12D3KooWQvdhZSSArWTSBiyjVZa3mAmzoLHXV2QS9nFoaGLv3fL7-1624609546043369515 error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)

Jun 28 09:01:48 true bafyreiewg5yn4id76xry36sy33o2hljhlcewudv6hbwk6plsozu2pug42e 0 StorageDealError f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q 16GiB 0 FIL 1063427 12D3KooWHRm8wKqXdBKDS4QkJmdUPNTqRSFRp7ZHg8MphRWbZXCs-12D3KooWQvdhZSSArWTSBiyjVZa3mAmzoLHXV2QS9nFoaGLv3fL7-1624789013145171951 error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)

Jun 28 12:24:06 true bafyreicvrblu3js7egclarj7xefb2bi2dnil4rixghkuq645nswtwz4ope 0 StorageDealError f3rmy6c7zoefq3bb4thhvmv35ihe4t35hyaivoxdrzzmtt5zx6hrdp4ltgzccndggynjkc3joxxok3quv5mg2a 32GiB 0 FIL 1457584 12D3KooWQrktQtwVbMBoeoN6te6i26s1gQPVfTYVgFs1U4ePmpys-12D3KooWQvdhZSSArWTSBiyjVZa3mAmzoLHXV2QS9nFoaGLv3fL7-1624609546043369525 error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)

Lotus daemon and miner logs

2021-06-28T13:36:19.439Z INFO storageadapter storageadapter/dealpublisher.go:155 force publishing deals
2021-06-28T13:36:19.442Z INFO storageadapter storageadapter/dealpublisher.go:322 publishing 4 deals in publish deals queue with piece CIDs: baga6ea4seaqd5sqzr5kul7rj6gqt3sp7nnlcgesrekt3r7bilvptsulxlsdxsky, baga6ea4seaqpc47lomegohzmetbrweagqbqpccewznolhnedtc2hezukpacemfa, baga6ea4seaqlwa2h5eanvkgmm3qngab3db5lmgtd3rmncffaasmdxl54ajnyalq, baga6ea4seaqofktmp43s4r4nl4ulh5ykje4gl5yf7vct4ofuksvxkwxahf46aei
2021-06-28T13:36:19.450Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventNodeErrored", "proposal CID": "bafyreicvrblu3js7egclarj7xefb2bi2dnil4rixghkuq645nswtwz4ope", "state": "StorageDealFailing", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.450Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventNodeErrored", "proposal CID": "bafyreie33qicnkju72vz27fe4lybkxuh3xuqstschi3xq22i2zdrizxihy", "state": "StorageDealFailing", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.450Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventNodeErrored", "proposal CID": "bafyreif4s5htqde4j2rixmzmsk6bn3mseuwd62wyrru2y4v42lkj4orxtu", "state": "StorageDealFailing", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.450Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventNodeErrored", "proposal CID": "bafyreiewg5yn4id76xry36sy33o2hljhlcewudv6hbwk6plsozu2pug42e", "state": "StorageDealFailing", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.463Z WARN providerstates providerstates/provider_states.go:536 deal bafyreicvrblu3js7egclarj7xefb2bi2dnil4rixghkuq645nswtwz4ope failed: error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)
2021-06-28T13:36:19.475Z WARN providerstates providerstates/provider_states.go:536 deal bafyreiewg5yn4id76xry36sy33o2hljhlcewudv6hbwk6plsozu2pug42e failed: error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)
2021-06-28T13:36:19.476Z WARN providerstates providerstates/provider_states.go:536 deal bafyreif4s5htqde4j2rixmzmsk6bn3mseuwd62wyrru2y4v42lkj4orxtu failed: error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)
2021-06-28T13:36:19.476Z WARN providerstates providerstates/provider_states.go:536 deal bafyreie33qicnkju72vz27fe4lybkxuh3xuqstschi3xq22i2zdrizxihy failed: error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)
2021-06-28T13:36:19.534Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventFundsReleased", "proposal CID": "bafyreicvrblu3js7egclarj7xefb2bi2dnil4rixghkuq645nswtwz4ope", "state": "StorageDealFailing", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.547Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventFailed", "proposal CID": "bafyreicvrblu3js7egclarj7xefb2bi2dnil4rixghkuq645nswtwz4ope", "state": "StorageDealError", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.620Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventFundsReleased", "proposal CID": "bafyreiewg5yn4id76xry36sy33o2hljhlcewudv6hbwk6plsozu2pug42e", "state": "StorageDealFailing", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.624Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventFailed", "proposal CID": "bafyreiewg5yn4id76xry36sy33o2hljhlcewudv6hbwk6plsozu2pug42e", "state": "StorageDealError", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.718Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventFundsReleased", "proposal CID": "bafyreif4s5htqde4j2rixmzmsk6bn3mseuwd62wyrru2y4v42lkj4orxtu", "state": "StorageDealFailing", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.724Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventFailed", "proposal CID": "bafyreif4s5htqde4j2rixmzmsk6bn3mseuwd62wyrru2y4v42lkj4orxtu", "state": "StorageDealError", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.853Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventFundsReleased", "proposal CID": "bafyreie33qicnkju72vz27fe4lybkxuh3xuqstschi3xq22i2zdrizxihy", "state": "StorageDealFailing", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}
2021-06-28T13:36:19.866Z INFO markets loggers/loggers.go:20 storage provider event {"name": "ProviderEventFailed", "proposal CID": "bafyreie33qicnkju72vz27fe4lybkxuh3xuqstschi3xq22i2zdrizxihy", "state": "StorageDealError", "message": "error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)"}

Code modifications

None

f8-ptrk commented 3 years ago

@stuberman the message execution fails if one deal in the batch fails. if we let one deal fail and all others succeed - what's the status of the message: success or failed? we only have binary message outcomes available. the lotus-miner depends on the return status of the message to detect whether the deal/s failed to execute on chain, and acts on that lotus-miner side

stuberman commented 3 years ago

That is a design problem. My suggestion is that when a deal fails it does not get included in the message and that the rest of the deals are published.

The bigger principle when batching messages is not to sacrifice all of the deals/precommits/commits when a single deal/etc. fails for some reason. Better to orphan a bad deal than ruin many good deals. I recently lost 8 deals when one small deal was claimed to be verified but turned out not to be verified during publish message processing.

Another alternative to changing the publish message process is to do more checking of deals/etc prior to acceptance so the failure would not occur so late in the process.

f8-ptrk commented 3 years ago

basically a "if one then all" success for deals in the publish message and a lotus-miner side check with the market actor what deals are actually on chain to not process deals that failed

[edit]

all the checks miner side are "worthless" in the end - nothing needs to be locked or committed until the deals are published. assuming bad clients. sure, checking makes sense, but in the end f04 ? (5?) is the ultimate source of truth if we cannot rely on the message return

stuberman commented 3 years ago

Discard bad deals. Either precheck deals prior to processing into batch publish, or when performing batch publish simply discard deals from the batch publish message. The system must be more resilient to failures due to bad deals.
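The "precheck, then drop the bad deals from the batch" idea above can be sketched as follows. This is a minimal illustration, not Lotus code: `Deal`, `SimulateFn`, `SplitBatch`, and `DatacapCheck` are hypothetical names, and the datacap rule is a toy stand-in for whatever dry run the node can perform against current chain state.

```go
package main

import (
	"errors"
	"fmt"
)

// Deal is a hypothetical, simplified stand-in for a signed deal proposal.
type Deal struct {
	ProposalCID string
	Client      string
	Verified    bool
	PieceSize   uint64 // bytes
}

// SimulateFn is a hypothetical dry-run hook. It returns an error if the deal
// would be rejected against the chain state visible right now.
type SimulateFn func(d Deal) error

// SplitBatch dry-runs every deal on its own and separates the deals that pass
// from the deals that fail, keyed by proposal CID.
func SplitBatch(deals []Deal, simulate SimulateFn) ([]Deal, map[string]error) {
	var good []Deal
	bad := make(map[string]error)
	for _, d := range deals {
		if err := simulate(d); err != nil {
			bad[d.ProposalCID] = err
			continue
		}
		good = append(good, d)
	}
	return good, bad
}

// DatacapCheck builds a toy SimulateFn: a verified deal fails when the client
// holds less datacap than the piece size. Real validation covers far more.
func DatacapCheck(datacap map[string]uint64) SimulateFn {
	return func(d Deal) error {
		if d.Verified && datacap[d.Client] < d.PieceSize {
			return errors.New("failed to add verified deal for client: " + d.Client)
		}
		return nil
	}
}

func main() {
	deals := []Deal{
		{ProposalCID: "deal-1", Client: "clientA", Verified: true, PieceSize: 32 << 30},
		{ProposalCID: "deal-2", Client: "clientB", Verified: true, PieceSize: 16 << 30},
		{ProposalCID: "deal-3", Client: "clientA", Verified: false, PieceSize: 32 << 30},
	}
	good, bad := SplitBatch(deals, DatacapCheck(map[string]uint64{"clientA": 64 << 30}))
	fmt.Printf("publish %d deals, drop %d\n", len(good), len(bad))
	for cid, err := range bad {
		fmt.Printf("  %s: %v\n", cid, err)
	}
}
```

Each deal is checked in isolation here, so the sketch does not catch two deals from one client that only fit the datacap individually, and a deal that passes the dry run can still fail when the message actually executes.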

f8-ptrk commented 3 years ago

a deal is bad exactly when the message execution fails - before that it's shaking the magic 8ball. and a message cannot be altered after being signed. these two facts create a problem

[edit]

sure, publishing a few false positives can be changed to pre-sorting with a few false negatives (regarding bad deals) - do we have data on how often this actually happens? it's an attack vector on a miner's resources that is not insignificant

f8-ptrk commented 3 years ago

what i can imagine is putting the costs of a failed publish onto the client that signed the proposal. if a client signs the proposal we can expect him to honor his signature and if he doesn't he should be responsible for the costs a miner has to endure due to his failure to honor the signed proposal

a 2 step on chain deal process instead of a single publish

[edit]

a great way to put the deal publishing costs on the client btw. :)

stuberman commented 3 years ago

The deal is bad when the client offered a deal that claimed to be verified and was not (or was unable to maintain integrity later). The deal should have been stopped at the client. But the miner could have checked the deal prior to acceptance, but did not.

The problem is the design of the work flow allowing bad deals to be accepted deeper into the process and not having a means to drop them during the process.

The design needs to serve the stakeholders' needs. Use a real world analogy of buying a house with a mortgage. The checks are performed up front. When a servicing company buys a group of mortgages, they don't all get nullified and discarded due to a problem with one uncovered later in the process.

f8-ptrk commented 3 years ago

The deal is bad when the client offered a deal that claimed to be verified and was not (or was unable to maintain integrity later). The deal should have been stopped at the client. But the miner could have checked the deal prior to acceptance, but did not.

The problem is the design of the work flow allowing bad deals to be accepted deeper into the process and not having a means to drop them during the process.

The design needs to serve the stakeholders' needs. Use a real world analogy of buying a house with a mortgage. The checks are performed up front. When a servicing company buys a group of mortgages, they don't all get nullified and discarded due to a problem with one uncovered later in the process.

but that's all off chain logic. that's not changing the fact that if a client wants you to fail the publish you will fail the publish, just by passing you random bad deals that are crafted to go through lotus but fail on chain

[edit]

a proposal is a cheque and it will fail exactly when you try to hand it in at the bank. no matter what you do before that point.

f8-ptrk commented 3 years ago

the failing is an on chain problem we will not solve with off chain measures

stuberman commented 3 years ago

That is my point. The code and design need to take into account issues prior to being put on the chain. Don't put bad messages on the chain. Don't process bad deals. Validate checks prior to depositing them. Reserve funds as part of the system. If the design is flawed, fix the design.

f8-ptrk commented 3 years ago

the problem is: as soon as i see the publish message in the mpool i can make the deal bad. no matter what you do beforehand.

if my market actor withdraw gets executed before your publish in the same block your deal goes bad

[edit]

or i propose the same deal X times but only have resources to do it once. then X miners will go all pre-checks without problems and as soon as one miner publishes successfully X-1 miners probably get f'ed
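The race described above (a check passes off chain, then state changes before the publish executes) can be illustrated with a toy model. Everything here is hypothetical: `chainState`, `precheck`, `publish`, and `withdraw` are made-up names, and a single escrow balance per client stands in for the real market actor state.

```go
package main

import "fmt"

// chainState is a hypothetical, minimal market state: escrow per client.
type chainState struct {
	escrow map[string]int64
}

// deal is a toy proposal that needs `cost` of the client's escrow.
type deal struct {
	client string
	cost   int64
}

// precheck reports whether every deal is covered by escrow right now.
func (s *chainState) precheck(deals []deal) bool {
	need := map[string]int64{}
	for _, d := range deals {
		need[d.client] += d.cost
		if need[d.client] > s.escrow[d.client] {
			return false
		}
	}
	return true
}

// publish models all-or-nothing message execution: if any deal is not
// covered at execution time, no deal in the batch is applied.
func (s *chainState) publish(deals []deal) error {
	if !s.precheck(deals) {
		return fmt.Errorf("publish aborted: insufficient client funds at execution time")
	}
	for _, d := range deals {
		s.escrow[d.client] -= d.cost
	}
	return nil
}

// withdraw removes up to `amount` from a client's escrow.
func (s *chainState) withdraw(client string, amount int64) {
	if amount > s.escrow[client] {
		amount = s.escrow[client]
	}
	s.escrow[client] -= amount
}

func main() {
	state := &chainState{escrow: map[string]int64{"alice": 10, "bob": 10}}
	batch := []deal{{client: "alice", cost: 5}, {client: "bob", cost: 5}}

	fmt.Println("precheck before sending:", state.precheck(batch))

	// bob's withdraw is ordered ahead of the publish in the same block.
	state.withdraw("bob", 10)

	err := state.publish(batch)
	fmt.Println("publish error:", err)
	fmt.Println("alice escrow untouched:", state.escrow["alice"])
}
```

The precheck and the execution run the same logic against different states, which is why no amount of off-chain checking closes the gap in this model.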

f8-ptrk commented 3 years ago

i am totally in line with "do more pre publish checking" lotus-miner side when dealing with deals. but ultimately it will not solve the problem that the client signed a proposal and doesn't need to honor it, without getting punished for not doing so

f8-ptrk commented 3 years ago

If the design is flawed, fix the design.

it is and it needs to be fixed

but it needs to be fixed "on chain" - everything else is just covering for the design flaw

stuberman commented 3 years ago

as soon as i see the publish message in the mpool i can make the deal bad.

That is a flawed design. What is far more flawed as a system design is that a bad deal can affect other good deals.

Can you explain to me the logic of how a batch publish task is able to discover that a deal is bad (prior to publishing the message) and cannot simply discard the one bad deal from the batch of good deals that should be published? Obviously there is a mechanism to verify a deal prior to publish (which is why the entire batch was discarded).

Here is a real world case where my miner accepted 9 deals, downloaded hundreds of gigabytes of deals waiting hours to prepare all of the deals for clients. Then due to a single small deal which could not pass verification prior to publishing the deal as part of a larger batch, all 9 deals are errored out. Bad for clients, bad for my reputation, bad for wasting a lot of resources. Bad for Filecoin.

f8-ptrk commented 3 years ago

Can you explain to me the logic of how a batch publish task is able to discover that a deal is bad (prior to publishing the message)

you can't. that's the basic problem. you can check all the params, enough datacap, enough funds, etc. but if they are not there at execution time the publish fails.

Obviously there is a mechanism to verify a deal prior to publish (which is why the entire batch was discarded).

nope. the whole batch gets discarded when the execution hits a bad deal

f8-ptrk commented 3 years ago

what i can imagine is putting the costs of a failed publish onto the client that signed the proposal. if a client signs the proposal we can expect him to honor his signature and if he doesn't he should be responsible for the costs a miner has to endure due to his failure to honor the signed proposal

  • locking funds in an actor as soon as the miner signs the proposal, basically pre registering the proposal on chain
  • charge these funds if the publish fails

a 2 step on chain deal process instead of a single publish

[edit]

a great way to put the deal publishing costs on the client btw. :)

that's why i propose something like this. a way to make the client signature worth its name. forcing him to commit the resources needed for the deal beforehand - on chain.

as soon as the start epoch of the deal passes and the deal is not published by a miner, these locked funds and datacap could then be released again.

what exactly that looks like in the end needs to be discussed: whether a client just needs to put up a deposit in case the publish fails, or whether he needs to deposit all the resources needed to get the deal done - no idea what would be better
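The two-step idea above (lock a client deposit when the proposal is registered, charge it if the publish fails, release it once the start epoch passes without a publish) could be modelled roughly like this. It is a sketch of one possible reading of the proposal, not an existing actor: `registry`, `register`, `settlePublish`, and `expire` are hypothetical, and how the chain would learn that a publish failed is left open in the thread, so the caller simply passes the outcome in.

```go
package main

import (
	"errors"
	"fmt"
)

type status int

const (
	registered status = iota // deposit locked, waiting for publish
	published                // publish succeeded, deposit returned
	charged                  // publish failed, deposit paid to the miner
	released                 // start epoch passed unpublished, deposit returned
)

type registration struct {
	client     string
	deposit    int64
	startEpoch int64
	st         status
}

// registry is a hypothetical pre-registration actor with plain balances.
type registry struct {
	regs     map[string]*registration
	balances map[string]int64
}

func newRegistry(balances map[string]int64) *registry {
	return &registry{regs: map[string]*registration{}, balances: balances}
}

// register locks the client's deposit for one proposal (step one).
func (r *registry) register(id, client string, deposit, startEpoch int64) error {
	if _, ok := r.regs[id]; ok {
		return errors.New("proposal already registered")
	}
	if r.balances[client] < deposit {
		return errors.New("insufficient balance for deposit")
	}
	r.balances[client] -= deposit
	r.regs[id] = &registration{client: client, deposit: deposit, startEpoch: startEpoch, st: registered}
	return nil
}

// settlePublish records the publish outcome (step two): the deposit goes
// back to the client on success and to the miner on failure.
func (r *registry) settlePublish(id, miner string, ok bool) error {
	reg, found := r.regs[id]
	if !found || reg.st != registered {
		return errors.New("no open registration")
	}
	if ok {
		r.balances[reg.client] += reg.deposit
		reg.st = published
	} else {
		r.balances[miner] += reg.deposit
		reg.st = charged
	}
	return nil
}

// expire returns the deposit once the start epoch has passed unpublished.
func (r *registry) expire(id string, now int64) error {
	reg, found := r.regs[id]
	if !found || reg.st != registered {
		return errors.New("no open registration")
	}
	if now <= reg.startEpoch {
		return errors.New("start epoch has not passed")
	}
	r.balances[reg.client] += reg.deposit
	reg.st = released
	return nil
}

func main() {
	r := newRegistry(map[string]int64{"client": 100, "miner": 0})
	fmt.Println(r.register("p1", "client", 10, 500))
	fmt.Println(r.settlePublish("p1", "miner", false))
	fmt.Println(r.balances["client"], r.balances["miner"])
}
```

The sketch covers only the deposit variant; locking the full deal resources up front, the other option mentioned above, would change what `register` takes from the client but not the state transitions.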

f8-ptrk commented 3 years ago

Can you explain to me the logic of how a batch publish task is able to discover that a deal is bad (prior to publishing the message) and cannot simply discard the one bad deal from the batch of good deals that should be published? Obviously there is a mechanism to verify a deal prior to publish (which is why the entire batch was discarded).

i hope i am right about that. if lotus discards deals without asking prior to publishing that would be really really bad actually

f8-ptrk commented 3 years ago

@stuberman where exactly did this fail? did it send the message to the chain or not?

if not: very bad. it pretends to know the future. it should not do that. this could be sent to the chain and succeed if the client gets the data cap approved in time!!!!

if yes: works as expected

stuberman commented 3 years ago

The deal failed prior to being put on chain. The entire batch of deals never made it to the chain. All of the deals were errored out.

Jun 28 03:47:52 true bafyreie33qicnkju72vz27fe4lybkxuh3xuqstschi3xq22i2zdrizxihy 0 StorageDealError f3rmy6c7zoefq3bb4thhvmv35ihe4t35hyaivoxdrzzmtt5zx6hrdp4ltgzccndggynjkc3joxxok3quv5mg2a 32GiB 0 FIL 1458616 12D3KooWQrktQtwVbMBoeoN6te6i26s1gQPVfTYVgFs1U4ePmpys-12D3KooWQvdhZSSArWTSBiyjVZa3mAmzoLHXV2QS9nFoaGLv3fL7-1624609546043369505 error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)

Jun 28 08:02:58 true bafyreif4s5htqde4j2rixmzmsk6bn3mseuwd62wyrru2y4v42lkj4orxtu 0 StorageDealError f3rmy6c7zoefq3bb4thhvmv35ihe4t35hyaivoxdrzzmtt5zx6hrdp4ltgzccndggynjkc3joxxok3quv5mg2a 32GiB 0 FIL 1458106 12D3KooWQrktQtwVbMBoeoN6te6i26s1gQPVfTYVgFs1U4ePmpys-12D3KooWQvdhZSSArWTSBiyjVZa3mAmzoLHXV2QS9nFoaGLv3fL7-1624609546043369515 error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)

Jun 28 09:01:48 true bafyreiewg5yn4id76xry36sy33o2hljhlcewudv6hbwk6plsozu2pug42e 0 StorageDealError f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q 16GiB 0 FIL 1063427 12D3KooWHRm8wKqXdBKDS4QkJmdUPNTqRSFRp7ZHg8MphRWbZXCs-12D3KooWQvdhZSSArWTSBiyjVZa3mAmzoLHXV2QS9nFoaGLv3fL7-1624789013145171951 error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)

Jun 28 12:24:06 true bafyreicvrblu3js7egclarj7xefb2bi2dnil4rixghkuq645nswtwz4ope 0 StorageDealError f3rmy6c7zoefq3bb4thhvmv35ihe4t35hyaivoxdrzzmtt5zx6hrdp4ltgzccndggynjkc3joxxok3quv5mg2a 32GiB 0 FIL 1457584 12D3KooWQrktQtwVbMBoeoN6te6i26s1gQPVfTYVgFs1U4ePmpys-12D3KooWQvdhZSSArWTSBiyjVZa3mAmzoLHXV2QS9nFoaGLv3fL7-1624609546043369525 error calling node: publishing deal: GasEstimateMessageGas error: estimating gas used: message execution failed: exit 17, reason: failed to add verified deal for client: f1f66jgglnxcdtdkfgppqt4r72x6oifrkcfnrec5q (RetCode=17)

jennijuju commented 2 years ago

Transferred this from lotus, as the deal validation and processing need to be handled in the actor as well so that the message doesn't fail. @ZenGround0 I think this is a good candidate for the QoL milestone (should get resolved with #1375).

ZenGround0 commented 2 years ago

I'm late to the party but here are a couple of thoughts. This is a good discussion so I'm keeping this open even though it is mostly covered by other issues (#1375). @f8-ptrk here is a relevant issue to the problem you are discussing: #1144. If you have further ideas on how to prevent client "griefing" I, and probably the community as a whole, would welcome hearing about them in more detail. Currently I am under the impression that most PSD failures come from mistakes and not malicious behavior, so I am going to focus on your original proposal @stuberman and write up a small design for improving error handling of batches. To do this correctly I am planning to include information about the successful deals in the return value of PSD. This way, which deals failed and which succeeded will be unambiguous.
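One way to picture a return value that reports which deals in a batch succeeded is sketched below. It only illustrates the idea in the comment above; `publishResult`, its fields, and `publishDeals` are hypothetical and do not describe the design that was eventually written up. The datacap rule is again a toy stand-in for full deal validation.

```go
package main

import "fmt"

// proposal is a toy deal proposal.
type proposal struct {
	client   string
	verified bool
	size     uint64
}

// publishResult is a hypothetical return value that makes partial success
// unambiguous for the caller.
type publishResult struct {
	DealIDs []uint64 // IDs assigned to accepted deals, in input order
	Valid   []bool   // Valid[i] reports whether proposals[i] was accepted
}

// publishDeals skips proposals that fail validation instead of aborting the
// whole batch. It mutates datacap to account for accepted verified deals.
func publishDeals(props []proposal, datacap map[string]uint64, nextID uint64) publishResult {
	res := publishResult{Valid: make([]bool, len(props))}
	for i, p := range props {
		if p.verified {
			if datacap[p.client] < p.size {
				continue // bad deal: leave Valid[i] false and move on
			}
			datacap[p.client] -= p.size
		}
		res.Valid[i] = true
		res.DealIDs = append(res.DealIDs, nextID)
		nextID++
	}
	return res
}

func main() {
	props := []proposal{
		{client: "clientA", verified: true, size: 32},
		{client: "clientB", verified: true, size: 16}, // clientB has no datacap
		{client: "clientA", verified: true, size: 32},
	}
	res := publishDeals(props, map[string]uint64{"clientA": 64}, 100)
	for i, ok := range res.Valid {
		fmt.Printf("proposal %d accepted: %v\n", i, ok)
	}
	fmt.Println("deal IDs:", res.DealIDs)
}
```

With a result shaped like this, the miner side can map each accepted proposal to its deal ID and error out only the proposals whose `Valid` entry is false.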