filecoin-shipyard / js-lotus-client

Lotus JS Client
https://filecoin-shipyard.github.io/js-lotus-client/
Other
55 stars 12 forks source link

Investigate deal failure: clientMarketBalance.Available too small #29

Open mishmosh opened 4 years ago

mishmosh commented 4 years ago

The t01003 miner logged: providerstates providerstates/provider_states.go:286 deal bafyreigtq2gxj7h6ijizd5ra2ku2nyiwvgja6g4dxa3psxz67g7h3xdneu failed: deal rejected: clientMarketBalance.Available too small

Including introducing wrappers for debugging if useful Spend no more than 2 hours

jimpick commented 4 years ago

Screencast showing failed deal: https://ipfs.io/ipfs/QmcqcrTKqmnkHeE9SM3zkXHC82s7F78PtZYpTSq5Ws97i5/js-lotus-client-deal.mp4

jimpick commented 4 years ago

Source code for prototype web frontend here: https://github.com/jimpick/lotus-api-playground/tree/6a9136829b1c7d0efccd289836980f5b30dc8dcc/deal-prototype

whyrusleeping commented 4 years ago

@jimpick could we see the outputs of StateMarketBalance for each of the parties involved in the UI? thats the balance thats being too small here I think

jimpick commented 4 years ago

@whyrusleeping I could do that.

However, I rebuilt and used lotus state get-actor <wallet-address> on the miner node before I tried to make a deal, and it reported the balance. But then the deal failed... it's reproducible, so I'm going to add some extra logging to narrow it down further.

jimpick commented 4 years ago

@whyrusleeping Ah, that CLI command doesn't use StateMarketBalance ... I'll try doing it via the API from the web frontend as you suggested.

jimpick commented 4 years ago

Okay, did it.

Video here: https://gateway.ipfs.io/ipfs/QmU72d3Nt6S8J1EuFrKXZwdFmpvxpNyCM9QhWh8X6N6DbL/lotus-deal-fail.mp4

At 5s, I clicked to propose a deal:

Screenshot 2020-04-09 11 17 53

At 14s, the escrow funds still haven't shown up at t01003:

Screenshot 2020-04-09 11 19 29

I switch over to my console check the status ... at 17s, it has failed:

Screenshot 2020-04-09 11 20 57

Immediately switching back to the web app at 19s, the escrow funds are now there:

Screenshot 2020-04-09 11 22 19

So it looks to me like there's a slight race going on... if it waited another second or so, the funds would have been there. @whyrusleeping @hannahhoward

whyrusleeping commented 4 years ago

@jimpick ah great, so it sounds like we need to wait for a message to land in the chain before proceeding somewhere.

whyrusleeping commented 4 years ago

Ah, I see. This is the client funds not making it on to the chain before the deal gets sent to the miner

jimpick commented 4 years ago

I'm using a temporary work around this particular problem by adding a 1 second delay:

diff --git a/storagemarket/impl/providerstates/provider_states.go b/storagemarket/impl/providerstates/provider_states.go
index 9384cd5..414d46a 100644
--- a/storagemarket/impl/providerstates/provider_states.go
+++ b/storagemarket/impl/providerstates/provider_states.go
@@ -2,6 +2,7 @@ package providerstates

 import (
        "context"
+       "time"

        "github.com/filecoin-project/go-address"
        datatransfer "github.com/filecoin-project/go-data-transfer"
@@ -82,10 +83,16 @@ func ValidateDealProposal(ctx fsm.Context, environment ProviderDealEnvironment,
        }

        // check market funds
+       log.Infof("Jim check market funds Client %v", deal.Proposal.Client)
+       log.Infof("Jim sleep 1s")
+       time.Sleep(1 * time.Second)
        clientMarketBalance, err := environment.Node().GetBalance(ctx.Context(), deal.Proposal.Client)
        if err != nil {
                return ctx.Trigger(storagemarket.ProviderEventNodeErrored, xerrors.Errorf("getting client market balance failed: %w", err))
        }
+       log.Infof("Jim check market funds clientMarketBalance %v", clientMarketBalance)
+       log.Infof("Jim check market funds clientMarketBalance Available %v", clientMarketBalance.Available)
+       log.Infof("Jim check market funds deal.Proposal.TotalStorageFee %v", deal.Proposal.TotalStorageFee())

        // This doesn't guarantee that the client won't withdraw / lock those funds
        // but it's a decent first filter

But now I'm encountering another problem which is preventing me from reliably completing deals ... I'll create another issue for that.

jimpick commented 4 years ago

Slack thread here: https://filecoinproject.slack.com/archives/CP50PPW2X/p1586456736412400

@whyrusleeping @hannahhoward Should I open an issue for this in lotus or go-fil-markets?

jimpick commented 4 years ago

I tried building testnet/3 today without the 1s delay workaround, and I haven't seen the problem yet.

0.3.0+git9eecd9d1.dirty+api0.2.0
lotus version 0.3.0+git9eecd9d1.dirty
hannahhoward commented 4 years ago

@jimpick

So, we’ve been working this issue —

  1. It is tricky — we’re waiting for funds to be added on the storage client, and then sending the deal to the provider, but it’s a different node, that could be in a different state of syncing the chain, or just not have gotten the relevant block message, where the funds haven’t shown up yet. That’s why it’s intermittent, we think.
  2. The likely solution is to wait for a certain confidence level (i.e. wait for more blocks to appear past the message adding funds) on the client. We've now mostly completed this work, which required some API changes
  3. That work is not in the released branch for testnet-2 and and won't make it in without buy in from higher level stakeholders, who've requested code freeze
  4. So we are working on it, but it is still on going.
jimpick commented 4 years ago

@hannahhoward Thanks for the update!

jimpick commented 4 years ago

I'm still seeing this ... even this morning, on interopnet.