leather-io / extension

Leather browser extension
https://leather.io
MIT License
295 stars 141 forks source link

Error getting fee estimation while trying to deploy a smart contract #4368

Open andresgalante opened 11 months ago

andresgalante commented 11 months ago

We are seeing an error both on the platform and explorer sandbox while trying to deploy a smart contract. We suspect that it's erroring while getting the fee.

image
kyranjamie commented 11 months ago

Hi @andresgalante 👋🏼

Haven't touched the Stacks contact code in a while, and it's been forgotten post-rebrand, I'm seeing many UI bugs.

Which contract in particular are you calling, and are you able to share the broadcast response error?

314159265359879 commented 11 months ago

I think this particular error may be related to the network connection issue. (duplicate of?) https://github.com/leather-wallet/extension/issues/4326

kyranjamie commented 11 months ago

Running through this with @andresgalante we found that the contract deploy transaction fails with a 400, but doesn't return any error response payload.

rafaelcr commented 11 months ago

@kyranjamie from the API logs, the 400 error response is:

type":"InvalidRequestError","message":"Hex string is an odd number of digits: 0xundefined","stack":"Error: Hex string is an odd number of digits: 0xundefined\n

I assume the wallet is building the stacks API URL like this: api.hiro.so/extended/v1/tx/${txId}. Could there be a wallet bug where txId is undefined when building that URL?

kyranjamie commented 11 months ago

It could well be that the wallet is incorrectly trying to look up an undefined txid.

I don't think that's related to this broadcasting issue, though. The wallet successfully POSTs a transaction, and gets a 400. @andresgalante and I couldn't figure out what, if anything, could be wrong with this tx, as the API doesn't return an error object (as it does for other poorly constructed transactions)

diwakergupta commented 11 months ago

The wallet successfully POSTs a transaction, and gets a 400.

I think this is the key distinction: is the 400 coming from a POST of the transaction (which ends up being served by a stacks-node RPC endpoint) vs. the /extended/v1/tx endpoint, which is served by the API itself. So which is it?

rafaelcr commented 11 months ago

@diwakergupta the 400 is coming from the API's api.hiro.so/extended/v1/tx/:tx_id endpoint. I assume the wallet is looking to check the status of the transaction

kyranjamie commented 11 months ago

It comes from the POST request when broadcasting the contract deploy tx to /v2/transactions, and it seems this error is because we try and parse the error body (but there's no response).

Perhaps there's a separate issue with the other endpoint, but that seems unrelated to this bug.

Some guess work here, as I haven't ever replicated the issue, only seen when debugging with @andresgalante

diwakergupta commented 11 months ago

It comes from the POST request when broadcasting the contract deploy tx to /v2/transactions, and it seems this error is because we try and parse the error body (but there's no response).

That's what I figured. So, per the docs at least, a 400 response should always have a valid JSON payload with details on the transaction was rejected: https://github.com/stacks-network/stacks-blockchain/blob/master/docs/rpc-endpoints.md?plain=1#L3

If you believe there's a bug on the blockchain side, feel free to open an issue. Either way though, I think callers should be resilient to misbehaving API endpoints -- in this case the wallet should better handle a 400 error with no response body.

kyranjamie commented 11 months ago

Error handling in the wallet right now needs a lot of work, there are a handful of issues to improve it.

Specifically in this case though, what can we do better to improve? Catch a no error response case and show a custom msg "unknown error" msg instead of unexpected end of json? Functionally it's working as expected I think. We catch the error response, show a failed broadcast msg, and allow user to retry.

That's not to say this isn't an issue the wallet forming the tx incorrectly, just that I can't ascertain so without recreating myself, or an appropriate error msg.

beguene commented 11 months ago

This is the payload that the user is sending to the api via the leather wallet. The request to /transactions returns a 400 The user got an error trying to deploy via the explorer sandbox and the wallet.

screenshot_2023-10-20_at_15 56 53

Looking at the beginning of the payload, it seems like it's an invalid format. And maybe that's why the api returns 400. If that's the case it could be a transmission error of the transaction between the dApp and the Stacks API. Both explorer and platform use openContractDeploy from @stacks/connect.

The user tried with a different chrome profile, a different internet provider/computer. The user also tried to deploy with the explorer earlier today and it was successful ( and the explorer code has not changed since then)

janniks commented 11 months ago

UPDATE: I have tried to debug this with a few intercepted debug calls from folks running into the issue. It always looks like the network request is NOT returning a body (on the broadcast call). The issue why a tx might fail can vary (and I have observed: ConflictingNonce, AlreadyInMempool, SignatureValidation, and even correct transactions), but this is masked and the reason can't be shown.

I have roughly gone over recent commits to Stacks.js and Leather wallet and don't see any glaring issues (or even relevant code parts changed for a while). Although this isn't perfect so I'll spend some more time searching.

I'm still very uncertain why this might happen. We have had no success in reproducing 100% of the time, but sometimes can observe it (with limited debuggability).

My gut feeling thinks it's something to do with Chrome networking? But I'm happy for any insights and ideas! ✨

janniks commented 11 months ago

Looking at the beginning of the payload, it seems like it's an invalid format.

This is just what the raw serialized binary format (wire format) looks like in unicode. Here's an example of a successful transfer:

Screenshot 2023-10-24 at 19 16 46
BowTiedDeployer commented 11 months ago

Hi. Got this error as well on the wallet. The network request showed the post request as 400 Bad Request. image

Here is the payload as well. image

The transaction was valid because it went through after copying the request (right-click the request in chrome devtools and copy as curl) and submitting it that way. Also, thanks @janniks for indicating the steps and broadcasting the curl.

janniks commented 11 months ago

Not sure how to proceed here. @kyranjamie any other ideas? Should we pair-investigate on this some time?

kyranjamie commented 11 months ago

@janniks let's jump on a call tomorrow with @markmhendrickson, to arrange directly

mefrem commented 11 months ago

Hi team,

Including a few details, which may be redundant or already accounted for, in case they're helpful. 1) I got the "transaction failed to broadcast" because of "unexpected end of json input" error when attempting to deploy this simple contract from the Hiro Platform, but was able to successfully broadcast the same contract directly from Clarinet.

(define-read-only (singing) (ok "It's a whole new world") )

2) It looks like it's unrelated (and I can make a separate issue for it), but I was also seeing extremely high gas estimations on the same contract deployment when I run clarinet deployments generate --testnet --low-cost with a fee of something like 8k STX. I manually set it lower and deployed successfully.

janniks commented 11 months ago

We've done a lot of debugging and the issue is very flaky (even though some people seem to get it more often than others).

The error happens on different underlying "errors" and is sort-of masking the real problem. The unexpected end of json occurs since we assume we're getting a json body from the v2 request, but we get a 400 without a response body.

We've explored all avenues that come to mind on encoding/tx/contract/stacks.js -- all out of ideas. Since the HTTP response is empty we are currently thinking there might be an issue somewhere else.

--

Next steps: Get a verbose logging install of the extension (cc @kyranjamie) and debug with devops when we are able to reproduce.

kyranjamie commented 11 months ago

@rafaelcr @zone117x @CharlieC3 can you confirm whether an empty 400 response from POST /v2/transactions is an expected behaviour, or not?

Here's a debug build that logs responses https://github.com/leather-wallet/extension/pull/4468

diwakergupta commented 11 months ago

@rafaelcr @zone117x @CharlieC3 can you confirm whether an empty 400 response from POST /v2/transactions is an expected behaviour, or not?

Here's a debug build that logs responses #4468

FYI the /v2/transactions endpoint is exposed by the stacks-node itself, the API simply proxies it. Based on these docs, I would say a 400 response without an error body almost certainly sounds like a bug -- I'd recommend opening an issue in the stacks-blockchain repo. And to debug this, we'd need to have debug logs from the stacks-node that was serving the POST.

CharlieC3 commented 11 months ago

@diwakergupta We have debug logging turned off for our node pools by default since they can be quite excessive. I'll turn debug logging on temporarily to see what we can capture.

CharlieC3 commented 11 months ago

The debug logs don't show any additional information unfortunately.

[
  {
    "timestamp": "1698865845037516081",
    "fields": {
      "container_name": "stacks-blockchain",
      "file": "src/net/http.rs",
      "level": "info",
      "line": "1637",
      "message": "Handle HTTPRequest",
      "path": "/v2/transactions",
      "query": "",
      "thread": "p2p-(0.0.0.0:20444,0.0.0.0:20443)",
      "ts": "2023-11-01T19:10:45.037428653Z",
      "verb": "POST"
    }
  },
  {
    "timestamp": "1698865845039129769",
    "fields": {
      "conn_id": "710",
      "container_name": "stacks-blockchain",
      "file": "src/net/rpc.rs",
      "level": "debug",
      "line": "3271",
      "message": "Processed HTTPRequest",
      "path": "/v2/transactions",
      "processing_time_ms": "1",
      "thread": "p2p-(0.0.0.0:20444,0.0.0.0:20443)",
      "ts": "2023-11-01T19:10:45.039049889Z"
    }
  },
  {
    "timestamp": "1698865845994734876",
    "fields": {
      "authority": "stacks-node-api.mainnet.stacks.co",
      "bytes_received": "171",
      "bytes_sent": "242",
      "container_name": "istio-proxy",
      "duration": "2",
      "level": "info",
      "method": "POST",
      "path": "/v2/transactions",
      "protocol": "HTTP/1.1",
      "request_id": "8c589055-508e-4927-928e-0700ef2f5d60",
      "response_code": "400",
      "timestamp_end": "2023-11-01T19:10:45.994734876Z"
    }
  }
]
janniks commented 11 months ago

Thanks for confirming these 🙏🏻 Is it correct to say that we’re pretty certain at this point that the request itself looks correct/valid and makes it all the way to the node? And thus we likely have a bug on the RPC broadcast endpoint, which also swallows the actual cause (if the tx is already incorrect)?

CharlieC3 commented 11 months ago

Is it correct to say that we’re pretty certain at this point that the request itself looks correct/valid and makes it all the way to the node

Not sure if anyone has inspected the request from the wallet debug build, is this something you or @kyranjamie can help out with? Requests do seem to be getting routed to the stacks nodes.

And thus we likely have a bug on the RPC broadcast endpoint, which also swallows the actual cause (if the tx is already incorrect)?

Hard to tell without sufficient logging. We would need better logging for this endpoint to dig further.

I would suggest first inspecting the request of a failed broadcast in the debug build of the wallet. If nothing is wrong with the request, we can ask someone from the blockchain team to help with the log change in the node for further testing. (@andresgalante we might need your help on resource allocation)

markmhendrickson commented 10 months ago

It appears @kyranjamie has created such a debug build here: https://github.com/leather-wallet/extension/pull/4468

sabbyanandan commented 9 months ago

I am reviving this thread because I recently ran into the same error.

Regardless of the transaction signing from the Hiro Platform, Explorer Sandbox, or a simple transfer between different accounts in the Wallet, I run into the same error. And that is:

Unable to broadcast transaction Your transaction failed to broadcast because of the error: failed to broadcast transaction: unexpected end of json input

Here are a few more examples of random gunk that I notice getting added to the beginning of the JSON payload:

Example 1:

image

Example 2:

image

And I tried signing transactions from Xverse, and in that case, the deployment was successful. So something to dial-in on the handshake with Leather.

kyranjamie commented 9 months ago

Thanks Sabby. Can you share the exact steps to replicate this?

@janniks and I looked into this in the past, and @andresgalante showed me it happening live, but I was never able to replicate myself.

RE your examples afaik, this is just the UTF-8 encoding of the contract deploy transaction. The jumbled text being transaction metadata, and the rest being the Clarity code.

sabbyanandan commented 9 months ago

Alright, I think I can reproduce with the following steps.

The happy-path that works:

I think having an active Wallet session inside the Platform, and then performing any action against Testnet produces the error.

kyranjamie commented 9 months ago

Same situation again, I'm afraid.

image

Demo contract calls is made successfully, with txid returned 👇🏼

image

I've tried this on my development Chrome wallet, and production install on Opera.

Do you also get a 400 error code @sabbyanandan? We've had reports of 403s from some users.

sabbyanandan commented 9 months ago

Do you also get a 400 error code @sabbyanandan? We've had https://github.com/leather-wallet/extension/issues/4675 from some users.

Correct, it is always a 400 for me. See:

image
kyranjamie commented 9 months ago

@sabbyanandan and I noticed some strange behaviour where, when signed in to the platform, we could consistently get the sandbox to 400. Upon signing out, the sandbox contract deploy succeeds. Only on ending the platform session does the transaction broadcast.

Even with the platform signed in, but the tab closed, it 400s (attempting to rule out background requests being the culprit).

@beguene @janniks @andresgalante Are there any platform services running while a session is active that might interfere/cause the API to behave differently? Does this prompt any ideas as to what the platform might be doing? There's no way the signed in state of the platform could impact how the wallet generates the transaction (I very much hope, at least).

CharlieC3 commented 9 months ago

@kyranjamie After a debugging session yesterday with @sabbyanandan and @andresgalante, I was able to narrow down the root cause to this error from the Stacks node:

Invalid message preamble: Failed to decode HTTP request or HTTP response (request error: UnderflowError("Not enough bytes to form a HTTP request preamble"); response error: DeserializeError("Failed to parse HTTP response: Version"))

This is an error thrown by the Stacks Core node when the headers are too big. Here’s where the error is thrown: https://github.com/stacks-network/stacks-core/blob/62c7e4d4e8364f362ad843933fae0ca064b38317/src/net/http.rs#L1330 Here’s the test which validates this error: https://github.com/stacks-network/stacks-core/blob/62c7e4d4e8364f362ad843933fae0ca064b38317/src/net/http.rs#L6614

The cookie is passed in a header to hiro.so. So I created a rule in our proxy to remove the cookie header on requests before forwarding the traffic to a backend stacks node. Andy was able to confirm he's no longer seeing the issue in Leather. So I think we may be able to close this issue out unless you're still seeing this problem elsewhere.