XRPLF / rippled

Decentralized cryptocurrency blockchain daemon implementing the XRP Ledger protocol in C++
https://xrpl.org
ISC License

Get `tefALREADY` even with fail_hard = true (Version: s1.ripple.com:51234) #3742

Open anhcao142 opened 3 years ago

anhcao142 commented 3 years ago

Issue Description

I get a tefALREADY error when trying to submit a tx with the fail_hard option.

Steps to Reproduce

This happens at random; I don't know what triggers the behavior.

Expected Result

With the fail_hard option turned on, I would expect either tx success or any error relates to fee or tx not correct.

Environment

Public node: s1.ripple.com:51234

Supporting Files

Here is what my response object looks like:

{
  "accepted": false,
  "account_sequence_available": 59597426,
  "account_sequence_next": 59597426,
  "applied": false,
  "broadcast": false,
  "engine_result": "tefALREADY",
  "engine_result_code": -198,
  "engine_result_message": "The exact transaction was already in this ledger.",
  "kept": false,
  "open_ledger_cost": "10",
  "queued": false,
  "status": "success",
  "tx_blob": "[removed]",
  "tx_json": {
    "Account": "[removed]",
    "Amount": "[removed]",
    "Destination": "[removed]",
    "DestinationTag": 105522285,
    "Fee": "10000",
    "Flags": 2147483648,
    "Sequence": 59597425,
    "SigningPubKey": "[removed]",
    "TransactionType": "Payment",
    "TxnSignature": "[removed]",
    "hash": "[removed]"
  },
  "validated_ledger_index": 61071731
}

ximinez commented 3 years ago

I'm not quite sure what the issue is here. As described in the response object, tefALREADY indicates that "the exact transaction was already in this ledger." That indicates that you have submitted the exact same transaction twice in close succession. The only other possible response in that case would have been tefPAST_SEQ.

fail_hard does not change the potential results in any way. The only thing it does is prevent the receiving rippled server from retrying that particular transaction later, if it's retryable. Transactions that get tef error codes are generally not going to make it into a ledger, regardless of the value of fail_hard. That said, tefALREADY is a tiny bit different in that the transaction is likely to succeed, but the most recent submission is a duplicate that isn't going to be re-processed.
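To make the result classes concrete, here is a minimal sketch of how a client might sort a provisional `engine_result` by its prefix. The function name `classifyEngineResult` and the returned labels are hypothetical and only for illustration; the prefixes themselves (tes, tec, tef, tel, tem, ter) are the result classes that rippled reports. tefALREADY is singled out because, as noted above, the original submission may still succeed.

```javascript
// Hypothetical helper (not part of rippled or any library): maps a
// provisional engine_result string to a coarse label by its prefix.
function classifyEngineResult(engineResult) {
  if (engineResult === 'tesSUCCESS') return 'provisional-success';
  // Duplicate of a transaction the server has already seen: the earlier
  // copy may still validate, so look the transaction up by hash.
  if (engineResult === 'tefALREADY') return 'duplicate-check-by-hash';

  switch (engineResult.slice(0, 3)) {
    case 'ter': return 'retry';         // may succeed later
    case 'tec': return 'claimed-cost';  // failed, but the fee was charged
    case 'tef': return 'failed';        // cannot apply to this ledger state
    case 'tem': return 'malformed';     // the transaction itself is invalid
    case 'tel': return 'local-error';   // rejected by this server only
    default: return 'unknown';
  }
}
```

All of these labels describe provisional results from `submit`; none of them is final until the transaction appears in a validated ledger.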

You indicated that you "would expect either tx success or any error relates to fee or tx not correct". tefALREADY is precisely an error indicating that the tx submission is not correct - not because of the tx itself, but because it's a duplicate.

I'm going to close this issue, as it looks like everything is working as intended. If you have anything to add, please feel free to re-open this issue, or open a new issue.

anhcao142 commented 3 years ago

Hi, I understand the error's meaning. My point is that this is the first time the transaction has been broadcast. There is no way the tx was broadcast twice; otherwise, this would have happened a lot with my other txs. I suspect that the node is doing some retry itself, even with the fail_hard flag on.

intelliot commented 3 years ago

@anhcao142 Can you share some sample code that reproduces this error?

(Unless there's a bug, I don't think you can get this error if it's the first time that the transaction has ever been submitted.)

anhcao142 commented 3 years ago

I don't know how to reproduce the error, but this is the source code that I use. I submit the signed transaction using JSON-RPC. The broadcast function is called only once, and it either passes or throws an error. If a tx is submitted twice inside broadcast, I will know by checking the log.

const Promise = require('bluebird');
const jayson = require('jayson');
const rpc = Promise.promisifyAll(jayson.client.https('https://s1.ripple.com:51234/'));

async function submit(txBlob) {
  const params = [{ tx_blob: txBlob, fail_hard: true }];
  return (await rpc.requestAsync('submit', params)).result;
}

async function broadcast(signedTransaction) {
  const maxAttempts = 5;
  let attempts = 0;
  while (attempts < maxAttempts) {
    const result = await submit(signedTransaction);
    if (result.engine_result === 'tesSUCCESS') break;

    if (result.engine_result === 'terQUEUED') {
      await waitQueuedTx(result.tx_json.hash); // this will wait until tx is validated or after 20s
      break;
    }

    if (
      // any other errors
      !result.engine_result.startsWith('ter') ||
      // or we try our best, even result startsWith 'ter'
      attempts === maxAttempts - 1
    ) {
      throw Error(JSON.stringify(result));
    }

    attempts += 1;
    await Promise.delay(1000);
    console.log(`Broadcasting fail with result:\n${JSON.stringify(result)}`);
  }
}

ximinez commented 3 years ago

The thing that jumps out at me is that broadcast has a loop, which can submit the transaction more than once if a ter result comes back. Do you see the "Broadcasting fail" log message before you see these tefALREADY messages? If you do, that might indicate an issue in the fail_hard code.

The thing that doesn't jump out at me is that you didn't include the code that creates and signs the transactions. It is possible, however unlikely, that you're creating and signing the same transaction and submitting it on separate calls to broadcast. That could be a bug in your code, or it could be because s1 is actually a cluster of nodes with different IP addresses. I am not familiar with jayson.client, so I don't know how it handles DNS, but if it looks up the IP for every request, this could happen:

  1. You build a transaction with sequence number X.
  2. You submit the transaction to s1.ripple.com, the DNS query gives you IP 1.
  3. The transaction is successful.
  4. You query s1.ripple.com to get the account's sequence number, but this time DNS gives you IP 2, which hasn't seen the transaction with X yet, so it returns X.
  5. You build a transaction with sequence number X, that happens to be identical to the transaction you built in step 1.
  6. You submit that transaction to s1.ripple.com. This time, the DNS gives you an IP that has seen and processed X. That server then correctly returns tefALREADY.

Additionally, if you specify a ledger_index other than "current" when you're looking up the account sequence number, you may be getting that X result because your transaction X has been accepted, but not yet validated.
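As a rough sketch of the sequence lookup described above, the request below asks `account_info` for the "current" (open) ledger rather than a validated one. `account_info` and its `ledger_index` parameter are part of the rippled API; the helper name `getNextSequence` and the injected `request` function are assumptions made for illustration. Note that this does not help if consecutive requests land on different servers in a cluster.

```javascript
// Hypothetical helper. `request` is any function with the shape
// (method, params) => Promise<jsonRpcResponse>; it is an assumption,
// not part of rippled or jayson.
async function getNextSequence(request, account) {
  const response = await request('account_info', [
    // "current" reads the open ledger, so a transaction that has been
    // accepted but not yet validated is reflected in the Sequence.
    { account, ledger_index: 'current' },
  ]);
  return response.result.account_data.Sequence;
}
```

With the promisified jayson client from the code above, the call might look like `getNextSequence((method, params) => rpc.requestAsync(method, params), account)`.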

s1.ripple.com is a public resource with no guarantees, and while it's fine to use it for development and one-offs, Ripple strongly recommends that you run your own rippled server for production purposes. If you run a single node, and always submit to that node, and this problem stops, then your issue could be explained by DNS.

Finally, I'll include my standard reminder that the result returned by submit is provisional, and it is possible for a transaction that gets a tesSUCCESS result to later fail to be applied to a validated ledger. Relying on those provisional results, as you're doing in broadcast, is not usually a best practice. More info is at https://xrpl.org/reliable-transaction-submission.html#transaction-timeline
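One way to avoid relying on the provisional result is to look the transaction up by hash until a server reports it as validated. The following is a minimal sketch under stated assumptions: the `tx` method and the `validated` and `meta.TransactionResult` response fields are part of the rippled API, while `waitForValidation`, the injected `request` function, and the fixed attempt count are hypothetical simplifications. The guide linked above bounds the wait with LastLedgerSequence instead of a retry count.

```javascript
// Hypothetical helper. `request` is any function with the shape
// (method, params) => Promise<jsonRpcResponse>.
// Resolves to the final TransactionResult once the transaction is in a
// validated ledger, or to null if it was not seen validated in time.
async function waitForValidation(request, hash, options = {}) {
  const { attempts = 10, delayMs = 2000 } = options;
  for (let i = 0; i < attempts; i += 1) {
    const { result } = await request('tx', [{ transaction: hash }]);
    // A not-found error or a not-yet-validated transaction both fall
    // through here and are polled again.
    if (result.validated === true) {
      return result.meta.TransactionResult;
    }
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return null;
}
```

A `null` return only means the transaction was not observed as validated within the polling window; it does not prove the transaction failed.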

anhcao142 commented 3 years ago

Thanks for the thorough analysis. Let me answer some of your questions. For this transaction, I have checked the log, and there is no log entry before the tefALREADY error. That's why I find it strange to have this error.

My application doesn't use the sequence from the API to build the transaction (except for the first tx). After broadcasting one or more txs, I cache the next sequence for future txs. After broadcasting successfully, I store the transactionHash so that it can later be confirmed by another application that monitors the network. So the submit result is not my final result; I just use it to make sure the tx was successfully submitted to the chain. The confirmation part is handled by another application.

Maybe the problem lies in using the public node. I'll consider hosting a rippled server.

ximinez commented 3 years ago

Thanks for the explanation! I agree that hosting a rippled server is a good next step, because otherwise I'm stumped, at least for the moment.

ximinez commented 3 years ago

I do have one more question. Do the transactions that come back with tefALREADY eventually get validated anyway?

anhcao142 commented 3 years ago

Yes, it has been validated by the network.

ximinez commented 3 years ago

That's good, I think, because even if there is a bug, it's at least not a transaction-dropping bug. It's a good thing that you're looking for the validated transaction after submission.