stellar / go

Stellar's public monorepo of go code
https://stellar.org/developers
Apache License 2.0
1.3k stars 499 forks source link

Feature Request: Allow Efficient Access to Account "Actions" Where Account is Source #4051

Open mootz12 opened 2 years ago

mootz12 commented 2 years ago

What problem does your feature solve?

Ability to find account actions, where "account action" is defined as any modification to the ledger that the given account was the source of.

Currently, Horizon only allows us to find actions in which a given account was involved in. IE - it includes actions that affected it.

This allows for a few things:

  1. Removes a risk / attack vector (detailed below) on any account where transaction/operation history is relevant
  2. Allows the ability to quickly find all actions an account has committed, lowering query load against Horizon

Potential Solutions:

  1. Add ability to filter Operations by the source_account of the operations.

Attack Vector:

Consider a payroll app that has a specific account (Account 1) that does payroll for a company, and they store all information on chain to remove as much trust as possible. Users can run the app when they want to get paid, and it will pay them their salary for the time that has passed since they were last paid.

An attacker (say from the apps competition), could commit large amounts of 1 stroop payments to Account 1. When an employee tries to claim their pay next, the payroll app will have to wade through a TON of transactions / operations to find the last time Account 1 paid the employee, and worst case, might not be able to determine when the event occurred.

What would you like to see?

As new avenues open up for Stellar based applications to become more decentralized (like Stellar Turrets), allowing applications to safely find historical account actions will be vital to ensure these apps can be built securely and efficiently.

The method by which we find what actions an account caused isn't particularly important. We came up with one potential avenue that seemed like it would solve our use case:

Potential Solution In Detail

The solution would be to add a new DB query and endpoint where you could get Operations where the source_account was a given account. This likely would require source_account to be an indexed field.

Endpoint: /accounts/{account_id}/operations_as_source

// ForSourceAccount filters the operations collection to a specific source account
func (q *TransactionsQ) ForSourceAccount(ctx context.Context, aid string) *OperationsQ {
        // 1 - basic
    q.sql = q.sql.
        Where("hop.source_account = ?", aid)

        // 2 - include '' source_account operations
        q.sql = q.sql.Join(
        "history_transactions ht ON "+
            "ht.id = hop.transaction_id",
    ).Where("hop.source_account = '' AND ht.account = ?", aid)

        // combine these results somehow
    return q
}

To my knowledge, any operations where the source_account field was not explicitly set defaults to '' in the history_operations table.

Would there be a possibility to efficiently find operation source accounts such that any operation without an explicit source account defaults to the one found in the transaction? Query number 2 on first glance might not be performant.

Per our requirements, it is acceptable to ignore operations where source_account has default value ''. An application can easily specify the source_account in the event they want to find a specific operation. However, it would be a nice to have in the event someone wants to look for a user created operation in the future.

What alternatives are there?

It appears there are a few other options we could take to be able to find account actions:

  1. Filter Transactions by if and only if the account is a source account on an operation within the transaction, or on the TX itself
  2. Filter effects by if the owner of the account caused the effect (this might be near impossible)
leighmcculloch commented 2 years ago

Including the fee account in the filter would also be required to get a complete view of what transactions an account caused because the Horizon API bundles fee bump transactions in the same endpoint and an account can be the fee account of a fee bump transaction without being a source account of the inner transaction or any operation of the inner transaction.

2opremio commented 2 years ago

This is what I understood from the description. You want a query which takes an account as a parameter and returns the operations which were caused by that account.

I define an operation (O) in Transaction (T) to be caused by an account (A) if and only if:

  1. T was successfully included into a ledger by Core and ...
  2. O.sourceAccount == A (the source field of the operation is explicitly set to the account A) or ...
  3. O.sourceAccount == nil && T.sourceAccount == A (the source field of the operation is unset, but the source of the transaction where the operation lives is set to the account A) or ...
  4. T2 == feebump(T) ) where T2.feeSource == A (The transaction was included in the ledger as the inner transaction of a feebump transaction and the source field of the feebump transaction is the account A)

Note that, as per definition above, an operation can be caused by multiple accounts.

@mootz12 Let me know if that's what you mean.

mootz12 commented 2 years ago

Per @2opremio ,

You are correct. Our ultimate goal is to be able to efficiently query the ledger for things caused by an account.

Your definition of an operation being caused by an account also appears to align.

Points 0 and 1 are mandatory for Stellar Turrets to be able to make effective use of an endpoint like this.

Point 2 and 3 are a nice to have. Definitely makes the picture clearer / more consistent, and will likely be more flexible in the future.

Note that, as per definition above, an operation can be caused by multiple accounts.

Could you touch on how this will occur? My guess is that this is a case such that an operation (O1, without a source account) is in a transaction (T1) that is the inner transaction of a fee bump transaction (T2). As such, both the source account of T1 and T2 can be considered to have caused O1?

CC @tyvdh

2opremio commented 2 years ago

Could you touch on how this will occur?

The fee source of T2 can be different from the other sources. More formally. It can happen in these two cases:

i) O.sourceAccount == A && T2.feeSource == B where A != B ii) O.sourceAccount ==nil && T.sourceAccount == A && T2.feeSource == B where A != B

In (i) and (ii) both A and B cause O

2opremio commented 2 years ago

Please double check that https://github.com/stellar/go/issues/4051#issuecomment-973235459 is what you need. Then I will move on to work on a design.

In this case I suspect it won't be as easy as https://github.com/stellar/go/pull/4035

For starters, I just removed a couple of (until now) unused indexes in https://github.com/stellar/go/pull/4085 which I think we would need for the requirements at https://github.com/stellar/go/issues/4051#issuecomment-973235459. Unfortunately, indexes in the history_transactions table can considerably slow down ingestion so we need to think very carefully about his (see https://github.com/stellar/go/pull/4087 for an example)

leighmcculloch commented 2 years ago

Note that, as per definition above, an operation can be caused by multiple accounts.

As far as I'm aware all operations can only be caused by one account, its source account, or if its source account is nil then the transaction's source account. That's because an operation only ever required the signature from one account, the source account.

A transaction on the other hand can be caused by multiple accounts since multiple accounts may have to sign for a transaction.

When I suggested transaction fee bumps should be included I should have been more clear that I was responding to the general request to find account actions, and not the potential solution suggested in the description to add a filter on operations.

In (i) and (ii) both A and B cause O

I think in those cases only A caused O. However, both A and B caused T.


If the scope of this is to only implement the filter on operations, I think we could filter only on the operations source account, and fall back to the transactions source account when the operations is nil. If the scope of this is greater and we add a filter on transactions I think we probably should filter on transaction source, and fee source as well.

@mootz12 Does that sound correct to you? And do you only need to filter the operation endpoint?

@2opremio Maybe this simplifies the solution?

kalepail commented 2 years ago

If the scope of this is to only implement the filter on operations, I think we could filter only on the operations source account, and fall back to the transactions source account when the operations is nil.

This should be the scope in my opinion. I don't care about transactions, standard, fee bump or otherwise, I care about operations and operations filtered by the account that caused it.

Is still think we should have a transaction sequence and fee filter but that's a separate request. Being able to query for transactions an account paid the fee and/or sequence for is a very important filter but for different reasons. For this ticket we should focus on an operational filter only.

I would like to know tactically where this feature will manifest itself in the API. Is this just a filter on /operations and/or /accounts/{account}/operations? or will this also surface in places like /payments?

mootz12 commented 2 years ago

If the scope of this is to only implement the filter on operations, I think we could filter only on the operations source account, and fall back to the transactions source account when the operations is nil.

I second this.

I would also be curious to know how much this would simplify things, or if the main complication arises from the fall back to the transactions source account when the operations is nil case

2opremio commented 2 years ago

The main complication arises from the lack of indexes.

Good news is all the required data exists in the DB schema. We already have columns offering the data we need:

  1. history_transaction.account contains the transaction's XDR sourceAccount
  2. history_transaction.fee_account contains the transaction's XDR feeSource
  3. history_operations.source_account contains the operation's XDR sourceAccount

this means that the data is already there, without requiring a reingestion.

Bad news is that, for the query to perform reasonably, we would need the columns above to be indexed, which isn't the case (note https://github.com/stellar/go/pull/4085 ).

Since indexes can significantly slow down ingestion, we need to be very selective about adding them. We would need to confirm with some testing, but I would expect a sizeable negative performance impact on ingestion (which we are desperately trying to optimize due to the current ledger growth trend).

2opremio commented 2 years ago

I would like to know tactically where this feature will manifest itself in the API. Is this just a filter on /operations and/or /accounts/{account}/operations? or will this also surface in places like /payments?

Similarly to what I proposed at https://github.com/stellar/go/pull/4035 I would add an is_cause query parameter to all the operation endpoints (e.g. GET /accounts/{account_id}/operations?is_cause=true ).

Obviously, that's totally up for discussion. However, I don't think it's worth jumping into figuring out UX details until we decide what we want to go forward with.

mootz12 commented 2 years ago

Bad news is that, for the query to perform reasonably, we would need the columns above to be indexed, which isn't the case (note #4085 ).

Took a look at the issues you linked. Yeah, adding three separate string indexes for this doesn't seem like an ideal approach, especially if measures were taken against the transaction_hash index to speed things up.

Is there a particular index that might not have a large effect on the ingestion speed? I think we could get a workable solution with only the history_operations.source_account index.


If indexes are too large a problem, we could consider allowing Horizon to manage the "filtering" portion of the request. The endpoint could look the same, but the query could work as follows:

  1. Perform the current operations for account by participation query -> https://github.com/stellar/go/blob/master/services/horizon/internal/db2/history/operation.go#L137-L154
  2. Allow an option to filter the operations on the horizon client (either through a more complicated SQL query or by filtering the list afterwards) such that source_account = accountId 2a. Potential for checking history_transactions table as talked about earlier
  3. Return list

The main benefit here is all the operations such that source_account != accountId don't get streamed over the internet. Might not be the most performant option, but could close the potential attack vector.

leighmcculloch commented 2 years ago

If the scope is to filter on operations only, then I don't think there's a need to filter on fee account. We just need to filter on the operation source account and subsequently transaction source account if there is no operation source account. There is no longer a distinction of "cause" vs "source" which arises when we're talking about transactions. So I think we'd only need two extra indexes, not three:

  1. history_transaction.account contains the transaction's XDR sourceAccount
  2. ~history_transaction.fee_account contains the transaction's XDR feeSource~
  3. history_operations.source_account contains the operation's XDR sourceAccount