algorand / go-algorand

Algorand's official implementation in Go.
https://developer.algorand.org/

Provide full asset information on /v2/accounts/{address} #5250

Closed SilentRhetoric closed 4 months ago

SilentRhetoric commented 1 year ago

Status

The assets array in the response from /v2/accounts/{address} only provides id, amount, and is-frozen information about assets held in the account.

 "assets": [
    {
      "amount": 0,
      "asset-id": 0,
      "is-frozen": true
    }
  ]

This means that an app which needs additional information about assets, such as something as simple as the number of decimals for accurate amount display in a UI, must make additional algod requests per asset to get full asset information.
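As a concrete illustration of why the `decimals` parameter matters, here is a minimal sketch (the `displayAmount` helper is hypothetical, not part of go-algorand) of converting a holding's base-unit `amount` into a display string — impossible to do correctly from the holding object alone:

```go
package main

import (
	"fmt"
	"strings"
)

// displayAmount converts a base-unit asset amount into a human-readable
// string using the asset's decimals parameter. Hypothetical helper for
// illustration; the decimals value must come from the asset params,
// which the /v2/accounts/{address} holding objects do not include.
func displayAmount(amount uint64, decimals uint32) string {
	s := fmt.Sprintf("%d", amount)
	if decimals == 0 {
		return s
	}
	// Pad with leading zeros so the decimal point always has a digit
	// to its left, then split the string at the decimals boundary.
	if len(s) <= int(decimals) {
		s = strings.Repeat("0", int(decimals)-len(s)+1) + s
	}
	cut := len(s) - int(decimals)
	return s[:cut] + "." + s[cut:]
}

func main() {
	// 1234567 base units of a 6-decimal asset display as "1.234567".
	fmt.Println(displayAmount(1234567, 6))
	fmt.Println(displayAmount(42, 0))
}
```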

Expected

Include all asset parameters in the asset objects so that apps have enough information to handle the assets.

Solution

Either change the existing response, or add a similar endpoint that returns the same account information but with all asset parameters included for the account's asset holdings.
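For illustration, the enriched holding objects could mirror the shape algod already uses for created assets — a sketch only, not a committed schema (field names follow the existing `/v2/assets/{asset-id}` response; the values are made up):

```json
"assets": [
  {
    "amount": 5000000,
    "asset-id": 12345,
    "is-frozen": false,
    "params": {
      "creator": "...",
      "decimals": 6,
      "default-frozen": false,
      "name": "Example Asset",
      "total": 1000000000,
      "unit-name": "EXMPL"
    }
  }
]
```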

Dependencies

No apparent dependencies.

Urgency

Currently, apps must make individual requests to algod /v2/assets/{asset-id} per asset, which can number in the hundreds for a given account that holds many assets. This increases load on node endpoints and causes apps to exceed rate limits or need to pay for increased service.

pbennett commented 1 year ago

This would be huge for most backend and frontend dApp developers. It could be added as an option to expand the params details. Independently, we also need a way of doing 'batch' fetches of ids and boxes.

bbroder-algo commented 1 year ago

I hear you. We will discuss at next issue refinement.

emg110 commented 1 year ago

This will be hugely appreciated as well as what @pbennett suggested! Both needed!

algoanne commented 1 year ago

To support this would require algod to potentially do a lot of work for every such request (the data is not organized in the DB in a way to efficiently support this query). If we were to implement it naively, at worst this has the potential to take down the algod. If multiple people are using the same algod (e.g. using an API provider node), then this becomes a serious concern.

Have you considered using Conduit to get this data instead? With Indexer and Conduit, we have endeavored to build a system so that people can change indices to be able to support the unique queries they need. Is there a reason it has to come from algod?

SilentRhetoric commented 1 year ago

> To support this would require algod to potentially do a lot of work for every such request (the data is not organized in the DB in a way to efficiently support this query). If we were to implement it naively, at worst this has the potential to take down the algod. If multiple people are using the same algod (e.g. using an API provider node), then this becomes a serious concern.
>
> Have you considered using Conduit to get this data instead? With Indexer and Conduit, we have endeavored to build a system so that people can change indices to be able to support the unique queries they need. Is there a reason it has to come from algod?

I will happily keep hammering Algod with one request per asset rather than stand up new infrastructure to support getting asset decimals.

There is no reason it needs to come from algod other than the fact that every front end needs decimals info for every single ASA it displays in a UI. It seems incongruous to need specialized indexing to handle a use case that is ubiquitous across the ecosystem.

If enhancing the account info response is not a good way of providing the capability, shall I create two new issues for batch asset info and batch box requests?

algoanne commented 1 year ago

My current thought (a controversial opinion, I'm sure...) is that maybe the algod should only be used for submitting transactions and getting the state deltas for data. These are the algod's two primary purposes in my view: let me send txns, and tell me what happened.

We could start opening up customizations to the algod, if people want to use the algod for more advanced querying of current state. This scares me, of course, we usually try to keep algod functionality limited to its core to minimize risk.

And yes, I understand that your current solution is to make many API requests instead of one big request, so it's not efficient in terms of messages sent back and forth. It does allow the API provider to have some control over the usage of the algod (I think).

On the Indexer side, last year we introduced a config file that allows Indexer runners to control which endpoints and parameters they will expose. This means we would probably be open to a PR to the Indexer adding an endpoint like this, since it could be turned off/on based on the provider's setup.

Anyway, this feels like a big discussion. I wonder what other people's perspectives are. I believe on Ethereum you simply cannot get this kind of information directly from their nodes (though I'm no expert on Ethereum).

pbennett commented 1 year ago

VERY strong disagree. I think you'll be very hard pressed to find any builder that agrees with this view Anne. Indexer requirements are currently 1.6X a full archival node. This is absolutely not acceptable for the very simple and VERY common request of 'I want to fetch the assets someone has' - which every node ALWAYS HAS TO KNOW because it's part of active state. The entire 'state' is around 14GB now. As opposed to an indexer and required postgres infrastructure at over 1.6TB.

algoanne commented 1 year ago

@pbennett

Indexer no longer requires an archival node ever since the Conduit release.

You remind me of this issue: https://github.com/algorand/indexer/issues/1343 which asks for basically an indexer that just queries current state. Maybe that's what's missing in our ecosystem currently, a more flexible way to query just current state (and somewhere less resource-worrisome/risky than algod). Maybe someone should write a Conduit-based offering that just stores current state (deletes the previous entries every round or something) that everyone could use.

pbennett commented 1 year ago

These are VERY different things Anne. Are you really suggesting people set up their OWN CUSTOM INDEXER (via conduit) just so they can efficiently fetch assets which ALL already HAVE to be in algod? Right now we fetch the data from multiple algod instances, local to our environment, not incurring cloud egress charges. AlgoD in fact has to, by design, be efficient at fetching/caching this data, because it has to fetch for every transaction - pre-loading from foreign references.

If people were asking for new 'search for all X transactions that match Y' things in algod - I'd agree. It doesn't belong there. For anything that's 'give me the current state', algod 100% can/should be able to answer that efficiently.

Having the returned asset data just return all the info like it already does for 'created assets' would do it (could be optional), or a 'batch fetch' if it was desired to be limited in some way is all that's being requested here.

Sometimes it really feels like the core team has literally no idea how people actually build/develop and deploy for Algorand, nor the actual costs/implications of each decision.

winder commented 1 year ago

Algod is designed to store data efficiently for blockchain operation, not for querying. Sometimes it can do both, but not always. Indexer is designed to make data more accessible and easier to use. In that context features like this make a lot more sense for the Indexer than algod.

> AlgoD in fact has to, by design, be efficient at fetching/caching this data, because it has to fetch for every transaction - pre-loading from foreign references.

This statement is wrong. The AssetHolding object (the one with asset-id, amount and is-frozen) contains exactly the data required to efficiently apply an asset transfer transaction. This is a problem because the asset url, the asset name, and how many decimals to display are part of the creator account!

Take an account which has opted into 1000 assets. With the current API that's a single query for the account in a couple tables. To join in asset details is an additional 1000 joins. (The storage model changed a bit with the "unlimited assets" feature last year and I'm a little fuzzy on the details -- suffice to say that it doesn't simplify adding this feature to algod)

> Sometimes it really feels like the core team has literally no idea how people actually build/develop and deploy for Algorand, nor the actual costs/implications of each decision.

You're not entirely wrong. A big part of our mission is to keep the consensus protocol operating efficiently. We are still trying to improve usability. Situations like this one are tricky, the goals are at odds with each other.

pbennett commented 1 year ago

Thanks for the response. It's a good point that the asset creation params are completely independent from the 'holdings' - it had been a while since I'd looked at the schema. Although if you query an account w/ v2/accounts/{address} that created the assets, it gives you all the asset parameters in the account response but not for holders which complicates the perception for the caller. Anyway - you say algod isn't designed for querying yet that's exactly what it's VERY often used for. Simple things like balances and asset holdings - it's silly not to hit your own nodes or public ones. algod nodes are prolific (if you want to use public ones) and are probably a given for most deployed dapps.
I see indexers as designed for searching. Running queries against multiple filters, possibly w/ >100-200ms response times (I'm being kind - the indexer can sometimes take 30s+ for queries). Asking for data on account X, asset Y, application Z... algod should be able to answer that efficiently.

If I have a backend that has a caching infrastructure then going "outside" at all is 'bad' - that's $$$ - these things matter. At any kind of real scale, internet egress starts to cost - a lot. Storage can cost. Higher (hard) IOPS requirements can cost. Running redundant multi-zone non-archival algod instances behind an internal load-balancer isn't too bad. So.... personally at least, I think algod SHOULD be good at querying 'state' but when it comes to 'searching' - that's best left to the indexer or custom infra updated by things like conduit or custom code. I would fight against algod taking on that mantle - just as I disagree algod itself should have some kind of streaming event-bus mechanism built-in.

Cheers...

fergalwalsh commented 1 year ago

> My current thought (a controversial opinion, I'm sure...) is that maybe the algod should only be used for submitting transactions and getting the state deltas for data.

I left a 👎 so I want to explain: I agree that the node shouldn't be doing searching or queries involving joins but I definitely am opposed to the above idea that it shouldn't serve current state.

In the dApps and wallets I've worked on we have always made an explicit choice between the data we fetch from Algod vs the Indexer. We always use Algod for balances and state information (including boxes now). We do this for 2 reasons: latency and reliability. The data from Indexer is always going to be behind the node (by definition), often by some seconds. As block sizes increase this gets worse. The added delay of waiting for a balance update on the indexer compared to an Algod is the difference between an OK user experience and a bad user experience.

On the reliability side, an indexer is more likely to fail or go slow than a node because of its more complicated infrastructure and data layer. It is also dependent on the node so the combination is less reliable than the node alone. We have seen this many times in practice and Conduit doesn't change anything here in general.

> I wonder what other people's perspectives are. I believe on Ethereum you simply cannot get this kind of information directly from their nodes (though I'm no expert on Ethereum).

I don't think Ethereum should ever be used as an example in an argument of what Algorand should or shouldn't do. Ethereum is a beautiful concept but a terrible piece of engineering.

barnjamin commented 1 year ago

I filed a draft PR in #5290 that provides the ability to get all asset details for assets held by a given account. It should serve as an illustration of what is meant by the data in the DB not being organized for efficient querying. However, that should be balanced against the number of single http requests that would be fired off to get the same data.

I'd like to find a way to support this request since it's not very complex and provides something a lot of folks need.

Are there specific metrics around what would be acceptable here wrt performance? and how can we compare that with the N separate http calls for grabbing asset details?

algoanne commented 1 year ago

Thanks @fergalwalsh , I was wondering if someone would bring up latency as a factor in favor of algod serving current state.

> In the dApps and wallets I've worked on we have always made an explicit choice between the data we fetch from Algod vs the Indexer. We always use Algod for balances and state information (including boxes now). We do this for 2 reasons: latency and reliability. The data from Indexer is always going to be behind the node (by definition), often by some seconds. As block sizes increase this gets worse. The added delay of waiting for a balance update on the indexer compared to an Algod is the difference between an OK user experience and a bad user experience.

Separately, one of my colleagues pointed out that we are going to have to solve this problem for algod anyway if we want to make simulate powerful (e.g. allowing simulate to optionally run with unlimited access to foreign resources). So, I think we can consider how to configure an algod to be able to handle heavier requests and we will have to figure out how to mitigate risk there. On Indexer we have this per-endpoint/parameter config file to turn each one on or off. Maybe something like that will work for algod as well? I'm not promising anything here, but thinking out loud.

A question I have - is there a finite number of requests such as the original one on this issue?

fergalwalsh commented 1 year ago

If the node could be optimised for it I think batch requests would be the most pragmatic. It looks like batches for assets and boxes especially would be useful and dramatically reduce request traffic to nodes.

The OP wants to reduce 1+N requests to 1 but with batch requests you could reduce it to 1+(N/batch_size). Which would be just 2 for many cases. It would still be a big improvement. This only looks at the resources table for assets and only the kv store table for boxes so it could be efficient.

> is there a finite number of requests such as the original one on this issue?

Probably not. But there should be a very small number that CAN be served by the node but are not currently. They are the frustrating ones because they feel like a missed opportunity to make things more efficient for everyone.
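The batching arithmetic mentioned above (reducing 1+N requests to 1+N/batch_size) can be sketched in a few lines. The `requestsNeeded` helper is hypothetical, purely for illustrating the count:

```go
package main

import "fmt"

// requestsNeeded returns how many HTTP requests it takes to load an
// account plus params for n held assets: 1 account call, then the asset
// lookups grouped into batches of batchSize. batchSize 1 models today's
// one-call-per-asset behavior.
func requestsNeeded(n, batchSize int) int {
	if batchSize < 1 {
		batchSize = 1
	}
	// Ceiling division so a final partial batch still costs a request.
	return 1 + (n+batchSize-1)/batchSize
}

func main() {
	// An account holding 1000 assets: 1001 requests today, 11 with
	// batches of 100.
	for _, bs := range []int{1, 100} {
		fmt.Printf("1000 assets, batch size %d: %d requests\n", bs, requestsNeeded(1000, bs))
	}
}
```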

jannotti commented 1 year ago

Batching would also have the benefits of a) applying to many different APIs and b) allowing for easy accounting if API providers wanted: "You have 100,000 'credits', one credit is used per lookup, including batched lookups"

grzracz commented 1 year ago

Just wanted to pop in to say I am completely in favor of this change (and more). Is the current database schema available somewhere in any format? This will help visualize the additional overhead.

LoafPickleWW commented 7 months ago

We need this.

In fact it's quite lame we don't have it already.

joe-p commented 7 months ago

I imagine the best path forward is for someone to pick up the work done in https://github.com/algorand/go-algorand/pull/5290. I imagine this would have to be a gated endpoint.

gmalouf commented 7 months ago

Hi Folks,

This got lost in the shuffle a bit. We think the solution to this is a good fit behind our experimental API flag. While there are performance concerns, giving people the ability to use the API (and manage the performance consequences, pagination sizes, etc) on their own node makes sense here.

We are wrapping up pushes on initial support for non-archival relays, P2P phase 1, and incentives after which we plan to pick this up.

gmalouf commented 6 months ago

Folks, I'm adding some tests and making tweaks/fixes, but give #5948 a look - see if it is hitting the mark here. We didn't want to refactor the underlying data store, but think this will be workable to include in the next release.

gmalouf commented 5 months ago

What are wallets currently showing in the case where a user is opted into an asset that was deleted by its creator (meaning the asset params no longer exist)? The user necessarily holds none of it, since an asset can only be destroyed when the creator holds the full supply, so they have a 0 balance of the asset, but it still impacts their MBR, so I assume something gets shown.

nullun commented 5 months ago

I believe just the asset ID remains in the account on the ledger. So goal displays the following:

% goal account info -a BOBBYB3QD5QGQ27EBYHHUT7J76EWXKFOSF2NNYYYI6EOAQ5D3M2YW2UGEA
Created Assets:
    <none>
Held Assets:
    ID 1004, <deleted/unknown asset>
Created Apps:
    <none>
Opted In Apps:
    <none>
Minimum Balance:    200000 microAlgos

An Indexer will retain more details, so wallets may display more info.

% curl "127.0.0.1:8980/v2/assets/1004?include-all=true&pretty"
{
  "asset": {
    "created-at-round": 4,
    "deleted": true,
    "destroyed-at-round": 7,
    "index": 1004,
    "params": {
      "clawback": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAY5HFKQ",
      "creator": "ALICE7Y2JOFGG2VGUC64VINB75PI56O6M2XW233KG2I3AIYJFUD4QMYTJM",
      "decimals": 0,
      "default-frozen": false,
      "freeze": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAY5HFKQ",
      "manager": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAY5HFKQ",
      "reserve": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAY5HFKQ",
      "total": 0
    }
  },
  "current-round": 7
}

AlgoSigner just shows the asset ID with zero units. I'll see if I can find what other wallets are doing.
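A UI joining holdings with asset params has to handle exactly this case: the holding survives deletion, but the params lookup fails. A minimal sketch (the `label` helper and the `names` lookup map are hypothetical; the placeholder string mirrors goal's output above):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// assetHolding matches the holding objects in the /v2/accounts/{address}
// response.
type assetHolding struct {
	AssetID  uint64 `json:"asset-id"`
	Amount   uint64 `json:"amount"`
	IsFrozen bool   `json:"is-frozen"`
}

// label renders a holding the way goal does: if the asset's params can no
// longer be looked up (the creator destroyed the asset), fall back to a
// placeholder, since the holding itself — and its MBR impact — survives
// deletion.
func label(h assetHolding, names map[uint64]string) string {
	name, ok := names[h.AssetID]
	if !ok {
		name = "<deleted/unknown asset>"
	}
	return fmt.Sprintf("ID %d, %s", h.AssetID, name)
}

func main() {
	// Hypothetical params lookup: only asset 10 still exists.
	names := map[uint64]string{10: "Example Asset"}

	raw := `[{"asset-id": 10, "amount": 5, "is-frozen": false},
	         {"asset-id": 1004, "amount": 0, "is-frozen": false}]`
	var holdings []assetHolding
	if err := json.Unmarshal([]byte(raw), &holdings); err != nil {
		panic(err)
	}
	for _, h := range holdings {
		fmt.Println(label(h, names))
	}
}
```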

LoafPickleWW commented 5 months ago

Hey all, I have a lot of experience with this since it was an issue for NFTs that were deleted and 1000 blocks passed so a nuanced way was needed for people to opt out.

In short, the wallet apps filter out deleted assets, so you couldn't use that, and we had been having the opt-out close out back to the creator address. But once you delete an asset, closing out to the creator address was somehow invalidating the tx. We mitigated this by having deleted assets get opted out to the sender's own address, and it works fine on Defly wallet, but there is still a bug Pera is sorting out.

Here is an asset that is deleted, per @nullun's request:

Asset ID: 1240389879 Wallet Held: O2ZPSV6NJC32ZXQ7PZ5ID6PXRKAWQE2XWFZK5NK3UFULPZT6OKIOROEAPU

gmalouf commented 3 months ago

For everyone here, this endpoint went out with the release on May 20 - fingers crossed you find this useful!

cc @SilentRhetoric @pbennett

pbennett commented 3 months ago

Yes, I saw, thank you! It's a bit obscure atm and still undocumented though (the REST docs need updating). Every node someone might hit has to have "EnableExperimentalAPI": true set in its config, and then the caller has to know the new endpoint is /v2/accounts/xxxxxx/assets

So needing that flag basically means we can only reliably call our own nodes for this endpoint right now. algonode doesn't have it enabled, for example (although I'll ask if they'll enable it).

gmalouf commented 3 months ago

So the docs changes were merged 9 hours ago: https://github.com/algorandfoundation/docs/pull/1262 - I don't recall the Foundation's process for refreshing the dev portal but assume it will happen soon.

To better understand performance impacts, we chose to release this initially behind the EnableExperimentalAPI flag to gather feedback and see if this design makes sense before committing to longer-term support. My understanding is there was definite benefit to being able to get this on your own endpoints (per the conversation earlier in this thread) independent of API service runner decisions.

Anyway, hoping some folks use this and let us know if it's useful; we can then look to get it out from behind the experimental flag.