butterygg / whip

Treasury Risk Analytics Dashboard with Strategy Backtesting for DAOs

"Complex" addresses produce Covalent portfolio_v2 timeouts #66

Open lajarre opened 2 years ago

lajarre commented 2 years ago

See https://github.com/butterdao/whip/issues/24#issuecomment-1171734422

What we mean by "complex address" is an address that has a lot of ERC20 or native token transfers. Covalent doesn't play well with these.

TBD:

lajarre commented 2 years ago

Approaches

I see 2 main types of solutions:

  1. Find another external data source, different from Covalent and/or Bitquery, which we are using today.
  2. Build our own database as sketched in https://github.com/butterdao/whip/issues/49#issue-1272597420

My take:

@acemasterjb What are your thoughts?

acemasterjb commented 2 years ago

@lajarre For solution 2, we might still end up relying on at least one other API to accomplish this.

The way I see it, for solution 2 we'll need to query a given treasury for all ERC20 tokens it holds (this might have to be via Ethplorer), and then query the historical balance for each of these assets using Eth RPC getter calls (via a node operator like Infura or Pokt). At that point we can store this data in our db and use it for our benefit.
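A rough sketch of what that two-step flow could look like, assuming Ethplorer's getAddressInfo endpoint for token discovery and web3.py (v6) against an archive-capable RPC endpoint for the historical balanceOf calls; the URLs and keys below are placeholders:

```python
import requests
from web3 import Web3

ETHPLORER_KEY = "freekey"                                # placeholder API key
RPC_URL = "https://mainnet.infura.io/v3/<project-id>"    # placeholder endpoint

# Minimal ERC20 ABI: balanceOf is all we need for historical snapshots.
ERC20_ABI = [{
    "constant": True,
    "inputs": [{"name": "owner", "type": "address"}],
    "name": "balanceOf",
    "outputs": [{"name": "", "type": "uint256"}],
    "type": "function",
}]

w3 = Web3(Web3.HTTPProvider(RPC_URL))


def treasury_tokens(treasury: str) -> list[str]:
    """Step 1: ask Ethplorer which ERC20 tokens the treasury holds."""
    resp = requests.get(
        f"https://api.ethplorer.io/getAddressInfo/{treasury}",
        params={"apiKey": ETHPLORER_KEY},
        timeout=30,
    )
    resp.raise_for_status()
    return [t["tokenInfo"]["address"] for t in resp.json().get("tokens", [])]


def historical_balance(token: str, treasury: str, block: int) -> int:
    """Step 2: balanceOf at a past block (needs an archive node)."""
    contract = w3.eth.contract(
        address=Web3.to_checksum_address(token), abi=ERC20_ABI
    )
    return contract.functions.balanceOf(
        Web3.to_checksum_address(treasury)
    ).call(block_identifier=block)
```

Each (treasury, token, block) result would then be written to our db so the balance series never has to be rebuilt from scratch.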

As for alternative data providers, Ethplorer can give up to 1000 responses per call per account; there's also Moralis, which is a pretty popular solution (but from personal experience with this service, I think it should be used as a last resort).

Edit: an additional remark on solution 2: we could build and then append to our own database as suggested, but I think we'd still need to do two queries: one to periodically check for new tokens that each treasury in our db holds, and another to check the transfer history between the last tick and the current tick (where a "tick" is each time this query process happens).
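To make the tick idea concrete, a minimal sketch; `discover_tokens` and `fetch_transfers` are placeholder stubs for whichever providers we end up using:

```python
from dataclasses import dataclass, field


def discover_tokens(address: str) -> set[str]:
    """Placeholder: e.g. Ethplorer's token list for this address."""
    raise NotImplementedError


def fetch_transfers(address: str, token: str, from_block: int, to_block: int) -> None:
    """Placeholder: pull and store transfers for this block range."""
    raise NotImplementedError


@dataclass
class TreasuryRecord:
    """What we'd keep in our own db for each tracked treasury."""
    address: str
    tokens: set[str] = field(default_factory=set)
    last_tick_block: int = 0


def run_tick(record: TreasuryRecord, current_block: int) -> None:
    # Query 1: check for new tokens this treasury now holds.
    record.tokens |= discover_tokens(record.address)
    # Query 2: transfer history between the last tick and this one.
    for token in record.tokens:
        fetch_transfers(record.address, token,
                        from_block=record.last_tick_block + 1,
                        to_block=current_block)
    record.last_tick_block = current_block
```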

acemasterjb commented 2 years ago

I ran the scheduler locally with a frequency of 30 secs for about a day or so. Whilst it was running, I logged every address that gave these Covalent ReadTimeout issues, which are most likely "complex addresses".

Here are the results, where frequency is how often this issue occurred within that ~1-day window. I filtered out outliers: addresses that only infrequently cause this error. Here's the unfiltered list.

lajarre commented 2 years ago

Here are the results, where frequency is how often this issue occurred within that ~1-day window. I filtered out outliers: addresses that only infrequently cause this error. Here's the unfiltered list.

Thanks for that! Which list did the addresses you used come from?
This makes the case for making this issue a priority.

Edit: an additional remark on solution 2: we could build and then append to our own database as suggested, but I think we'd still need to do two queries: one to periodically check for new tokens that each treasury in our db holds, and another to check the transfer history between the last tick and the current tick (where a "tick" is each time this query process happens).

I'm wondering if we couldn't just:

I'm not sure how expensive this would be (using Infura for example, but I'm also wondering if a private node wouldn't be faster here).
What's sure is that whenever there is a new treasury or a new token to be added, the database would require quite some time to get updated.

Also, there might be some tokens which are incorrectly implementing the Transfer event (sleepminting). But I guess the Token Lists we base our system upon are clean.

All in all, this option is really about building our own transfers database right on top of the blockchain data.
Going this route 100%, we would remove Covalent/Bitquery as data sources. But that would mean users would no longer be able to query a new address on the fly. The only option would be an "add address" form that might need selection/curation and would take potentially hours or days to process (replaying the chain of transfer events for each token).

@vaughnmck any thought on that?

acemasterjb commented 2 years ago

Which list did the addresses you used come from?

This is coming from CryptoStats' simple treasury list/adapter.

I'm not sure how expensive this would be (using Infura for example, but I'm also wondering if a private node wouldn't be faster here).

A rough estimate, considering we would need data for each treasury, for each asset/token it holds, and that a new block is mined every ~14s, would be:

365 days * 24 hrs * 60 mins * 60 secs / 14 secs per block * number of tokens per treasury * number of treasuries we track

For the number of tokens per treasury, maybe we can say 25 as a worst-ish case scenario; the number of treasuries we track is currently 108. So the total estimate is 6,081,942,857 block getter calls per caching operation. This number also needs to scale/grow.
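Spelled out (same numbers as above):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000
SECONDS_PER_BLOCK = 14                  # rough Ethereum block time
TOKENS_PER_TREASURY = 25                # worst-ish case assumption
TREASURIES_TRACKED = 108                # current count

blocks_per_year = SECONDS_PER_YEAR / SECONDS_PER_BLOCK               # ~2.25M blocks
calls = blocks_per_year * TOKENS_PER_TREASURY * TREASURIES_TRACKED
print(f"{calls:,.0f}")                  # 6,081,942,857 getter calls per caching run
```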

Again, this is just a worst-case scenario, but for reference here are Infura's pricing plans.

lajarre commented 2 years ago

I guess this is even worse, because we would need to start indexing from the genesis block, not just from 1 year ago.

If we have the blockchain on disk, this is a nested loop like:
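Roughly, something like the sketch below (assuming web3.py v6 pointed at our own node; the tracked-address sets, chunk size, and flushing strategy are illustrative):

```python
from collections import defaultdict
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))   # our own/private node

# keccak256("Transfer(address,address,uint256)")
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"

TREASURIES: set[str] = set()      # lowercase treasury addresses we track
TRACKED_TOKENS: set[str] = set()  # lowercase token contracts from our Token Lists

# In-memory running balances keyed by (treasury, token); flushed to the db regularly.
balances: dict[tuple[str, str], int] = defaultdict(int)


def index_range(start_block: int, end_block: int, chunk: int = 2_000) -> None:
    # Outer loop: block ranges; inner loop: every Transfer log in the range.
    for frm in range(start_block, end_block + 1, chunk):
        to = min(frm + chunk - 1, end_block)
        for log in w3.eth.get_logs(
            {"fromBlock": frm, "toBlock": to, "topics": [TRANSFER_TOPIC]}
        ):
            token = log["address"].lower()
            if token not in TRACKED_TOKENS:
                continue
            # Indexed args: topics[1] = from, topics[2] = to (last 20 bytes).
            sender = "0x" + log["topics"][1].hex()[-40:]
            receiver = "0x" + log["topics"][2].hex()[-40:]
            amount = int.from_bytes(log["data"], "big")
            if sender in TREASURIES:
                balances[(sender, token)] -= amount
            if receiver in TREASURIES:
                balances[(receiver, token)] += amount
```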

Then we would need to dump memory regularly to a database, which we would then need to re-process in order to have the clean treasury.asset balance series.

vaughnmck commented 2 years ago

My answer is based entirely on resources.

We do not have the resources to pay Infura for ~6bn calls so we should rule it out until we do.

We may have enough resources to run our own node, however. (Running a full node on AWS is ~$1k/y)

I'd recommend maintaining Covalent/Bitquery as a quick retrieval system for when a user searches for new tokens/treasury addresses. We can always backfill the response with more accurate transaction data once we've re-processed the chain.

lajarre commented 2 years ago

@acemasterjb What about the following solution: in reload_treasuries_stats, keep track of treasuries for which build_treasury_with_assets fails, and keep retrying to fetch these until Covalent stops timing out.

We should probably have, on top of our treasuries Redis set, a fixed list of treasuries that is hardcoded.

acemasterjb commented 2 years ago

@lajarre Interesting... What do you think about appending those treasuries to that Redis set you proposed when they fail, and having an hourly/frequent Celery task that runs through, and exhausts, that list, retrying to get Covalent data?
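A sketch of how that could look; the Redis key, Celery wiring, and the import path for build_treasury_with_assets are assumptions, not the repo's actual layout:

```python
from celery import Celery
from redis import Redis
from requests.exceptions import ReadTimeout  # or httpx.ReadTimeout, per the client we use

# Hypothetical import path; build_treasury_with_assets is the existing builder
# referenced above, and its real location/signature may differ.
from backend.libs import build_treasury_with_assets

app = Celery("whip", broker="redis://localhost:6379/0")
redis = Redis()

FAILED_SET = "treasuries:covalent_failed"  # hypothetical key name
# reload_treasuries_stats would do `redis.sadd(FAILED_SET, address)` whenever
# build_treasury_with_assets times out.


@app.task
def retry_failed_treasuries() -> None:
    """Run through the failed set, retrying Covalent for each address."""
    for raw in redis.smembers(FAILED_SET):
        address = raw.decode()
        try:
            build_treasury_with_assets(address)  # signature assumed
        except ReadTimeout:
            continue                 # still "complex" for Covalent; leave it in the set
        redis.srem(FAILED_SET, address)


app.conf.beat_schedule = {
    "retry-failed-treasuries": {
        "task": retry_failed_treasuries.name,   # hourly retry sweep
        "schedule": 60 * 60,
    },
}
```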

Also, I should say that a few commits ago I bumped up the read and connect timeouts for _get_portfolio_data and _get_transfers_data to 60 and 90 seconds respectively, and it seems like those treasuries are still failing on the scheduled task. I can also try increasing the read timeout further, since nine times out of ten that's the timeout that occurs.

lajarre commented 2 years ago

@lajarre Interesting... What do you think about appending those treasuries to that Redis set you proposed when they fail, and having an hourly/frequent Celery task that runs through, and exhausts, that list, retrying to get Covalent data?

Sounds like a good approach!

Also, I should say that a few commits ago I bumped up the read and connect timeouts for _get_portfolio_data and _get_transfers_data to 60 and 90 seconds respectively, and it seems like those treasuries are still failing on the scheduled task. I can also try increasing the read timeout further, since nine times out of ten that's the timeout that occurs.

Why not increase the timeout? Ideally we want to have different timeouts when we're running a cron compared to when we're running live.
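For reference, one way to parameterize it, assuming an httpx client (the env var and numbers are illustrative; the same idea applies with requests):

```python
import os
import httpx

# Illustrative switch: scheduled (cron/Celery) runs can afford long timeouts,
# live requests have to finish before the 30s gateway timeout anyway.
IS_CRON = os.getenv("WHIP_CRON_RUN") == "1"

COVALENT_TIMEOUT = httpx.Timeout(
    connect=90.0 if IS_CRON else 5.0,
    read=300.0 if IS_CRON else 25.0,
    write=30.0,
    pool=30.0,
)

client = httpx.AsyncClient(timeout=COVALENT_TIMEOUT)
```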

acemasterjb commented 2 years ago

Ideally we want to have different timeouts when we're running a cron compared to when we're running live.

Wouldn't this be inconsequential? I previously bumped up the timeouts under that assumption, since the deployment will time out after 30s anyway.

lajarre commented 2 years ago

Wouldn't this be inconsequential? I previously bumped up the timeouts under that assumption, since the deployment will time out after 30s anyway.

Yes, that's true for now. Let's just keep this in mind. This will have an impact whenever we start changing the backend architecture (notably #91).