davidgasquez / gitcoin-grants-data-portal

🌲 Open source, serverless, and local-first data hub for Gitcoin Grants data!
https://grantsdataportal.xyz/
MIT License

feat: 🎨 iterate assets and resources #62

Closed davidgasquez closed 4 months ago

davidgasquez commented 5 months ago

With this PR, I'm trying to both simplify the assets code as well as move to best practices.

These are some of the changes:

Closes #8, #39, #51.

davidgasquez commented 5 months ago

I might be bringing their API down :see_no_evil:

requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://grants-stack-indexer-v2.gitcoin.co/graphql

davidgasquez commented 5 months ago

@DistributedDoge, sorry for the large PR. I don't know what got me :sweat_smile:

Mind giving a super quick review? There are a lot of changes, hopefully trying to make things simpler.

DistributedDoge commented 5 months ago

A step in the right direction, and a great refactor!

A concurrency limit (preferably by asset tag, so only Allo resources are affected) could help make sure we don't hammer the indexer API beyond what it can handle:

https://docs.dagster.io/guides/limiting-concurrency-in-data-pipelines
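A minimal sketch of what such a tag-based limit might look like, assuming the `tag_concurrency_limits` setting of Dagster's multiprocess executor described in the guide above. The tag key `source`, the value `allo`, and the helper name are hypothetical; the block only builds the config dictionary and does not import Dagster.

```python
def tag_concurrency_config(key: str, value: str, limit: int) -> dict:
    """Build run config that caps concurrent steps carrying a given tag."""
    return {
        "execution": {
            "config": {
                "multiprocess": {
                    "tag_concurrency_limits": [
                        {"key": key, "value": value, "limit": limit}
                    ]
                }
            }
        }
    }


# Hypothetical tag: assets that query the indexer would be declared with
# op_tags={"source": "allo"} so that only they are throttled.
ALLO_LIMIT_CONFIG = tag_concurrency_config("source", "allo", limit=1)
```

A dictionary like this would be passed as the `config` of the job that materializes the assets. Assets without the tag keep running in parallel.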

I also don't know how stable the indexer V2 API is even under the best circumstances, so error handling injected into the GrantsStackIndexerGraphQ class could help?

davidgasquez commented 4 months ago

> We got more data, and easier query logic so props to Allo team here.

Yes! :rocket: Having the GraphQL endpoint helps a lot.

> Concurrency limit (preferably by asset-tag so only Allo resources are affected) could help to make sure we don't hammer indexer API beyond what it can handle.

I think the issue is not so much the concurrency across all assets as the donations asset hammering the API when paginating. Making the requests a bit smaller and perhaps adding some jitter between them should help.
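Roughly, the pagination idea could be sketched like this. It is only an illustration under assumptions: `fetch_page` is a hypothetical callable standing in for the GraphQL request, and the page size and delays are placeholder values, not tuned against the indexer.

```python
import random
import time
from typing import Callable, Iterator


def paginate_with_jitter(
    fetch_page: Callable[[int, int], list],
    page_size: int = 500,
    base_delay: float = 0.5,
    max_jitter: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator:
    """Yield rows page by page, pausing a randomized interval between requests."""
    offset = 0
    while True:
        rows = fetch_page(offset, page_size)
        yield from rows
        # A short page means the server has no more rows to return.
        if len(rows) < page_size:
            return
        offset += page_size
        # Fixed pause plus random jitter so requests are not evenly spaced.
        sleep(base_delay + random.uniform(0, max_jitter))
```

Passing `sleep` in as a parameter keeps the function easy to test without real waiting. A smaller `page_size` trades more round trips for lighter individual queries.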

> I also don't know how stable the indexer V2 API is under best circumstances, so error handling injected into GrantsStackIndexerGraphQ class could help?

Definitely! Error handling + smart retrying needs to be there.
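One possible shape for the retry logic, as a sketch rather than what the resource actually does: exponential backoff with full jitter around an arbitrary callable. The function name, defaults, and the exception tuple are assumptions for illustration.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on the given exceptions with capped exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts:
                raise
            # The backoff cap doubles each attempt: base, 2*base, 4*base, ...
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random amount up to the cap.
            sleep(random.uniform(0, cap))
    raise ValueError("max_attempts must be at least 1")
```

With `requests`, `retry_on` could be set to `(requests.exceptions.HTTPError, requests.exceptions.ConnectionError)`, provided the wrapped call uses `raise_for_status()` so that a 500 like the one above surfaces as an exception. Retrying on 4xx responses would probably not be useful.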


Got a beautiful green DAG after running make clean && make run.
