ethereum / sourcify

Decentralized Solidity contract source code verification service
https://sourcify.dev
MIT License
775 stars, 384 forks

Create stats-gen regular job #1511

Closed: kuzdogan closed this issue 1 month ago

kuzdogan commented 1 month ago

After making our database the source of truth and moving to GCP, we need to create a regular job that generates the https://repo.sourcify.dev/stats.json file.

Currently this is done by a simple bash script that traverses all chain directories and counts the contracts: https://github.com/sourcifyeth/infra/blob/master/environments/staging/applications/stats-gen/values.yaml

I think a newer version should instead use DB queries to get the number of contracts under each chain. We can also remove the full_match_size_kbyte and partial_match_size_kbyte fields, as they are not so relevant anymore. With that, we should be able to run the whole job in a few minutes instead of the couple of hours it takes now.
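To illustrate the DB-query approach, here is a minimal sketch of counting contracts per chain with a single grouped query. It uses an in-memory SQLite database as a stand-in for the real database; the table name `verified_contracts`, its columns, and the helper names are all hypothetical, and the output shape (chain ID mapped to `full_match` and `partial_match` counts) is an assumption about what stats.json would contain once the size fields are dropped.

```python
import sqlite3

# Hypothetical schema: one row per verified contract with its chain and match type.
# The real Sourcify database layout may differ.
QUERY = """
    SELECT chain_id, match_type, COUNT(*)
    FROM verified_contracts
    GROUP BY chain_id, match_type
"""


def count_contracts_per_chain(conn):
    """Return {chain_id: {"full_match": n, "partial_match": m}} from one grouped query."""
    stats = {}
    for chain_id, match_type, count in conn.execute(QUERY):
        # Every chain gets both keys, so a chain with only one match type still reports 0 for the other.
        entry = stats.setdefault(str(chain_id), {"full_match": 0, "partial_match": 0})
        entry[match_type] = count
    return stats


def demo_connection():
    """Build a small in-memory database with made-up rows for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE verified_contracts (chain_id INTEGER, address TEXT, match_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO verified_contracts VALUES (?, ?, ?)",
        [
            (1, "0xaa", "full_match"),
            (1, "0xbb", "partial_match"),
            (1, "0xcc", "full_match"),
            (5, "0xdd", "partial_match"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(count_contracts_per_chain(demo_connection()))
```

Because the database does the aggregation, the script never has to walk the repository directories, which is where the current job spends its time.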

In the end, this should be a separate sourcifyeth/stats-gen repo containing a simple Dockerized script in a language of our choice. We can build and publish the image manually, as I don't expect this to change frequently. The container should run regularly as a GCP scheduled job and update the stats.json and manifest.json files.
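Since the files keep being served while the scheduled job runs, one option is to write each file to a temporary path and then rename it into place, so readers never see a half-written file. The sketch below assumes the files live on a local or mounted filesystem, which the thread does not state; the function name `write_json_atomically` is hypothetical, and the manifest.json layout is not described here, so only a generic JSON write is shown.

```python
import json
import os
import tempfile


def write_json_atomically(path, data):
    """Write data as JSON to path via a temp file in the same directory plus a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem as the target for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp, indent=2, sort_keys=True)
        # os.replace overwrites an existing target in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if writing or renaming failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    write_json_atomically("stats.json", {"1": {"full_match": 2, "partial_match": 1}})
```

If the files end up in an object storage bucket instead, the upload call of the storage client would take the place of the rename step.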

This setup means the stats.json and manifest.json files will only be available under repo.sourcify.dev (e.g. repo.sourcify.dev/stats.json) and not under sourcify.dev/server/repository/stats.json, because we no longer do a full directory static file mount at /repository. IMO that's fine, since technically this is not part of the API. We'll just continue serving the static stats.json and manifest.json from repo.sourcify.dev, and we can revisit this later if we change the repo.

marcocastignoli commented 1 month ago

Done here: https://github.com/sourcifyeth/stats-gen