ethereum / sourcify

Decentralized Solidity contract source code verification service
https://sourcify.dev
MIT License

Share the repos in an alternative way to S3 #1586

Closed kuzdogan closed 6 days ago

kuzdogan commented 3 weeks ago

We back up our repository (files) on S3 buckets and tell people to download the repo from there. However, it's difficult to get the repo via S3 because it's a directory of millions of files, and downloading everything takes a long time.

Even though the main way to retrieve Sourcify's repository will be the DB exports, we should still provide an easy way to get the files. This could be something like a regular zip of the whole repo.

We can create a regular job that creates a zip of the repository and uploads it to a public bucket. Cloudflare R2 would be useful here because there are no download costs. We already use it for the VerA parquet export. Let's create a Sourcify account and upload there. We should name the .zip with a date to show when it was uploaded.
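A minimal sketch of the zipping half of such a job, assuming the repository is a local directory tree. The function name `zip_repository` and the date-only file name are illustrative assumptions, not existing Sourcify code. The upload to the bucket is left out on purpose.

```python
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def zip_repository(repo_dir: Path, out_dir: Path, now: datetime) -> Path:
    """Zip every file under repo_dir into a date-stamped archive in out_dir.

    Hypothetical helper: out_dir is assumed to live outside repo_dir,
    otherwise the archive would try to include itself.
    """
    # Date-only stamp (UTC) keeps the file name portable across filesystems.
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%d")
    out_dir.mkdir(parents=True, exist_ok=True)
    zip_path = out_dir / f"sourcify-repository-{stamp}.zip"

    # zipfile enables ZIP64 by default, which archives above 4 GiB need.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(repo_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the repository root.
                archive.write(path, arcname=path.relative_to(repo_dir).as_posix())
    return zip_path
```

With millions of small files, compression time will likely dominate, so `zipfile.ZIP_STORED` or a streaming tool may be a better fit in practice. This sketch only shows the naming and layout.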

In this case, however, we should have an additional manifest that denotes when the repo was uploaded, because the current manifest.json denotes when the stats.json was created. Should we add a description field to both manifests so that people can understand the difference? The new manifest.json can be next to the .zip file, while the other is next to the contracts/ folder. Similar to the VerA manifest.json, this should include the file(s) and their sizes (see https://github.com/verifier-alliance/parquet-export/issues/4).

Something like:

{
  "description": "Manifest file for when the Sourcify file repository was uploaded",
  "timestamp": 1723737024141,
  "dateStr": "2024-08-15T15:50:24.141998Z",
  "files": [
    {
      "path": "sourcify-repository-2024-08-15T15:50:24.141998Z.zip",
      "sizeInBytes": 26875277986
    }
  ]
}
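A manifest of this shape could be generated right after the zip is created. The following is a rough sketch. The helper names `build_manifest` and `write_manifest` are hypothetical, and the field names simply follow the example above.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def build_manifest(zip_path: Path, uploaded_at: datetime) -> dict:
    """Describe one uploaded archive using the fields proposed in this issue."""
    uploaded_at = uploaded_at.astimezone(timezone.utc)
    return {
        "description": "Manifest file for when the Sourcify file repository was uploaded",
        # Integer division avoids float rounding when converting to milliseconds.
        "timestamp": (uploaded_at - EPOCH) // timedelta(milliseconds=1),
        "dateStr": uploaded_at.isoformat().replace("+00:00", "Z"),
        "files": [
            {"path": zip_path.name, "sizeInBytes": zip_path.stat().st_size},
        ],
    }


def write_manifest(zip_path: Path, uploaded_at: datetime) -> Path:
    """Write manifest.json next to the archive and return its path."""
    manifest_path = zip_path.with_name("manifest.json")
    manifest_path.write_text(json.dumps(build_manifest(zip_path, uploaded_at), indent=2))
    return manifest_path
```

Reading the size from the file on disk keeps `sizeInBytes` consistent with what was actually uploaded, provided the manifest is built from the same file that goes to the bucket.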

Questions

marcocastignoli commented 1 week ago