flashbots / mev-boost-relay

MEV-Boost Relay for Ethereum proposer/builder separation (PBS)
https://boost-relay.flashbots.net
GNU Affero General Public License v3.0

Extend bulk-data homepage to show more entries #424

Open metachris opened 1 year ago

metachris commented 1 year ago

The relay bulk data export is only showing the first 1k files: https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com/index.html

This is because the AWS S3 XML listing returns at most 1,000 entries per request.

Task: Update https://github.com/flashbots/mev-boost-relay/blob/main/static/s3/index.html to either have pagination or simply query as many pages as there are (maybe listing in reverse order?)

xrchz commented 1 year ago

Are they still accessible if you guess the correct URL? As a workaround...

Valdorff commented 1 year ago

This bulk data would be extremely helpful for Rocket Pool if it could be made available.

We wouldn't need to hit it too much - would mostly grab once and then run locally.

metachris commented 1 year ago

Actually, you can query all the items by just using the XML page (instead of the HTML) at https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com and using the marker query argument:

https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html

Marker is where you want Amazon S3 to start listing from. Amazon S3 starts listing after this specified key. Marker can be any key in the bucket.

I.e.: https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com/?marker=data/1_payloads-delivered/monthly/2022-09.json
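
For anyone who wants to script this, here is a minimal Python sketch of the marker-based loop described above. It assumes the bucket keeps serving the standard ListObjects XML (with the usual S3 namespace and an IsTruncated element); the function names `fetch_page` and `list_all_keys` are made up for illustration and are not part of the relay code.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET_URL = "https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com"
# Namespace used by S3 ListObjects XML responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def fetch_page(marker="", prefix=""):
    """Fetch one XML listing page; S3 starts listing after `marker`."""
    params = {k: v for k, v in (("marker", marker), ("prefix", prefix)) if v}
    url = BUCKET_URL + "/"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def list_all_keys(fetch=fetch_page, prefix=""):
    """Collect every key by re-querying with the last key seen as marker."""
    keys, marker = [], ""
    while True:
        root = ET.fromstring(fetch(marker, prefix))
        page = [el.findtext(f"{S3_NS}Key") for el in root.iter(f"{S3_NS}Contents")]
        keys.extend(page)
        truncated = root.findtext(f"{S3_NS}IsTruncated") == "true"
        # Stop when S3 says the listing is complete (or a page comes back empty).
        if not truncated or not page:
            return keys
        marker = page[-1]
```

Calling `list_all_keys(prefix="data/1_payloads-delivered/monthly")` should then return the keys under that prefix, one request per 1,000 entries. The `fetch` argument is only there so the loop can be exercised with canned XML instead of the network.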


We'd welcome a code contribution that either adds pagination to the HTML page or makes it keep querying until there are no more entries.

You can reproduce the issue by:

  1. Create an S3 bucket with 1,100 files (they can be 1 byte each).
  2. Use this HTML to serve an index: https://github.com/flashbots/mev-boost-relay/blob/main/static/s3/index.html (no changes to that file are necessary).
  3. Implement any changes to show more entries.
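
For step 1, a small Python sketch like the following could generate the test files locally before uploading them. The file naming scheme and the `make_test_files` helper are just assumptions for illustration; any 1,100 objects will do.

```python
import tempfile
from pathlib import Path


def make_test_files(target_dir, count=1100):
    """Write `count` one-byte files with zero-padded, sortable names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for i in range(count):
        # One byte of content is enough to exceed the 1,000-entry page limit.
        (target / f"file-{i:04d}.txt").write_bytes(b"x")
    return sorted(p.name for p in target.iterdir())
```

The resulting directory can then be uploaded with whatever tool you prefer, for example `aws s3 sync <dir> s3://<your-test-bucket>/`, where the bucket name is a placeholder.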

metachris commented 1 year ago

The only dataset that's fully up-to-date is the delivered payloads by month: https://flashbots-boost-relay-public.s3.us-east-2.amazonaws.com/?prefix=data/1_payloads-delivered/monthly

Valdorff commented 1 year ago

Ok - thanks for that. Unfortunately we're looking at the builder submission side, so that doesn't help our immediate use case. Is the intent to add more and some bit of the pipeline broke, or were bulk builder submissions removed?

Thanks in any case 🙏