Open Dadoof opened 8 months ago
I would assume that a nice pull request to improve this would be welcome.
Two things to check though:
Hi there,
As for the proper benchmarking, you are correct. For me, anecdotally, it is a good deal quicker. Did this today:
time aws s3 cp --no-sign-request s3://ecmwf-forecasts/20240202/12z/0p25/enfo/20240202120000-0h-enfo-ef.grib2 .
real 0m13.074s
user 0m9.010s
sys 0m10.453s
time wget https://ecmwf-forecasts.s3.eu-central-1.amazonaws.com/20240202/12z/0p25/enfo/20240202120000-0h-enfo-ef.grib2
real 1m2.003s
user 0m2.418s
sys 0m5.887s
This indicates that, for this one simple case, the transfer from the S3 bucket is considerably quicker (about 13 seconds vs. about 62 seconds of wall-clock time).
As for that EC2 instance: I was merely hoping this would be an additional option, not a replacement for any existing capability. If one wanted to use S3 buckets rather than the AWS HTTPS endpoints as the location to get data from, that option would exist.
Regards, Brian E
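A single `time` run like the one above is anecdotal, as noted. For a somewhat more systematic comparison, a small harness along these lines could repeat each transfer several times and report the spread. This is only a sketch: `time_runs` and `summarize` are hypothetical helper names, and `fetch` stands for whatever callable performs one download (for example a wrapper around the HTTPS or the S3 transfer).

```python
import statistics
import time


def time_runs(fetch, repeats=5):
    """Call fetch() `repeats` times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch()  # one complete download, supplied by the caller
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to median, fastest, and slowest run."""
    return {
        "median": statistics.median(durations),
        "min": min(durations),
        "max": max(durations),
    }
```

Comparing medians over several runs, ideally interleaving the two methods, would reduce the influence of caching and transient network conditions on the result.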
To pull files directly from the S3 URI (s3://...), the backend would need to use boto3 instead of requests. I think it would be best to start building this capability in the multiurl dependency, which executes the downloads.
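As a rough illustration of what a boto3-based path might look like, here is a minimal sketch. It is not how multiurl or client.py work today: `https_to_s3` and `download_unsigned` are hypothetical names, and the URL parsing assumes the virtual-hosted style seen in this thread (`<bucket>.s3.<region>.amazonaws.com`). The unsigned configuration is meant to mirror `--no-sign-request` for a public bucket.

```python
from urllib.parse import urlparse


def https_to_s3(url):
    """Split a virtual-hosted-style S3 HTTPS URL into (bucket, key, region).

    Hypothetical helper; region is None for the legacy
    <bucket>.s3.amazonaws.com form.
    """
    parsed = urlparse(url)
    bucket, sep, rest = parsed.netloc.partition(".s3.")
    if not sep or not rest.endswith("amazonaws.com"):
        raise ValueError(f"not a virtual-hosted S3 URL: {url}")
    region = None if rest == "amazonaws.com" else rest.split(".")[0]
    return bucket, parsed.path.lstrip("/"), region


def download_unsigned(url, target):
    """Fetch a public S3 object anonymously and write it to `target`."""
    # Imported lazily so the URL helper works without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    bucket, key, region = https_to_s3(url)
    s3 = boto3.client(
        "s3",
        region_name=region,
        config=Config(signature_version=UNSIGNED),  # no credentials needed
    )
    s3.download_file(bucket, key, target)
```

Keeping the URL translation separate from the transfer would let the existing HTTPS URLs stay the public interface, with the S3 path as an opt-in backend.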
Hello there folks,
Was making use of this opendata, to get that new 0.25 degree data. I noticed something that I would like to investigate.
As it stands now, the tools I see, namely 'client.py', pull data from a URL. For example, something like: wget https://ecmwf-forecasts.s3.eu-central-1.amazonaws.com/20240227/12z/0p25/enfo/20240227120000-0h-enfo-ef.index
I believe that, from an amazon AWS EC2 instance, this would be a faster pull mechanism: aws s3 cp --no-sign-request s3://ecmwf-forecasts/20240227/12z/0p25/enfo/20240227120000-0h-enfo-ef.index .
Those are command-line steps, of course. Inside client.py and such, it would be different tools. My description above was simply to show the difference between a pull via HTTPS and one via AWS S3.
Any chance of adding capability to pull from the S3 bucket (and thus, the AWS 'backbone') into an AWS EC2 instance, rather than HTTP?
Regards, Brian E.