@rushirajnenuji @sfisher Hi Scott and Rushiraj,
Here are the major changes for refactoring batch download to use an S3 bucket instead of local disk:
1. Defined variables for the S3 bucket.
2. Created core S3 bucket functions in `impl/s3.py`, including:
   - upload a file from local disk to S3
   - generate a presigned URL
3. Modified the `proc-download.py` script (the asynchronous queue that manages batch download requests) to upload the report file to S3. Currently it copies the report file from a local path to a public path that can be accessed through a URL.
4. Created a new API endpoint, `/s3_download/filename`, for users to download the report file from S3. This endpoint generates a presigned URL for the report file on S3, then redirects the user to that URL to download the file. I avoided using the `/download` endpoint because it is mapped to a local folder with public access granted through the web configuration.
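
The S3 helpers and the redirect behind `/s3_download/filename` could look roughly like the sketch below. It assumes boto3 as the S3 library (this description does not name one); the function names, the `client` parameter, and the bucket name in the usage note are hypothetical and are not taken from `impl/s3.py`. The one-week expiry mirrors the lifetime stated in the download notification.

```python
from typing import Any, Dict, Optional, Tuple

# One week, matching "The download link will expire in 1 week."
ONE_WEEK_SECONDS = 7 * 24 * 60 * 60


def get_s3_client() -> Any:
    """Create a boto3 S3 client (assumption: boto3 is the S3 library in use)."""
    import boto3  # imported lazily so the helpers can be exercised with a stub client

    return boto3.client("s3")


def upload_file(local_path: str, bucket: str, key: str,
                client: Optional[Any] = None) -> None:
    """Upload a file from local disk to s3://bucket/key."""
    client = client or get_s3_client()
    client.upload_file(local_path, bucket, key)


def presigned_download_url(bucket: str, key: str,
                           expires_in: int = ONE_WEEK_SECONDS,
                           client: Optional[Any] = None) -> str:
    """Return a time-limited GET URL for an object in S3."""
    client = client or get_s3_client()
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


def s3_download_redirect(filename: str, bucket: str,
                         client: Optional[Any] = None) -> Tuple[int, Dict[str, str]]:
    """Framework-neutral stand-in for the endpoint: a 302 pointing at the presigned URL."""
    url = presigned_download_url(bucket, filename, client=client)
    return 302, {"Location": url}
```

For example, `s3_download_redirect("QEr6L4MN3Mvv5eY1.zip", "my-report-bucket")` would return the status and `Location` header for the redirect, where `my-report-bucket` is a placeholder name. In a web framework such as Django, the last step would typically be a redirect response to that URL; the tuple here only stands in for it.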

To test on the UI (with VPN or on the UCOP network):
1. Go to the `MANAGE IDs` tab and click the `DOWNLOAD ALL` button. EZID will display the file download notification on the screen.
2. Start the `proc-download` job on the ezid-dev server if it is not started.
3. Replace `https` with `http` when running the test on the ezid-dev server, for example: https://ezid-dev.cdlib.org/s3_download/QEr6L4MN3Mvv5eY1.zip => http://ezid-dev.cdlib.org/s3_download/QEr6L4MN3Mvv5eY1.zip

File download notification:
> Your download request is being processed and will be available in a few minutes.
> When it is ready to download, an email with the download link will been sent to the email address affiliated with your EZID account: ezid@ucop.edu.
> After you receive the email, you may also download a .csv file of the requested identifiers using the link below:
> https://ezid-dev.cdlib.org/s3_download/QEr6L4MN3Mvv5eY1.zip
> The download link will expire in 1 week.

To test using the client `batch-download` tool (with VPN or on the UCOP network):
1. Modify the `batch-download.sh` script:
   a. Replace ezid-prd with ezid-dev: change line 6 from url="https://ezid.cdlib.org/download_request" to url="http://ezid-dev.cdlib.org/download_request".
   b. Replace "https" with "http" in the $url parameter before calling the `curl -f -O -s $url` command.
2. Start the `proc-download` job on the ezid-dev server if it is not started.
3. Example download link: http://ezid-dev.cdlib.org/s3_download/iQyba5CT17K6EkLk.csv.gz
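
Both test procedures rely on swapping `https` for `http` in the download link on ezid-dev. As a small illustration, that substitution can be done with Python's standard `urllib.parse`; the helper name `to_http` is hypothetical and is not part of the batch-download tool.

```python
from urllib.parse import urlsplit, urlunsplit


def to_http(url: str) -> str:
    """Return the same URL with an https scheme downgraded to http.

    Only the scheme changes; host, path, and query string are preserved.
    URLs that are not https are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


if __name__ == "__main__":
    # Link taken from the example in the test steps.
    print(to_http("https://ezid-dev.cdlib.org/s3_download/QEr6L4MN3Mvv5eY1.zip"))
    # prints http://ezid-dev.cdlib.org/s3_download/QEr6L4MN3Mvv5eY1.zip
```

Rewriting only the parsed scheme, rather than doing a plain string replace, avoids touching any "https" that might appear later in the path or query string.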
Please let me know if you have questions.
Thank you
Jing