CDLUC3 / ezid

MIT License

Refactor batch download to avoid using local disk #521

Closed jsjiang closed 7 months ago

jsjiang commented 10 months ago

Refactor batch download to avoid using local disk. Possible solutions:

Tasks

jsjiang commented 10 months ago

Related to ticket #99

jsjiang commented 8 months ago

Current workflow:

jsjiang commented 8 months ago

Solution option 1, "store file on S3 and generate a URL for download", requires minimal code changes and is a good candidate for the first round of refactoring because it leaves the queue services unchanged.
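As a rough illustration of option 1, the sketch below builds the compressed CSV in memory and hands it to S3 without writing to local disk. It is not the EZID implementation: `build_csv_gz`, `upload_batch_download`, the bucket name, and the `download/` key prefix are hypothetical, and `s3_client` is assumed to be a boto3 S3 client (for example `boto3.client("s3")`) passed in by the caller. For very large batches a streaming or multipart upload would probably be needed instead of a single in-memory buffer.

```python
import csv
import gzip
import io


def build_csv_gz(rows):
    """Return gzip-compressed CSV bytes built entirely in memory."""
    text = io.StringIO()
    csv.writer(text).writerows(rows)
    raw = io.BytesIO()
    with gzip.GzipFile(fileobj=raw, mode="wb") as gz:
        gz.write(text.getvalue().encode("utf-8"))
    return raw.getvalue()


def upload_batch_download(s3_client, bucket, filename, rows):
    """Upload the compressed CSV to S3 and return the object key.

    s3_client is assumed to behave like a boto3 S3 client.
    """
    key = "download/" + filename  # hypothetical key layout
    s3_client.upload_fileobj(io.BytesIO(build_csv_gz(rows)), bucket, key)
    return key
```

Passing the client in as an argument keeps the function free of AWS configuration and makes it easy to exercise with a stub in tests.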

jsjiang commented 8 months ago

subtasks:

jsjiang commented 8 months ago

Investigate whether the "Download report in CSV format" link on the Dashboard page requires refactoring.

jsjiang commented 8 months ago

The "Download report in CSV format" link calls the "impl.ui_admin.csvStats" function, which uses "io.StringIO" and "django.http.HttpResponse". This workflow does not save a file to local disk, so it does not need to be refactored for the S3 implementation.

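# Excerpt from impl.ui_admin.csvStats: the CSV is written to an in-memory buffer
f = io.StringIO()
w = csv.writer(f)

return impl.ui_common.csvResponse(f.getvalue(), fn)

# impl.ui_common.csvResponse: returns the CSV text as a file attachment
def csvResponse(message, filename):
    r = django.http.HttpResponse(message, content_type="text/csv")
    r["Content-Disposition"] = 'attachment; filename="' + filename + '.csv"'
    return r

jsjiang commented 8 months ago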

Batch download API:

Batch download UI function:

jsjiang commented 8 months ago

Email notification after request has been submitted:

Thank you for using EZID to easily create and manage your identifiers. The batch download you requested is available at:

https://ezid-stg.cdlib.org/download/Cn8XypqbcfvFRHdr.csv.gz

The download will be deleted in 1 week.

Best,
EZID Team

This is an automated email. Please do not reply.

jsjiang commented 8 months ago

The downloadable file, for example https://ezid-stg.cdlib.org/download/Cn8XypqbcfvFRHdr.csv.gz, is served by the web server, which gives public access to it.

One of the S3 solution options is to create a new API that resolves this link to a presigned URL: https://ezid-stg.cdlib.org/download/Cn8XypqbcfvFRHdr.csv.gz => generate a presigned URL for the file (Cn8XypqbcfvFRHdr.csv.gz) saved in S3 => return the URL so the user can download the file
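The resolution step could look roughly like the sketch below. It is an assumption-laden illustration rather than the actual EZID API: `resolve_download_url`, the filename pattern, the bucket name, and the `download/` key prefix are hypothetical, and `s3_client` is assumed to be a boto3 S3 client whose `generate_presigned_url` method is called with the `get_object` operation.

```python
import re

# Accepts names shaped like "Cn8XypqbcfvFRHdr.csv.gz"; the exact pattern
# is an assumption, chosen to reject path separators and "..".
_DOWNLOAD_NAME = re.compile(r"[A-Za-z0-9]+(\.[A-Za-z0-9]+)+")


def resolve_download_url(s3_client, bucket, filename, expires_in=3600):
    """Map a /download/<filename> request to a presigned S3 URL.

    s3_client is assumed to behave like a boto3 S3 client.
    """
    if not _DOWNLOAD_NAME.fullmatch(filename):
        raise ValueError("invalid download filename: %r" % filename)
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": "download/" + filename},  # hypothetical key layout
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )
```

A view behind the existing /download/ path could call this function and either return the URL or redirect to it, so the links already sent by email keep working. Because the presigned URL is generated on each request, it can be short-lived even though the file itself is kept for a week.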

jsjiang commented 7 months ago