bcgov / cas-ggircs

Climate Action Secretariat's Greenhouse Gas Industrial Reporting and Control System
Apache License 2.0

As a Compliance Officer, I want to directly access the documents listed in the file viewer so that I can look at the contents of the files #288

Closed matthieu-foucault closed 3 years ago

matthieu-foucault commented 3 years ago

Description

Part of the compliance process involves opening the various attachments submitted with a SWRS report. Given the number of reports and attachments, and the fact that they are spread across all archives, being able to download them in a timely manner is essential.

This user story consists of a Metabase question (with a custom column using the concat function) that uses the attachment's and report's information to generate a link to the ggircs application. This will allow the user to open documents directly from the file comparison viewer.

~The file name in the URL should be URL-encoded (spaces replaced with %20, along with any other special characters present in the attachment file names)~ Modern browsers don't seem to mind spaces and slashes.
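For reference, the encoding described in the struck-through sentence can be sketched in Python with the standard library's `urllib.parse.quote`. This is only an illustration: the unencoded path below is an assumption reconstructed from the encoded example in the scenarios, `encode_file_name` is a hypothetical helper, and the real question is built in Metabase rather than Python.

```python
from urllib.parse import quote


def encode_file_name(file_path: str) -> str:
    """Percent-encode a file path for use as a query-string value.

    safe="" means "/" is encoded too (as %2F); spaces become %20.
    """
    return quote(file_path, safe="")


# Hypothetical unencoded path, inferred from the scenario's encoded example.
raw_path = "Output_Prod/Report_1234_2020_SourceTypeId_42_some file.pdf"
print(encode_file_name(raw_path))
```

Since browsers appear to tolerate unencoded spaces and slashes, this step may be unnecessary in practice.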

The ggircs application already supports direct download of files through its API, so no work is needed on that end.

Scenarios

The swrs_history.report_attachment table has the following columns:

For each record in the swrs_transform.historical_report_attachment_data materialized view, the swrs_transform.load_report_attachment() function needs to find the corresponding record in the swrs_extract.eccc_attachment table, matching on the swrs_extract.eccc_attachment.swrs_report_id, swrs_extract.eccc_attachment.source_type_id and swrs_extract.eccc_attachment.uploaded_file_name columns.
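The matching rule above can be sketched in Python as a lookup on a three-part key. This is not the actual implementation (that lives in a database function); it assumes both sources expose fields named swrs_report_id, source_type_id and uploaded_file_name, which the issue only confirms for swrs_extract.eccc_attachment. The helper names `match_key` and `match_attachments` are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Key = Tuple[int, int, str]


def match_key(record: dict) -> Key:
    # The three columns named in the issue as the matching criteria.
    return (
        record["swrs_report_id"],
        record["source_type_id"],
        record["uploaded_file_name"],
    )


def match_attachments(
    historical: Iterable[dict], eccc_attachments: Iterable[dict]
) -> List[Tuple[dict, Optional[dict]]]:
    """Pair each historical record with its eccc_attachment record, if any."""
    # Index the extracted attachments once so each lookup is O(1).
    index: Dict[Key, dict] = {match_key(a): a for a in eccc_attachments}
    return [(h, index.get(match_key(h))) for h in historical]
```

Records with no counterpart are paired with `None`, which makes unmatched attachments easy to spot.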

Given that I am logged in to Metabase, When viewing a question named attachments with download links, Then I can see a column named download_link, with a link to the ggircs application.

Given an attachment with a file path of Output_Prod%2FReport_1234_2020_SourceTypeId_42_some%20file.pdf and a zip_file_name of GHGBC_PROD_20200101.zip, When viewing the download_link column in the attachments with download links question, Then the link should be https://cas-ggircs.apps.silver.devops.gov.bc.ca/api/eccc/files/GHGBC_PROD_20200101.zip/download?filename=Output_Prod%2FReport_1234_2020_SourceTypeId_42_some%20file.pdf
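The concatenation this scenario expects can be sketched in Python as follows. The URL layout is taken from the expected link in the scenario; the function name `build_download_link` and the `BASE_URL` constant are hypothetical, and the real column would be a Metabase concat expression rather than Python.

```python
# Base path of the file API, as it appears in the scenario's expected link.
BASE_URL = "https://cas-ggircs.apps.silver.devops.gov.bc.ca/api/eccc/files"


def build_download_link(zip_file_name: str, file_path: str) -> str:
    """Concatenate the zip file name and the attachment path into a link.

    The path is used as-is, mirroring a plain concat of column values.
    """
    return f"{BASE_URL}/{zip_file_name}/download?filename={file_path}"


link = build_download_link(
    "GHGBC_PROD_20200101.zip",
    "Output_Prod%2FReport_1234_2020_SourceTypeId_42_some%20file.pdf",
)
print(link)
```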

matthieu-foucault commented 3 years ago

This issue's description needs to be updated. We discovered that older attachments do not follow the same naming convention, so we'll have to find the attachment's full path in the ETL process and record it in the db

LindsayMacfarlane commented 3 years ago

Thank you for the demo @matthieu-foucault!