Closed andrewhercules closed 3 years ago
explore idea about generating schema and dataset description within the step
Epic closed as the initial implementation of the data downloads page is available. Further work (e.g. exposing new datasets, including links to BigQuery and GraphQL) will be captured in subsequent tickets.
Currently, the classic (Angular) version of the Platform allows users to download the evidence, association, target list, disease list, safety, baseline expression, and tractability data files.
The ETL pipelines for the new (React) version of the Platform allow us to increase the number of files that we make available for download. These pipelines generate the following files:
To optimise and streamline the maintenance of the data downloads page (https://beta.targetvalidation.org/downloads), the back-end team will produce a static JSON file with a list of datasets available for download. This list will include:
An example of the proposed JSON is below:
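As a rough illustration of how a static list of this kind could be consumed, the Python sketch below parses a small manifest and builds gsutil commands for listing and downloading each dataset. The field names (`id`, `format`, `path`), the formats, the `example-release` label, and the use of `open-targets-prod` as a bucket name are all assumptions made for the sketch, not the schema or layout agreed by the team.

```python
import json

# Hypothetical manifest: field names, formats and the release label are
# assumptions for illustration only, not the team's agreed schema.
MANIFEST_JSON = """
[
  {"id": "targets", "format": "json", "path": "releases/example-release/output/targets"},
  {"id": "diseases", "format": "parquet", "path": "releases/example-release/output/diseases"}
]
"""


def gsutil_commands(bucket: str, dataset: dict) -> dict:
    """Build the gsutil commands to list and download one dataset directory."""
    # Normalise the path so the URL has exactly one slash on each side.
    url = f"gs://{bucket}/{dataset['path'].strip('/')}/"
    return {
        "list": f"gsutil ls {url}",
        # -m runs the transfer in parallel; cp -r copies the directory recursively.
        "download": f"gsutil -m cp -r {url} .",
    }


datasets = json.loads(MANIFEST_JSON)
for dataset in datasets:
    # Bucket name assumed here; a QA manifest would pass a different bucket.
    commands = gsutil_commands("open-targets-prod", dataset)
    print(dataset["id"], dataset["format"])
    print("  " + commands["list"])
    print("  " + commands["download"])
```

Under the same assumptions, passing `open-targets-eu-dev` as the bucket would produce the equivalent commands for a QA copy of the list.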
The front-end will access the file either through Google Cloud or the GraphQL API, retrieve the list of files available for download, and display the list on the /downloads page. As part of this process, the front-end will also be responsible for:
- gsutil command to access the data (listing files and downloading)
After discussions amongst the team, a few points were noted for further discussion:
- gsutil script to list files in directory and download files
- open-targets-prod releases directory whereas static file for QA points to files in open-targets-eu-dev releases directory
To do: