gigascience / gigadb-website

Source code for running GigaDB
http://gigadb.org
GNU General Public License v3.0

Frontend Work for the Redevelopment of File Upload Wizard #1601

Open rija opened 10 months ago

rija commented 10 months ago

Redevelopment of File Upload Wizard

Rationale

The File Upload Wizard, shortened to FUW from here on, is an Internet-based software system to:

Key subsystems

The system has 6 pillars (or tent poles) that are integrated together:

  1. The Uppy file uploader component (https://uppy.io), to batch upload files and their associated metadata (generic and custom)
  2. The open protocol for resumable file uploads (https://tus.io) and its reference server (https://github.com/tus/tusd). Uploading many files or very large files can take a long time and failures can happen, so uploads need to be resumable and to support post-processing
  3. The Linux file system and the tooling to work with it, such as Flysystem (https://flysystem.thephpleague.com/v1/docs/), pure-ftpd (https://www.pureftpd.org), and Linux inotify (https://en.wikipedia.org/wiki/Inotify) through Watchdog (https://pypi.org/project/watchdog/), because various actors need multi-modal access to the storage (command line, FTP, application)
  4. A message queue subsystem, Beanstalkd (https://beanstalkd.github.io), so that commands triggered from the UI can be sent to backend workers that execute jobs asynchronously (batch uploading and processing many files can take a very long time)
  5. A list of stages in the dataset publishing workflow, defined by Gigascience curators and editors, that determine a dataset's state
  6. The existing GigaDB website, which provides the dashboard for the various actors involved to orchestrate their part of the workflow, and which provides user account management for authors and curators, as well as state management and a viewing gallery for datasets
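To make the resumability requirement in item 2 more concrete, here is a minimal sketch of the offset handshake at the core of the tus protocol: the client asks the server how many bytes it already holds, then continues sending from that offset. This is not FUW code. `InMemoryTusServer`, `upload`, and the `fail_after_chunks` parameter are hypothetical names, the "server" is an in-memory stand-in for tusd, and no HTTP is involved.

```python
class InMemoryTusServer:
    """Toy stand-in for a tus server such as tusd; uploads live in memory."""

    def __init__(self):
        self._uploads = {}  # upload id -> {"length": int, "data": bytearray}

    def create(self, upload_id, length):
        # Mirrors the tus creation request, which declares the total Upload-Length.
        self._uploads[upload_id] = {"length": length, "data": bytearray()}

    def head(self, upload_id):
        # Mirrors a tus HEAD request: report the current Upload-Offset.
        return len(self._uploads[upload_id]["data"])

    def patch(self, upload_id, offset, chunk):
        # Mirrors a tus PATCH request: the client's offset must match the
        # server's, otherwise the chunk is rejected (409 Conflict in tus).
        upload_state = self._uploads[upload_id]
        if offset != len(upload_state["data"]):
            raise ValueError("offset mismatch")
        if offset + len(chunk) > upload_state["length"]:
            raise ValueError("chunk exceeds declared upload length")
        upload_state["data"].extend(chunk)
        return len(upload_state["data"])

    def is_complete(self, upload_id):
        upload_state = self._uploads[upload_id]
        return len(upload_state["data"]) == upload_state["length"]

    def content(self, upload_id):
        return bytes(self._uploads[upload_id]["data"])


def upload(server, upload_id, payload, chunk_size, fail_after_chunks=None):
    """Send payload in chunks, starting from the offset the server reports.

    fail_after_chunks is a hypothetical knob that simulates a dropped
    connection after that many chunks have been sent.
    """
    offset = server.head(upload_id)
    chunks_sent = 0
    while offset < len(payload):
        if fail_after_chunks is not None and chunks_sent >= fail_after_chunks:
            raise ConnectionError("simulated network failure")
        chunk = payload[offset:offset + chunk_size]
        offset = server.patch(upload_id, offset, chunk)
        chunks_sent += 1
    return offset


server = InMemoryTusServer()
payload = b"0123456789" * 3  # 30 bytes of example data
server.create("demo", len(payload))

try:
    # First attempt breaks after two 8-byte chunks.
    upload(server, "demo", payload, chunk_size=8, fail_after_chunks=2)
except ConnectionError:
    pass

resumed_from = server.head("demo")  # bytes the server already holds
upload(server, "demo", payload, chunk_size=8)  # second attempt resumes
```

After the simulated failure, the second call does not resend the first 16 bytes; it picks up from the offset reported by the server. In the real system this exchange would be handled by Uppy on the client and tusd on the server.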

How it works

The UI for the workflow is centered on a Vue.js 2 application that is embedded within the GigaDB website, that communicates with a REST API (deployed as two standalone Yii 2 applications called fuw-backend and fuw-public) and that sends jobs to a message queue. Uppy is a third-party component that fulfills the main functionality, resumable file upload, in tandem with the backend server Tusd (both Uppy and Tusd are from the same company). Additional third-party components come from the Element-UI component library and are used for some UI interactions.
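The asynchronous hand-off described above can be sketched as follows. This is an illustration under stated assumptions, not the FUW implementation: it assumes the API layer is what places jobs on the queue, `InMemoryQueue` is a toy stand-in that borrows Beanstalkd's put/reserve/delete vocabulary, and `api_handle_request`, `worker_run`, the `move_files` action, and the dataset identifier are all hypothetical.

```python
import json
from collections import deque


class InMemoryQueue:
    """Toy stand-in for a Beanstalkd tube with put, reserve, and delete."""

    def __init__(self):
        self._ready = deque()   # jobs waiting for a worker
        self._reserved = {}     # jobs a worker has taken but not yet deleted
        self._next_id = 1

    def put(self, body):
        job_id = self._next_id
        self._next_id += 1
        self._ready.append((job_id, body))
        return job_id

    def reserve(self):
        # Return the next ready job, or None when the queue is empty.
        if not self._ready:
            return None
        job_id, body = self._ready.popleft()
        self._reserved[job_id] = body
        return job_id, body

    def delete(self, job_id):
        # A worker deletes a job once it has been processed successfully.
        self._reserved.pop(job_id, None)


def api_handle_request(queue, dataset_id, action):
    """Hypothetical API handler: enqueue the job and answer immediately."""
    job_id = queue.put(json.dumps({"dataset": dataset_id, "action": action}))
    return {"status": "accepted", "job_id": job_id}


def worker_run(queue, handlers):
    """Hypothetical worker loop: drain the queue and dispatch by action name."""
    results = []
    while True:
        job = queue.reserve()
        if job is None:
            break
        job_id, body = job
        message = json.loads(body)
        results.append(handlers[message["action"]](message["dataset"]))
        queue.delete(job_id)
    return results


queue = InMemoryQueue()

# The UI triggers a command; the API returns before any work is done.
response = api_handle_request(queue, "000007", "move_files")

# Later, a backend worker picks the job up and executes it.
handlers = {"move_files": lambda dataset: f"moved files for dataset {dataset}"}
results = worker_run(queue, handlers)
```

The point of the design is that the request returns as soon as the job is queued, so a long-running batch operation never blocks the user interface.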

The current (and initial) implementation of the system is illustrated in this user-centric video and it implements the stages described in this user workflow diagram.

The code and infrastructure for the system are already in the gigadb-website repo but were disabled two years ago.

The architecture of the system is described in this architecture diagram.

Work to do

By the time this project starts (preferably on 2nd January 2024), the FUW system will have been re-enabled in the same state as when it was disabled, as shown in the video above, and behind a proper feature flag.

The work the tech team needs to perform, driven by curators' feedback and by observations from the tech team during the current work to re-enable the system, is listed as:

The frontend effort to accomplish the above work is listed as:

luistoptal commented 8 months ago

I created an issue related to accessibility and styles for this feature: #1695