elixir-cloud-aai / cloud-components

Reusable components for the ELIXIR Cloud
https://elixir-cloud-components.vercel.app
Apache License 2.0

feat: add file input selection and deletion #173

Open JaeAeich opened 10 months ago

JaeAeich commented 10 months ago

Description

For the attachments, it would be nice (and I think also important) to show the names of the files that have been uploaded, especially since one of them may need to be selected as the "Workflow URL". I also think it is a better user experience if the user can delete individual files, and if a new selection of files does not clear the old selection. For example, I might select a primary CWL descriptor file and two secondary descriptor files, and accidentally include a wrong file. I should then be able to remove the wrong file and select only the single missing descriptor file, without having to select the two others again.

As an extension of the above, it would be great if, once one or more files have been selected for upload and are listed by name with a button to remove each, there were also a checkbox that optionally allows selecting exactly one file as the Workflow URL. If a file is selected this way, the "Workflow URL" field should be auto-populated with the name of the file (and ideally hidden or grayed out until the file is unselected). As an alternative (possibly even better), we could have an upload file button next to the "Workflow URL" field (maybe with an "OR" in between) that can only be used to select the primary descriptor file. If it is used, the Workflow URL is auto-populated. Additional files can then still be attached through the button/field at the bottom.

uniqueg commented 10 months ago

Thanks @JaeAeich!

Regarding the file attachments, I think the following clarification from the WES specification is important:

The workflow_attachment array may be used to upload files that are required to execute the workflow, including the primary workflow, tools imported by the workflow, other files referenced by the workflow, or files which are part of the input. The implementation should stage these files to a temporary directory and execute the workflow from there. These parts must have a Content-Disposition header with a "filename" provided for each part. Filenames may include subdirectories, but must not include references to parent directories with '..' -- implementations should guard against maliciously constructed filenames.

The ability to specify file paths will allow reconstructing a directory tree for the uploaded files. This is crucial, because workflow directories are generally not flat. In fact, best practices for most workflow languages prescribe complex nested workflow directory structures, e.g., Snakemake.
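To make the filename requirement concrete, here is a minimal sketch of how a client could validate paths and attach them as multipart filenames. `isSafeAttachmentPath` and `buildAttachmentForm` are hypothetical names, not part of the existing components. Rejecting absolute paths, empty segments and `.` segments is my own assumption on top of the `..` rule quoted from the specification.

```typescript
// Hypothetical helper: checks a relative path before it is used as the
// "filename" of a workflow_attachment part.
function isSafeAttachmentPath(path: string): boolean {
  // Assumption: absolute paths are rejected as well as empty ones.
  if (path.length === 0 || path.startsWith("/")) return false;
  // Reject empty segments and references to the current or parent directory.
  return path
    .split("/")
    .every((segment) => segment !== "" && segment !== "." && segment !== "..");
}

// Hypothetical helper: builds the multipart body, one part per file.
function buildAttachmentForm(files: { path: string; blob: Blob }[]): FormData {
  const form = new FormData();
  for (const { path, blob } of files) {
    if (!isSafeAttachmentPath(path)) {
      throw new Error(`Unsafe attachment path: ${path}`);
    }
    // The third argument is sent as the part's Content-Disposition filename.
    form.append("workflow_attachment", blob, path);
  }
  return form;
}
```

With this, a path like `rules/align.smk` is accepted, while `../secrets.txt` is refused before anything is sent. The server still has to guard against malicious filenames on its own.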

This means that we would have to find a way to:

We should put some thought into designing this in a way that is not too painful and error-prone for the user.

As a user-friendly alternative to setting file paths (and selecting multiple files) manually, we could allow users to upload entire directories of files. We would then parse these to automatically create the file paths for the Content-Disposition headers, according to the directories' subdirectory structure.
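A rough sketch of the path derivation, assuming the directory picker is a file input with the `webkitdirectory` attribute, in which case browsers set `webkitRelativePath` on each selected `File`. The helper name `guessAttachmentPath` and the `stripRoot` option are hypothetical. Whether the selected directory's own name should be dropped is a design decision we would still have to make.

```typescript
// Minimal structural type so the sketch also runs outside a browser.
// A real File object from an <input type="file"> satisfies it.
interface SelectedFile {
  name: string;
  webkitRelativePath: string; // "" when the file was picked individually
}

// Hypothetical helper: best-guess attachment path for one selected file.
function guessAttachmentPath(file: SelectedFile, stripRoot = true): string {
  const relative = file.webkitRelativePath;
  // Picked directly/manually: no subdirectory information is available.
  if (relative === "") return file.name;
  const segments = relative.split("/");
  // Assumption: drop the selected directory's own name and keep
  // everything below it, so the workflow root maps to the staging root.
  return stripRoot && segments.length > 1
    ? segments.slice(1).join("/")
    : relative;
}
```

For a selected directory `wf` containing `rules/align.smk`, this yields `rules/align.smk` by default, or `wf/rules/align.smk` with `stripRoot` set to `false`.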

All files (whether selected manually or as part of a directory or its subdirectories) could then be used to populate a file table, which could be further amended by going through the file and directory selection process multiple times (duplicate entries should be filtered out automatically, and a maximum number of files should be enforced as well).
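The additive behaviour could look roughly like the sketch below. All names (`FileRow`, `mergeSelection`, `removeEntry`) and the default limit of 100 files are placeholders. Treating two entries as duplicates when their paths match is an assumption; comparing contents would be another option.

```typescript
interface FileRow {
  path: string; // path used for the Content-Disposition filename
  size: number; // size in bytes, for display only
}

interface MergeResult {
  rows: FileRow[];
  duplicates: string[]; // paths skipped because they were already listed
  overLimit: string[]; // paths skipped because the table was full
}

// Adds a new selection to the table without clearing the existing rows.
function mergeSelection(
  table: FileRow[],
  incoming: FileRow[],
  maxFiles = 100, // placeholder limit
): MergeResult {
  const rows = [...table];
  const seen = new Set(rows.map((row) => row.path));
  const duplicates: string[] = [];
  const overLimit: string[] = [];
  for (const row of incoming) {
    if (seen.has(row.path)) {
      duplicates.push(row.path);
    } else if (rows.length >= maxFiles) {
      overLimit.push(row.path);
    } else {
      seen.add(row.path);
      rows.push(row);
    }
  }
  return { rows, duplicates, overLimit };
}

// Removes a single file, or every file below a subdirectory, in one go.
function removeEntry(table: FileRow[], target: string): FileRow[] {
  const prefix = target.endsWith("/") ? target : target + "/";
  return table.filter(
    (row) => row.path !== target && !row.path.startsWith(prefix),
  );
}
```

Returning the skipped paths lets the UI tell the user which files were ignored instead of dropping them silently.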

The user could manipulate the file table to remove individual files (and possibly individual subdirectories in one go, if sorted accordingly?) and to optionally select at most one primary descriptor file via a checkbox. Checking one of the checkboxes should auto-populate and hide the "Workflow URL" field until unchecked (and, of course, checking a checkbox for a different file should uncheck the previously checked box, leave the "Workflow URL" field hidden and change its hidden value). The table could also include a column that is collapsed by default but could be expanded to manually edit the file path of each file. This column could be auto-populated on a best-guess basis, i.e., include any subdirectories if parsed from a selected directory, or not include any subdirectories if selected directly/manually.
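The checkbox logic boils down to a small state transition, sketched below. `FormState` and `togglePrimary` are hypothetical names, and clearing the field when the box is unchecked is an assumption (we could also restore whatever the user had typed before).

```typescript
interface FormState {
  primaryPath: string | null; // path of the checked file, if any
  workflowUrl: string; // value submitted as the Workflow URL
  workflowUrlHidden: boolean; // whether the field is hidden in the form
}

// Handles a click on the "primary descriptor" checkbox of one row.
function togglePrimary(state: FormState, path: string): FormState {
  if (state.primaryPath === path) {
    // Unchecking the current primary file: show the field again.
    // Assumption: the auto-populated value is cleared.
    return { primaryPath: null, workflowUrl: "", workflowUrlHidden: false };
  }
  // Checking a file (possibly replacing another one): the field stays
  // hidden and its hidden value follows the newly checked file.
  return { primaryPath: path, workflowUrl: path, workflowUrlHidden: true };
}
```

Because only one `primaryPath` is stored, at most one box can be checked at a time. Removing the primary file from the table would have to reset the state in the same way as unchecking it.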

I think it would then make sense to put this file selection at the very top of the form to signal to the user the importance of selecting a workflow and all required files first, before doing anything else - especially because the user's choices on the file upload determine whether a "Workflow URL" needs to be provided or not.

As far as I can see, this design would cover the most common use cases:

We could even extend the file table by adding another set of columns of checkboxes for auto-populating "Workflow parameters" and "Workflow engine parameters" (grayed out for any files that aren't .yaml or .json files). Again, only at most one file could be selected for auto-populating the contents. However, in this case, I would probably leave the fields editable, so that users could make use of template files.
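For graying out the parameter checkboxes, a simple extension check would probably do, as in the sketch below. `canPopulateParams` is a hypothetical name, and also accepting `.yml` is my assumption beyond the `.yaml` and `.json` mentioned above.

```typescript
// Returns true if a file may be offered for auto-populating the
// "Workflow parameters" or "Workflow engine parameters" fields.
function canPopulateParams(path: string): boolean {
  // Assumption: ".yml" is accepted alongside ".yaml" and ".json",
  // and the check is case-insensitive.
  return /\.(ya?ml|json)$/i.test(path);
}

// Example: decide which rows get an enabled checkbox.
const exampleRows = ["Snakefile", "config/params.yaml", "engine.json"];
const enabledRows = exampleRows.filter(canPopulateParams);
```

The check only looks at the file name, so the contents would still need to be parsed and validated before they are copied into the (editable) fields.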