DiSSCo / SDR

Specimen Data Refinery
Apache License 2.0

Descriptive tasks #10

Closed benscott closed 1 year ago

benscott commented 3 years ago

Each pipeline component needs to be self-descriptive and self-validating.

Processing will be the same across all components and we do not want to reinvent the wheel. Write a wrapper where the component returns just the input/output schema, and the API and validation are handled by the wrapper.
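A minimal sketch of what such a wrapper could look like, assuming a deliberately simple schema format (field name mapped to a Python type). The names `Component`, `validate`, `describe`, `call` and the `label_length` example are hypothetical and only illustrate the idea; a real implementation would more likely use an established schema language.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Hypothetical schema format: field name -> expected Python type.
Schema = Dict[str, type]


@dataclass
class Component:
    """A pipeline component that only declares its schemas and its logic."""
    name: str
    input_schema: Schema
    output_schema: Schema
    run: Callable[[Dict[str, Any]], Dict[str, Any]]


def validate(data: Dict[str, Any], schema: Schema, label: str) -> None:
    """Raise ValueError if a field is missing or has the wrong type."""
    for field, expected in schema.items():
        if field not in data:
            raise ValueError(f"{label}: missing field '{field}'")
        if not isinstance(data[field], expected):
            raise ValueError(
                f"{label}: field '{field}' should be {expected.__name__}, "
                f"got {type(data[field]).__name__}"
            )


def describe(component: Component) -> Dict[str, Any]:
    """Self-description generated from the declared schemas."""
    return {
        "name": component.name,
        "inputs": {k: t.__name__ for k, t in component.input_schema.items()},
        "outputs": {k: t.__name__ for k, t in component.output_schema.items()},
    }


def call(component: Component, data: Dict[str, Any]) -> Dict[str, Any]:
    """Shared wrapper: validate input, run the component, validate output."""
    validate(data, component.input_schema, f"{component.name} input")
    result = component.run(data)
    validate(result, component.output_schema, f"{component.name} output")
    return result


# Example component (hypothetical): counts the characters in a label text.
label_length = Component(
    name="label_length",
    input_schema={"label_text": str},
    output_schema={"length": int},
    run=lambda d: {"length": len(d["label_text"])},
)
```

With this shape, each component only supplies its two schemas and a `run` function, while `describe` and `call` are written once and shared, which is the "do not reinvent the wheel" point above.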

Questions to be discussed with @PaulBrack

Need to consider these while developing #8

yvanlebras commented 3 years ago

Hi Ben,

Here are some attempts at giving some content; I hope I am not totally out of scope ;)

To see existing "dummy" Galaxy tools for processing images, you can have a look at this dedicated imaging instance: https://imaging.usegalaxy.eu/

Galaxy has preprocessing wrappers around components, if you are thinking about the ability to use tools/scripts to pre-process data in the workflow.

Galaxy input/output requirements and validation per component are related to 1/ data types (the data types used by the tools/workflow have to be specified as Galaxy datatypes) and 2/ stdout/stderr: if there is an error, so that stderr is populated, the workflow stops.
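To illustrate point 2/, here is a rough sketch of the "stderr populated means stop" convention using plain Python subprocesses. This is not Galaxy's actual implementation, and the function names `run_step` and `run_workflow` are hypothetical; it only shows the behaviour described above, where a step that writes to stderr halts the remaining steps.

```python
import subprocess
from typing import List


def run_step(cmd: List[str]) -> str:
    """Run one step and treat any stderr output as a failure."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.stderr.strip():
        raise RuntimeError(f"step {cmd!r} wrote to stderr: {proc.stderr.strip()}")
    return proc.stdout


def run_workflow(steps: List[List[str]]) -> List[str]:
    """Run steps in order; the first failing step stops the workflow."""
    outputs = []
    for cmd in steps:
        outputs.append(run_step(cmd))  # an exception here skips later steps
    return outputs
```

Treating every stderr message as fatal is strict, since some programs log warnings there, so a real tool may need a more specific failure check.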

PaulBrack commented 2 years ago

Removed POC milestone as this will require further work past the milestone

llivermore commented 2 years ago

Considered a "could have" feature in the latest review of the MVP - review in July

llivermore commented 1 year ago

@stain and @OliverWoolland I think we should discuss some of the broader issues around handling FDOs in Galaxy and how we could handle validation between tools. Not sure there is a simple answer but certainly not something we can address before the end of SYNTHESYS+.

stain commented 1 year ago

We've agreed that FDO tooling for Galaxy will not be planned for this project, as it is still too unclear from DiSSCo where/how openDS FDOs should be stored/retrieved. Using FDOs as a data layer should also be integrated further on the Galaxy side rather than in each of the SDR tools (e.g. caching).

SDR tools now have brief descriptions in their Galaxy tool registration. Technical documentation #111 will cover more details of how components shall be used.

Closing for now.

OliverWoolland commented 1 year ago

This functionality is not planned during the current development cycle.

We have identified some challenges with implementing incremental FAIR Digital Objects within workflows.

To fully achieve this aim, it is likely that changes would need to be made to Galaxy itself, to allow FDOs and profiles to be specified in a tool's description so that input and output connections are meaningful and can be validated.