ISA-tools / stato

This is the development repository for the STATistics Ontology (STATO). For more information and demonstration on the ontology content, please visit its website:
http://stato-ontology.org/

annotating a data science workflow with STATO #82

Closed: realmarcin closed this issue 8 months ago

realmarcin commented 3 years ago

Hello, thanks for creating a great resource!

I am wondering whether STATO could be used to annotate data science workflows, including a set of data objects and their transformations. To take a simple case, starting from gene expression samples: 1) Merge sample vectors into a matrix. 2) Perform some normalizations, like column standardization. 3) Compute pairwise sample correlations. 4) Compute significance (FDR) and apply a threshold. 5) Obtain a set of gene pairs associated via gene co-expression.
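For readers following along, the five steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not code from the thread: the function names are hypothetical, the correlations are taken between genes (rows) across samples (columns) so that the output is a set of gene pairs, the p-values use the Fisher z normal approximation, and the FDR step is assumed to be Benjamini-Hochberg.

```python
import math
from itertools import combinations
from statistics import NormalDist, mean, pstdev


def merge_samples(samples):
    """Step 1: stack per-sample vectors into a genes x samples matrix."""
    names = list(samples)
    n_genes = len(samples[names[0]])
    return [[samples[s][g] for s in names] for g in range(n_genes)]


def standardize_columns(matrix):
    """Step 2: z-score every column (sample) to mean 0 and unit variance."""
    scaled = []
    for col in zip(*matrix):
        mu, sd = mean(col), pstdev(col)
        scaled.append([(x - mu) / sd if sd > 0 else 0.0 for x in col])
    return [list(row) for row in zip(*scaled)]


def pearson(x, y):
    """Step 3: Pearson correlation of two equal-length vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vector: treat as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def correlation_p_value(r, n):
    """Two-sided p-value from the Fisher z approximation (assumes n > 3)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(p_values):
    """Step 4: Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def coexpressed_pairs(samples, gene_ids, fdr=0.05):
    """Step 5: gene pairs whose adjusted p-value passes the FDR threshold."""
    matrix = standardize_columns(merge_samples(samples))
    n_samples = len(matrix[0])
    pairs = list(combinations(range(len(matrix)), 2))
    rs = [pearson(matrix[i], matrix[j]) for i, j in pairs]
    qs = benjamini_hochberg([correlation_p_value(r, n_samples) for r in rs])
    return [(gene_ids[i], gene_ids[j], r, q)
            for (i, j), r, q in zip(pairs, rs, qs) if q <= fdr]
```

Each function corresponds to one data object or transformation in the list, which is the granularity at which an annotation would attach: the merged matrix, the standardized matrix, the correlation values, the adjusted p-values, and the final set of gene pairs.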

I'm having a little trouble finding some of the relevant terms but also want to ask this question more broadly -- is workflow annotation a use case for STATO?

best, marcin

proccaserra commented 3 years ago

@realmarcin thanks for the kind words, much appreciated. STATO can definitely be used to do that. In fact, we have used it with collaborators to annotate Galaxy workflows. We mainly used STATO to identify the statistical tests being used, to have users report the inputs (e.g. alpha value), and then to annotate the resulting data matrices generated by these analytical workflows. STATO can also be used to associate a particular data output / workflow with a suitable graphical rendering (e.g. Manhattan plot for GWAS data). If terms are needed, we can set up a ROBOT template for that.
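As a rough illustration of the kind of annotation described above, a workflow step could carry a record like the following. This is a hypothetical structure, not the format used for the Galaxy workflows: the class, field names, and step names are made up for the example, and the term IRIs are deliberately left empty because they would have to be looked up in STATO (or requested as new terms).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StepAnnotation:
    """Hypothetical annotation record for one workflow step."""
    step: str                             # name of the step in the workflow
    term_label: str                       # free-text label to match to a STATO class
    term_iri: Optional[str] = None        # placeholder: fill in from STATO itself
    parameters: dict = field(default_factory=dict)  # user-reported inputs
    output: Optional[str] = None          # description of the resulting data
    suggested_plot: Optional[str] = None  # suitable graphical rendering, if any


annotations = [
    StepAnnotation(
        step="compute_significance",
        term_label="multiple testing correction",
        parameters={"alpha": 0.05},
        output="matrix of adjusted p-values",
    ),
    StepAnnotation(
        step="gwas_association_test",
        term_label="association test",
        output="per-variant p-values",
        suggested_plot="Manhattan plot",
    ),
]


def unmapped(records):
    """Return the steps that still lack a term IRI, e.g. candidates for new terms."""
    return [r.step for r in records if r.term_iri is None]
```

Calling `unmapped(annotations)` lists the steps with no term assigned yet, which is the sort of list one might bring to a term request.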

Happy to follow up on the discussion and elaborate on the use case.

all the best.