dockstore / dockstore

An app store for scientific workflows, tools, notebooks, and services
https://dockstore.org/
Apache License 2.0

Add Dockstore category via LLM #5938

Open david4096 opened 1 month ago

david4096 commented 1 month ago

Is your feature request related to a problem? Please describe.

Dockstore currently has the Categories listed below. Workflows are assigned to them by manual curation, which is slow:

COVID-19
Genomics Toolsets
Alignment
Assembly
Association Tests
Pangenomics
Variant Calling
Long Read Sequencing
Phylogenetics
Single Cell Analysis
File Conversion
Population Genetics
Methylation
Annotation
Viral Genomics
Structural Biology
Microbial Genomics
ChIPSeq
RNASeq

Describe the solution you'd like

Use an LLM (during topic generation) to suggest which category a workflow could be associated with.

https://github.com/dockstore/dockstore-support/tree/develop/topicgenerator
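To make the request concrete, here is a minimal sketch of how a category suggestion might work. It is not the topicgenerator's actual code: the function names (`build_prompt`, `parse_category`, `suggest_category`) and the `ask_llm` callable are hypothetical, and the prompt wording is only an assumption. The idea is to hand the model the fixed category list and accept a reply only if it matches one of those categories.

```python
from typing import Callable, Optional

# Category names copied from the list in this issue.
CATEGORIES = [
    "COVID-19", "Genomics Toolsets", "Alignment", "Assembly",
    "Association Tests", "Pangenomics", "Variant Calling",
    "Long Read Sequencing", "Phylogenetics", "Single Cell Analysis",
    "File Conversion", "Population Genetics", "Methylation", "Annotation",
    "Viral Genomics", "Structural Biology", "Microbial Genomics",
    "ChIPSeq", "RNASeq",
]


def build_prompt(description: str) -> str:
    """Build a prompt asking for exactly one category (hypothetical wording)."""
    options = "\n".join(f"- {name}" for name in CATEGORIES)
    return (
        "Pick the single best category for this workflow.\n"
        "Answer with the category name only, or 'None' if nothing fits.\n\n"
        f"Categories:\n{options}\n\n"
        f"Workflow description:\n{description}\n"
    )


def parse_category(reply: str) -> Optional[str]:
    """Return the matching category, or None if the reply is not in the list."""
    normalized = reply.strip().strip(".\"'").casefold()
    for category in CATEGORIES:
        if category.casefold() == normalized:
            return category
    return None


def suggest_category(
    description: str, ask_llm: Callable[[str], str]
) -> Optional[str]:
    """ask_llm is a placeholder for whatever model client is actually used."""
    return parse_category(ask_llm(build_prompt(description)))
```

Validating the reply against the known list means a hallucinated or misspelled category is dropped rather than stored, so a curator could still review the suggestions before they are applied.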

Describe alternatives you've considered

Hand curation :)

Additional context

Came up during BOSC CoFest 2024

┆Issue is synchronized with this Jira Story ┆Fix Versions: Dockstore 2.X ┆Issue Number: DOCK-2550 ┆Sprint: Backlog ┆Issue Type: Story

unito-bot commented 1 month ago

➤ Steve Von Worley commented:

Could also get domain experts to review the current list for gaps and suggest additional Categories, which we could then create (for the LLM to subsequently fill).

unito-bot commented 1 month ago

➤ Steve Von Worley commented:

https://ucsc-gi.slack.com/archives/C16ET3CF4/p1721328429619909