We do not have large-scale datasets that we can use to evaluate GoLLM tasks.
We do not have a method in place for sourcing or creating datasets for new GoLLM tasks.
Approach
Create distributions that we can sample from to build synthetic datasets. For example, we can likely generate an arbitrary AMR, stratify it, and then create synthetic interaction matrices whose cells map to the newly created AMR.
Real data would be preferable where it is available, but collecting it means overcoming significant hurdles in annotation cost, licensing, and time.
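The sampling idea above can be sketched as follows. This is a minimal illustration under loud assumptions, not a proposed implementation: the plain dictionaries are a simplified stand-in for an AMR and do not follow the real AMR schema, and every name here (`sample_base_model`, `stratify`, `interaction_matrix`, `sample_example`, the `rate_...` parameter naming, the `young`/`old` strata) is hypothetical. It only shows the shape of the pipeline: sample a base model, stratify it, then sample matrices whose cells point back at parameters of the stratified model.

```python
import itertools
import random


def sample_base_model(rng):
    """Sample a toy base model: 2-4 states joined by a chain of transitions."""
    states = [f"S{i}" for i in range(rng.randint(2, 4))]
    transitions = [
        {"id": f"t{i}", "source": src, "target": tgt}
        for i, (src, tgt) in enumerate(zip(states, states[1:]))
    ]
    return {"states": states, "transitions": transitions}


def stratify(model, strata):
    """Copy each state per stratum and each transition per ordered stratum pair."""
    states = [f"{s}_{g}" for s in model["states"] for g in strata]
    transitions = [
        {
            "id": f"{t['id']}_{a}_{b}",
            "source": f"{t['source']}_{a}",
            "target": f"{t['target']}_{b}",
            # Hypothetical naming scheme linking transitions to matrix cells.
            "parameter": f"rate_{t['id']}_{a}_{b}",
        }
        for t in model["transitions"]
        for a, b in itertools.product(strata, repeat=2)
    ]
    return {"states": states, "transitions": transitions}


def interaction_matrix(base_transition, strata, rng):
    """Sample one strata-by-strata matrix; each cell names a stratified parameter."""
    return {
        (a, b): {
            "parameter": f"rate_{base_transition['id']}_{a}_{b}",
            "value": round(rng.uniform(0.0, 1.0), 3),
        }
        for a, b in itertools.product(strata, repeat=2)
    }


def sample_example(seed, strata=("young", "old")):
    """Draw one synthetic example; the same seed reproduces the same example."""
    rng = random.Random(seed)
    base = sample_base_model(rng)
    matrices = {
        t["id"]: interaction_matrix(t, strata, rng) for t in base["transitions"]
    }
    return {"base": base, "stratified": stratify(base, strata), "matrices": matrices}


# A small synthetic dataset: one example per seed.
dataset = [sample_example(seed) for seed in range(100)]
```

Seeding each example separately keeps the dataset reproducible and lets a single failing case be regenerated in isolation. Because the matrix cells are generated from the same naming scheme as the stratified transitions, the ground-truth mapping from cell to model parameter is known by construction, which is what an evaluation would need.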
Tasks
TBD