Evaluation Criteria - Githubissues

jpfairbanks commented 5 years ago

I'm just thinking ahead to the part where we write a paper on this thing: how do we evaluate that it works?

We are kind of defining the problem that we want to solve from scratch, so if we want to make a quantitative comparison we will need to find a published task and show that we are better than a baseline that is already published.

We could also think of writing more of a theory paper, where we define the problem formally and prove some theorems about algorithms that solve the problems. Based on the 3 main use cases we have some potential theorems.

If graph G(P) is constructed according to algorithm A then algorithm F solves the Metamodel construction problem P.
Any valid modification of model M can be generated by running the model in context C.
For a model M implemented in a function f, with a known region of good parameters R ⊆ D, A is an algorithm for testing whether x ∈ R.

If we fill in the details of those terms, and prove the three theorems, I think we have a really strong contribution to the CS literature. It doesn't have a quantitative evaluation comparing two implementations or anything, but it defines a problem and provides a solution.

infvie commented 5 years ago

Another way to evaluate our model might be to consider the size of the set we can change, or the domain of the category theory functions we define. Thoughts?

jpfairbanks commented 5 years ago

So far for Phase 1 we have been focused on qualitative methods of evaluation like examples and case studies.

Some paths to showing our methods work:

Examples of previously known modeling techniques that can be built in ModelTools.Transformations.
- Polynomial regression is a good example: it is a known family of models that we are able to represent as an algebraic structure (a monoid) acting on a known model.
- These examples show how our method is able to replicate known results by computational reasoning over models.
- Other candidates include model selection, kernel methods, ODE to ABM transformations, and 1D to 2D Ising models.
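
To make the polynomial regression example concrete, here is a minimal Python sketch of a monoid acting on a model. It is not the ModelTools.Transformations API: `PolyModel`, `AddTerms`, `compose`, and `act` are hypothetical names, and the choice to represent a model by its set of monomial degrees is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolyModel:
    """Hypothetical model y = sum(beta_k * x**k), described by its term degrees."""
    degrees: frozenset


@dataclass(frozen=True)
class AddTerms:
    """Hypothetical transformation: add monomial terms of the given degrees."""
    degrees: frozenset = frozenset()

    def compose(self, other):
        # Monoid operation: doing both transformations adds both sets of terms.
        return AddTerms(self.degrees | other.degrees)

    def act(self, model):
        # Monoid action: apply the transformation to a model.
        return PolyModel(model.degrees | self.degrees)


IDENTITY = AddTerms()  # adds no terms, so it leaves every model unchanged

linear = PolyModel(frozenset({0, 1}))   # y = b0 + b1*x
add_square = AddTerms(frozenset({2}))
add_cube = AddTerms(frozenset({3}))

# Acting with the composite transformation on the linear model gives a cubic.
cubic = add_square.compose(add_cube).act(linear)
print(sorted(cubic.degrees))
```

Under these assumptions, the property one would want to prove is the action law: acting with a composite transformation is the same as acting with each transformation in turn.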
Proving theorems about contexts or transformations
- We could prove that the algebraic operations on transformations change the models in the way we want them to.
- Applying known algorithms from algebra and geometry to answer questions about classes of models.
- For example, we have GCD in polynomial rings. Can we use that to do model selection? Given two models and a transformation polynomial ring that acts on that class, can GCD algorithms find the highest order model that reduces to both input models?
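
As a starting point for the GCD question, the sketch below runs the Euclidean algorithm on univariate polynomials with rational coefficients. How a model maps to a polynomial is left open here, and whether the common divisor is the "highest order model that reduces to both" is exactly the open question above. The function names and the coefficient-list representation (lowest degree first) are assumptions for illustration.

```python
from fractions import Fraction


def trim(p):
    """Drop zero coefficients from the high-degree end of a coefficient list."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_mod(a, b):
    """Remainder of a divided by b (b must be nonzero); lowest degree first."""
    a = [Fraction(c) for c in trim(a)]
    b = [Fraction(c) for c in trim(b)]
    while len(a) >= len(b):
        factor = a[-1] / b[-1]       # cancel the leading term of a
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a = trim(a)
    return a


def poly_gcd(a, b):
    """Monic greatest common divisor of two polynomials over the rationals."""
    a = [Fraction(c) for c in trim(a)]
    b = [Fraction(c) for c in trim(b)]
    while b:
        a, b = b, poly_mod(a, b)
    return [c / a[-1] for c in a] if a else a


# (x - 1)(x - 2) = x^2 - 3x + 2  and  (x - 1)(x + 3) = x^2 + 2x - 3
p = [2, -3, 1]
q = [-3, 2, 1]
common = poly_gcd(p, q)   # the shared factor x - 1, as [-1, 1]
print(common)
```

Exact `Fraction` arithmetic is used so that remainders cancel to zero exactly. With floating-point coefficients the loop would need a tolerance.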
Empirical results
- We can analyze several aspects of this project quantitatively, including the information extraction and knowledge graph construction, and the run-time and compile-time performance of each component.
- We can show that the run-time cost of our transformed models is similar to a good handwritten version. Cassette runtimes would be interesting to benchmark. Comparing everything to a non-Julia system would also be feasible.
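
The runtime comparison could take roughly the following shape. This is a Python sketch rather than a Julia or Cassette benchmark, and every name in it is hypothetical: `handwritten` stands in for a hand-coded model, and `make_transformed` stands in for a model produced by a transformation.

```python
import timeit


def handwritten(x):
    """Hand-coded quadratic model: 1 + 2x + 3x^2."""
    return 1.0 + 2.0 * x + 3.0 * x * x


def make_transformed(coeffs):
    """Stand-in for a generated model: evaluates the coefficients by Horner's rule."""
    def model(x):
        acc = 0.0
        for c in reversed(coeffs):
            acc = acc * x + c
        return acc
    return model


transformed = make_transformed([1.0, 2.0, 3.0])


def best_time(fn, repeat=5, number=10000):
    """Best of several timing runs, which reduces noise from other processes."""
    return min(timeit.repeat(lambda: fn(0.5), repeat=repeat, number=number))


# First check that both versions compute the same thing, then compare cost.
same_answer = handwritten(0.5) == transformed(0.5)
ratio = best_time(transformed) / best_time(handwritten)
print(f"same answer: {same_answer}, transformed / handwritten runtime ratio: {ratio:.2f}")
```

A ratio close to 1 would support the claim that the transformed model costs about the same as the handwritten one. The actual number depends on the machine and is not a result from this project.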

jpfairbanks / SemanticModels.jl

Evaluation Criteria #32