jpfairbanks / SemanticModels.jl

A Julia package for representing and manipulating model semantics
MIT License

SemanticModels as inverse of ModelingToolkit #105

Closed jpfairbanks closed 5 years ago

jpfairbanks commented 5 years ago

ModelingToolkit.jl is a framework for building DSLs for expressing mathematical models of scientific phenomena. You could think of it as a meta-DSL: a language for describing languages that describe models. Their workflow is:

  1. Library author (LA) decides to write a library for solving models of a specific class (C)
  2. LA develops a DSL for representing models in C using ModelingToolkit
  3. ModelingToolkit provides the machinery for processing those model representations into Julia code.
  4. LA develops solvers for models in C
  5. Scientist (S) uses LA's macros to write new models and pass them to the solvers
  6. ???
  7. Publish

This is a great idea and I hope it succeeds because it will revolutionize how people develop scientific software and really benefit many communities.

One of the assumptions of the ASKE program is that we can't make scientists use a modeling language, because the really interesting models push the boundaries of the solvers and the libraries. If you have to change the modeling language every time you add a novel model, what is the modeling language getting you?

Every solver library introduces a miniature DSL for using that library. You have to set up the problem in some way, pass the parameters and options to the solver, and then interpret the solution. These mini-DSLs form through idiomatic usage instead of through an explicit representation like the one ModelingToolkit provides.

SemanticModels can address this as the inverse problem of ModelingToolkit: given a corpus of usage for a given library, what is the implicit DSL that users have developed?

Our workflow could be:

  1. identify a widely used library
  2. gather code samples that use that library
  3. process the corpus to build a representation of how that library "should" be used
  4. build a ModelingToolkit DSL for that class of problems
  5. new researchers and AI scientists can use the new DSL for representing the novel models
  6. generate new models in the DSL using transformations that are valid in the DSL.

In this line of inquiry the DSL plays the role of the "structured semantic representation" of the model.
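
Steps 2 and 3 of this workflow can be sketched in a few lines of plain Julia. This is only an illustration under stated assumptions, not how SemanticModels does it: the three "scripts" are made up, the library they call (`Problem`, `solve`, `Euler`, `RK4`) is imaginary, and the helper names `callnames!` and `callnames` are hypothetical. The idea is to parse each script into an AST, record which functions it calls, and treat the calls that every script makes as a first guess at the core verbs of the implicit DSL.

```julia
# Hypothetical mini-corpus: three scripts that use an imaginary solver library.
corpus = [
    "prob = Problem(f, u0, tspan); sol = solve(prob, Euler()); plot(sol)",
    "p = Problem(g, x0, (0.0, 1.0)); s = solve(p, RK4(), dt=0.1)",
    "prob = Problem(h, y0, span); sol = solve(prob, Euler()); println(sol)",
]

# Recursively collect the name of every function call in an expression.
function callnames!(names::Vector{Symbol}, ex)
    if ex isa Expr
        if ex.head === :call && ex.args[1] isa Symbol
            push!(names, ex.args[1])
        end
        for arg in ex.args
            callnames!(names, arg)
        end
    end
    return names
end

# Wrap the source in begin/end so a multi-statement script parses as one block.
callnames(src::AbstractString) =
    callnames!(Symbol[], Meta.parse("begin\n$src\nend"))

per_script = [callnames(src) for src in corpus]

# Count how many scripts use each function at least once.
usage = Dict{Symbol,Int}()
for names in per_script, name in unique(names)
    usage[name] = get(usage, name, 0) + 1
end

# Calls that appear in every script: candidate core verbs of the implicit DSL.
core = sort([name for (name, n) in usage if n == length(corpus)])
println(core)
```

On this toy corpus the calls shared by all scripts are `Problem` and `solve`, while the algorithm constructors and the output calls vary from script to script. A real corpus would also need to look at argument positions, keyword names, and the order of calls before anything like a DSL grammar could be proposed.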

jpfairbanks commented 5 years ago

This brings to mind a Punnett square of modeling frameworks

|          | Science         | Data Science          |
|----------|-----------------|-----------------------|
| Explicit | ModelingToolkit | Strata                |
| Implicit | SemanticModels  | Data Science Ontology |

The explicit tools take structured representations as input and generate code to execute them; the implicit tools take code as input and find a structured representation within it.
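
A toy round trip may make the two directions concrete. Everything here is a hypothetical illustration in plain Julia: the `Linear` struct and the `generate` and `extract` functions are invented for this sketch and belong to none of the four tools in the table.

```julia
# A structured representation: the linear model y = slope * x + intercept.
struct Linear
    slope::Float64
    intercept::Float64
end

# Explicit direction: structured representation in, code out.
generate(m::Linear) = :($(m.slope) * x + $(m.intercept))

# Implicit direction: code in, structured representation out.
# Only recognizes expressions of the exact shape `a * x + b` with literal a and b.
function extract(ex)
    if ex isa Expr && ex.head === :call && length(ex.args) == 3 && ex.args[1] === :+
        term, intercept = ex.args[2], ex.args[3]
        if term isa Expr && term.head === :call && length(term.args) == 3 &&
           term.args[1] === :* && term.args[2] isa Real && intercept isa Real
            return Linear(term.args[2], intercept)
        end
    end
    return nothing  # the code does not match this model class
end

code = generate(Linear(2.0, 1.0))
recovered = extract(code)
found = extract(Meta.parse("3 * x + 0.5"))
missing_structure = extract(Meta.parse("solve(prob)"))
```

The explicit direction is easy because the author controls the representation. The implicit direction has to decide which code shapes count as instances of the model class, which is why `extract` returns `nothing` for anything it does not recognize.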

jpfairbanks commented 5 years ago

@mehalter and I had a good conversation about this issue today. We talked about how this should be the focus of the package we publish next week. All the automating scientific reasoning stuff is really broad and abstract, but isn't something that we can ship as a polished library/framework for users to actually...use.

On the other hand, model augmentation at the semantic level is something we can address with a self-contained package. The current workflow is:

  1. Introduce a new class of models to analyze by writing a struct to represent, and a parser to ingest, that class of models
  2. Define a set of transformations that are valid on that class of models
  3. Use SemanticModels functions to implement the parser and transforms.
  4. Write programs that take models (as ASTs) and return novel models (<:SemanticModels.ModelTool.AbstractProblem)
  5. Use SemanticModels.ModelTool primitives to do metamodeling tasks like analyze compositions of transformations and compare new models with old models.

Under this workflow SemanticModels is definitely more of a framework than a library, but it is extensible and can be used to take real-world modeling code and build a modeling framework around it, rather than building a modeling framework and then porting the models into the framework.
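
The five steps can be sketched with Base Julia alone. This is a minimal sketch under loud assumptions: the `AbstractProblem` type defined below is a local stand-in for `SemanticModels.ModelTool.AbstractProblem`, and `RateModel`, `parsemodel`, `rewrite`, `substitute`, `addterm`, and `variables` are hypothetical names, not SemanticModels functions. The model class is a single rate expression, and the example model `beta * S * I` is made up for illustration.

```julia
# Local stand-in for SemanticModels.ModelTool.AbstractProblem (assumption).
abstract type AbstractProblem end

# Step 1: a struct for the model class and a parser that ingests source text.
struct RateModel <: AbstractProblem
    expr::Expr
end
parsemodel(src::AbstractString) = RateModel(Meta.parse(src))

# Helper: rebuild an AST with every occurrence of `old` replaced by `new`.
rewrite(ex, old::Symbol, new) = ex === old ? new : ex
rewrite(ex::Expr, old::Symbol, new) =
    Expr(ex.head, (rewrite(a, old, new) for a in ex.args)...)

# Step 2: transformations that are valid on this class. Each returns a function
# from model to model, so that transformations can be composed.
substitute(old::Symbol, new) = m -> RateModel(rewrite(m.expr, old, new))
addterm(term) = m -> RateModel(:($(m.expr) + $term))

# Step 5 helper: the variables a model mentions (function names are skipped).
function variables!(vars, ex)
    if ex isa Symbol
        push!(vars, ex)
    elseif ex isa Expr
        start = ex.head === :call ? 2 : 1
        for a in ex.args[start:end]
            variables!(vars, a)
        end
    end
    return vars
end
variables(m::RateModel) = variables!(Set{Symbol}(), m.expr)

# Step 4: a program that takes a model (as an AST) and returns a novel model.
base = parsemodel("beta * S * I")
transform = addterm(:(-gamma * I)) ∘ substitute(:S, :(S / N))
novel = transform(base)

# Step 5: compare the new model with the old one.
added = sort(collect(setdiff(variables(novel), variables(base))))
println(novel.expr)
println(added)
```

Representing each transformation as a function from model to model means that composition is ordinary function composition, so analyzing a chain of transformations reduces to inspecting the models it produces. Here the comparison reports that the novel model introduces the variables `N` and `gamma`.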