User interaction - Githubissues

jpfairbanks commented 5 years ago

How the heck are people going to use the grafting functionality?

Ingest a set of models: models = ingest("src1.jl", "src2.jl", ...)
Refine types so that every concept gets its own type: models = refine(models, tcallback)
Generate an AST-based graph showing expressions and variables: astgraph = MetaGraph(parse(models))
The user identifies a subgraph of nodes to start with in the working graph: wm = astgraph[vs,vs]
The user applies transformations to nodes from the AST-based graph and the working graph: wm = apply(wm, transforms...)
func = SemanticModels.generate(wm)
func(args...)
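To make the shape of that workflow concrete, here is a minimal sketch that works on Julia expressions directly. It skips the graph and subgraph-selection steps, and the functions ingest, apply, and generate below are hypothetical stand-ins written for this example, not the SemanticModels API.

```julia
# Hypothetical stand-ins for ingest / apply / generate; not the SemanticModels API.

# Ingest: parse model source text into an expression (a string here, not a file).
ingest(src::AbstractString) = Meta.parse(src)

# Apply: run each transform on a node, then recurse into its children.
function apply(node, transforms...)
    for t in transforms
        node = t(node)
    end
    node isa Expr || return node
    return Expr(node.head, [apply(a, transforms...) for a in node.args]...)
end

# Generate: evaluate the definition in a fresh module and return the function.
generate(ex::Expr) = Core.eval(Module(), ex)

model = ingest("growth(x) = 2 * x + 1")
retune(node) = node === 2 ? 3 : node   # example transform: change the rate 2 to 3
wm = apply(model, retune)
func = generate(wm)
println(Base.invokelatest(func, 5))    # 3 * 5 + 1 = 16
```

Evaluating into a fresh module keeps the generated function from clobbering names in the caller's namespace; whether the real generate step should behave that way is an open design question.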

crherlihy commented 5 years ago

@jpfairbanks and I chatted about this set of tasks today. As a starting point, I'm going to work on a method that ingests a set of Julia programs, parses them to build one type graph per program, and, for each type graph, returns a set of high-degree vertices with their incoming and outgoing edges. The intuition is that high-degree vertices in the type graph are the most probable candidates for semantic disambiguation via a refinement of the type system. Ideally, the next step will be to develop stubs and let the user provide missing items (e.g., struct names, kwargs, etc.).

We also discussed graph homomorphism. To keep things tractable, our objective will be to develop a method that ingests a program and a set of valid transformations; given a sequence of these transformations, ideally we'll be able to certify that the resulting graph is homomorphic. We also need to clarify whether we mean homomorphism at the type level (if not, stop there) or at the value level (required at the program transformation level, but not at the category (e.g., type graph) level).
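A rough sketch of the high-degree-vertex step, assuming a type graph is already available as a list of (source type, function, destination type) edges. The Edge alias, the high_degree_vertices name, and the sample edges are all made up for illustration; they are not output from the type graph code discussed here.

```julia
# Illustrative only: an edge is (source type, function name, destination type).
const Edge = Tuple{Symbol,Symbol,Symbol}

# Return the k vertices with the largest in-degree + out-degree,
# each with its incoming and outgoing edges.
function high_degree_vertices(edges::Vector{Edge}, k::Integer)
    incoming = Dict{Symbol,Vector{Edge}}()
    outgoing = Dict{Symbol,Vector{Edge}}()
    for e in edges
        src, _, dst = e
        push!(get!(outgoing, src, Edge[]), e)
        push!(get!(incoming, dst, Edge[]), e)
    end
    ins(v) = get(incoming, v, Edge[])
    outs(v) = get(outgoing, v, Edge[])
    vertices = collect(union(keys(incoming), keys(outgoing)))
    # Sort by descending degree; break ties by name so the result is stable.
    sort!(vertices; by = v -> (-(length(ins(v)) + length(outs(v))), string(v)))
    return [(vertex = v, incoming = ins(v), outgoing = outs(v)) for v in first(vertices, k)]
end

edges = Edge[
    (:Float64, :sqrt, :Float64),
    (:Float64, :round, :Int),
    (:Int, :float, :Float64),
    (:Float64, :string, :String),
    (:Int, :string, :String),
]
top = high_degree_vertices(edges, 2)
println([t.vertex for t in top])
```

In this toy graph Float64 comes out on top, which matches the intuition above: a generic numeric type that many functions touch is a likely candidate for refinement into more specific types.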

infvie commented 5 years ago

I'm going to work on developing a method that ingests a set of julia programs, parses these programs to build a type graph per program, and (for each type graph), returns a set of high-degree vertices with their associated incoming and outgoing edges

@jpfairbanks and @crherlihy, is Christine taking over what I am doing?

crherlihy commented 5 years ago

@infvie nope; if there's significant overlap, we can chat and I'll revise the scope here. My intention was to pick up where your function (e.g., type graph construction) leaves off. I just meant my function will call yours. Sorry for the confusion.

jpfairbanks commented 5 years ago

System architecture and which pieces we have in mind for which use cases would be really helpful. Specifically:

what types of data are we ingesting?

The input data is a model defined in a script that contains one top-level module and a main function. The default name for the main function is main, but you can pass in a different name.

Example:

module Foo
using Roots
a = 3
b = 4
c = -1
f(x,y) = c*x*y + a*x + b

function main()
    x0 = [0.0,0.0]
    xstar = RootProblem(f, x0)
    @show xstar
end
end #module
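As a sketch of what checking that input shape could look like, the snippet below parses a script and verifies that it is a single module defining the main function. The ingest_model helper and its mainname keyword are hypothetical names for this example. The script is only parsed, never evaluated.

```julia
# Hypothetical helper: parse a script and check that it is one top-level
# module containing a function with the expected name.
function ingest_model(src::AbstractString; mainname::Symbol = :main)
    ex = Meta.parse(src)  # throws if src holds more than one top-level expression
    (ex isa Expr && ex.head == :module) || error("expected one top-level module")
    # The module name and body are the last two arguments of a :module Expr.
    modname, body = ex.args[end-1], ex.args[end]
    hasmain = any(body.args) do stmt
        stmt isa Expr && stmt.head == :function &&
            stmt.args[1] isa Expr && stmt.args[1].head == :call &&
            stmt.args[1].args[1] == mainname
    end
    hasmain || error("module $modname does not define $mainname")
    return (name = modname, body = body)
end

src = """
module Foo
a = 3
f(x) = a * x

function main()
    @show f(2)
end
end
"""
model = ingest_model(src)
println(model.name)
```

This only recognizes the long-form function main() ... end definition; a short-form main() = ... would need an extra case.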

what type of graph do we create from each type of data ingested?

All the data has the same form: a script that contains a module and a main function. The graphs we can generate are:

Vertices are types; edges are functions between types.
Vertices are functions and variables; edges represent dataflow: a function references a variable, or a function calls a function.
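For the second kind of graph, a minimal sketch of pulling "calls" and "references" edges out of a single definition might look like this. It assumes a short-form definition with untyped positional arguments; dataflow_edges is a hypothetical name, and the real extraction (which this thread says follows execution traces) may work quite differently.

```julia
# Hypothetical sketch: collect (function, relation, target) edges from one
# definition of the form f(args...) = body, with untyped positional arguments.
function dataflow_edges(def::Expr)
    sig, body = def.args
    fname = sig.args[1]
    params = Set{Any}(sig.args[2:end])   # arguments are local, not references
    edges = Set{Tuple{Symbol,Symbol,Symbol}}()
    function walk(node)
        if node isa Symbol
            node in params || push!(edges, (fname, :references, node))
        elseif node isa Expr
            rest = node.args
            if node.head == :call && node.args[1] isa Symbol
                push!(edges, (fname, :calls, node.args[1]))
                rest = node.args[2:end]   # the callee is not a variable reference
            end
            foreach(walk, rest)
        end
    end
    walk(body)
    return edges
end

def = Meta.parse("f(x,y) = c*x*y + a*x + b")
flow = dataflow_edges(def)
foreach(println, sort(collect(flow)))
```

For the f from the example module this yields calls to + and * and references to the module-level variables a, b, and c.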

From any piece of text, including a corpus of files, we can also make a knowledge graph:

Conceptual knowledge graph from text: vertices are concepts; edges are relations between concepts.

to what extent do we expect these graphs to be linkable?

Between different scripts we should be able to link the graphs by defining an alias relation that says "these vertices are equivalent" and then merging the graphs.
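One way to picture the alias-and-merge step, assuming each graph is just a set of (source, destination) vertex pairs and the alias relation is a one-hop dictionary from a vertex to its canonical name. The merge_graphs function and the vertex names are invented for this sketch.

```julia
# Hypothetical sketch: merge edge sets after renaming aliased vertices.
# alias maps a vertex to the canonical vertex it is equivalent to (one hop only).
function merge_graphs(alias::Dict{Symbol,Symbol}, graphs...)
    canon(v) = get(alias, v, v)
    merged = Set{Tuple{Symbol,Symbol}}()
    for g in graphs, (src, dst) in g
        push!(merged, (canon(src), canon(dst)))
    end
    return merged
end

g1 = Set([(:beta, :infect), (:infect, :I)])
g2 = Set([(:rate, :transmit), (:transmit, :Infected)])
alias = Dict(:rate => :beta, :Infected => :I)
merged = merge_graphs(alias, g1, g2)
foreach(println, sort(collect(merged)))
```

After merging, :beta and :I are shared by both scripts' subgraphs, so paths can cross from one script's functions to the other's.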

Within a script we should be able to merge the two types of graph by converting the type graph into its pseudodual. The pseudodual is constructed by taking a type-function graph and building a new graph in which functions and types are both vertices: if U = typeof(f(::V)), then there is a pair of edges V -> f -> U. There are also edges for the functions getindex(u::U, v::V), i.e., (U, V) -> getindex -> typeof(u[v]) for all the values of v. These represent "untupling" and accessing fields of structs.
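A small sketch of the V -> f -> U part of that construction, assuming the type-function graph is given as (V, f, U) triples. The getindex edges for untupling and field access are left out, and the Vertex alias and pseudodual function are names chosen for this example only.

```julia
# A vertex is tagged so that a type and a function with the same name stay distinct.
const Vertex = Tuple{Symbol,Symbol}   # (:type, name) or (:func, name)

# Hypothetical sketch: turn each type-function edge V --f--> U into V -> f -> U.
function pseudodual(tfedges)
    edges = Set{Tuple{Vertex,Vertex}}()
    for (V, f, U) in tfedges
        push!(edges, ((:type, V), (:func, f)))
        push!(edges, ((:func, f), (:type, U)))
    end
    return edges
end

tf = [(:Float64, :round, :Int), (:Int, :string, :String)]
dual = pseudodual(tf)
foreach(println, sort(collect(dual)))
```

Keying function vertices by name alone means two methods of the same function share one vertex, which can introduce paths such as V1 -> f -> U2 that no single method provides. Keying by (f, V) instead would avoid that at the cost of more vertices.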

which graphs do we need for each use case?

Model augmentation: We need dataflow, types, and concepts for implementing the "frontend" of ModelTool. This is an informative step to show a person extending SemanticModels how to implement ingestion for a new class of models. Once we have the new class of models implemented, we only need Exprs and do not necessarily need the KG to do model augmentation.
Metamodel construction: We need the type graph for program refinement and the dataflow and concept graphs to do the metamodeling reasoning. This part will probably leverage all the graphs at run time when solving for the combined model.
Model validation: We need the structured representation of the model that is used in model augmentation and the trace of execution that follows the same lines as the traces used to build the dataflow and type-function graph. I don't think this needs the KG directly unless we find that we can build better models with the KG than with the trace. I think the trace is more useful because it is hierarchical and DNNs work better on trees than on general graphs.

jpfairbanks commented 5 years ago

So this turned into the general model augmentation API and the typegraph API, which are much easier to think about than grafting. The API is documented with the examples, and #85 covers the interactions between the knowledge graphs.

jpfairbanks / SemanticModels.jl

User interaction #87