Model Validation with Tree LSTM

jpfairbanks commented 5 years ago

We should try tree-based LSTMs on the code traces because the traces are inherently tree structured. One problem with ordinary LSTMs is that they require us to serialize the code down to a sequence, which discards the tree structure our models already have.

jpfairbanks commented 5 years ago

This model from the model-zoo shows how to do it for treebank. https://github.com/FluxML/model-zoo/blob/master/text/treebank/recursive.jl
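To make the idea concrete, here is a minimal sketch of a recursive encoder over a binary tree in plain Julia. It is not the model-zoo code and it is a simple recursive network rather than a gated Tree-LSTM; the `Leaf` and `Node` types, the `encode` function, the hidden size, and the vocabulary size are all assumptions made for illustration.

```julia
using Random

# Hypothetical binary tree: a leaf holds a token id, a node holds two subtrees.
struct Leaf
    token::Int
end

struct Node
    left::Union{Leaf,Node}
    right::Union{Leaf,Node}
end

hidden = 4                                   # assumed hidden size
rng = MersenneTwister(0)
embedding = randn(rng, hidden, 10)           # one column per token id (assumed vocabulary of 10)
W = randn(rng, hidden, 2 * hidden)           # combines the two child vectors
b = zeros(hidden)

# A leaf is encoded by its embedding column.
encode(t::Leaf) = embedding[:, t.token]

# An inner node is encoded from the encodings of its two children,
# so the computation follows the shape of the tree.
encode(t::Node) = tanh.(W * vcat(encode(t.left), encode(t.right)) .+ b)

tree = Node(Leaf(1), Node(Leaf(2), Leaf(3)))
h = encode(tree)                             # vector of length `hidden` for the whole tree
```

Every subtree gets its own vector, so a classifier placed on top of `encode` could score each subtrace as well as the whole trace.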

jpfairbanks commented 5 years ago

@gallupBenRyan I think this is a very good architecture to adapt: https://nlp.stanford.edu/sentiment/treebank.html. It predicts sentiment recursively for each subphrase of the tree. In our case a subtree is the trace of a subfunction, which captures our notion that a function call f(args...) is valid if all of its subcomputations are valid, and invalid if any of its subcomputations is invalid.

Per our phone conversation today, I don't think it will be easy to serialize the program trace trees, so we should generate the traces/trees and the treeRNN model in the same notebook.
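The validity rule above (a call is valid only if all of its subcomputations are valid) can be sketched as a recursive check over a trace tree. The `TraceNode` type, its fields, and `isvalidtrace` are hypothetical names used only for this illustration; the real traces may look quite different.

```julia
# Hypothetical trace node: one function call plus the traces of the calls it made.
struct TraceNode
    call::String                  # label such as "f(x)"
    ok::Bool                      # whether this call itself behaved correctly
    children::Vector{TraceNode}   # traces of the subcomputations
end

# Convenience constructor for calls that made no further calls.
TraceNode(call, ok) = TraceNode(call, ok, TraceNode[])

# A trace is valid if the call is fine and every subtrace is valid.
isvalidtrace(t::TraceNode) = t.ok && all(isvalidtrace, t.children)

good = TraceNode("f(x)", true, [TraceNode("g(x)", true), TraceNode("h(x)", true)])
bad  = TraceNode("f(x)", true, [TraceNode("g(x)", true), TraceNode("h(x)", false)])

isvalidtrace(good)   # every subcomputation is valid
isvalidtrace(bad)    # one invalid subcomputation invalidates the whole call
```

Labels of this kind, one per subtree, are what a treebank-style model would be trained to predict.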

jpfairbanks commented 5 years ago

@gallupBenRyan, for storing the trace, I think we can use Julia expressions as a serialization format.

For example, the following code serializes arrays into ASTs that can be represented as strings such that Base.Meta.parse recovers the same AST. Then you can eval the AST to get the actual object.

julia> express(x::Vector) = begin
       ex = Expr(:vect)
       for val in x
           push!(ex.args, val)
       end; return ex
       end
express (generic function with 1 method)

julia> express([1,2,3])
:([1, 2, 3])

julia> eval(express([1,2,3]))
3-element Array{Int64,1}:
 1
 2
 3

julia> sum(eval(express([1,2,3])))
6

julia> string(express([1,2,3]))
"[1, 2, 3]"

julia> Base.Meta.parse(string(express([1,2,3])))
:([1, 2, 3])

julia> eval(Base.Meta.parse(string(express([1,2,3]))))
3-element Array{Int64,1}:
 1
 2
 3
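Since the goal is to store traces, the same round trip can go through a file. This is a minimal sketch that assumes the single-method `express` from the transcript above; the temporary file is only a stand-in for wherever traces would actually be stored.

```julia
# `express` as defined in the transcript above.
function express(x::Vector)
    ex = Expr(:vect)
    for val in x
        push!(ex.args, val)
    end
    return ex
end

trace = [1, 2, 3]

# Write the expression's string form to a temporary file (stand-in storage location).
path, io = mktemp()
write(io, string(express(trace)))
close(io)

# Read the text back, parse it into an AST, and evaluate the AST.
restored = eval(Meta.parse(read(path, String)))
restored == trace   # the stored vector is recovered
```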

jpfairbanks commented 5 years ago

Here is another example with some more assignments and stuff.

julia> ex = quote 
       a = [1,2,3]
       b = sum(a)
       c = b .- 1
       d = sqrt(c)
       end
quote
    #= REPL[40]:2 =#
    a = [1, 2, 3]
    #= REPL[40]:3 =#
    b = sum(a)
    #= REPL[40]:4 =#
    c = b .- 1
    #= REPL[40]:5 =#
    d = sqrt(c)
end

julia> string(ex)
"begin\n    #= REPL[40]:2 =#\n    a = [1, 2, 3]\n    #= REPL[40]:3 =#\n    b = sum(a)\n    #= REPL[40]:4 =#\n    c = b .- 1\n    #= REPL[40]:5 =#\n    d = sqrt(c)\nend"

julia> Base.Meta.parse(string(ex))
quote
    #= none:3 =#
    a = [1, 2, 3]
    #= none:5 =#
    b = sum(a)
    #= none:7 =#
    c = b .- 1
    #= none:9 =#
    d = sqrt(c)
end

julia> eval(Base.Meta.parse(string(ex)))
2.23606797749979
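The `#= REPL[40]:2 =#` comments in the string are line-number nodes, and they change after a round trip (`REPL[40]:2` becomes `none:3`). If that matters for comparing stored traces, one option is to strip them before serializing. The sketch below uses `Base.remove_linenums!`, which is an unexported Base function, so treat relying on it as an assumption.

```julia
ex = quote
    a = [1, 2, 3]
    b = sum(a)
    c = b .- 1
    d = sqrt(c)
end

# Remove the LineNumberNode entries in place so they do not end up in the string.
Base.remove_linenums!(ex)

s = string(ex)                    # no "#= ... =#" location comments remain
result = eval(Meta.parse(s))      # value of the last assignment, sqrt(5)
```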

jpfairbanks commented 5 years ago

This might actually be the solution, but I don't know why this works.

julia> eval(Base.Meta.parse(string(express([(+, (2,3))=>[(-, 3)=> (*,(2,8))]]))))
1-element Array{Pair{Tuple{typeof(+),Tuple{Int64,Int64}},Array{Pair{Tuple{typeof(-),Int64},Tuple{typeof(*),Tuple{Int64,Int64}}},1}},1}:
 (+, (2, 3)) => [(-, 3)=>(*, (2, 8))]

julia> string(express([(+, (2,3))=>[(-, 3)=> (*,(2,8))]]))
"[(+, (2, 3)) => Pair{Tuple{typeof(-),Int64},Tuple{typeof(*),Tuple{Int64,Int64}}}[(-, 3)=>(*, (2, 8))]]"

julia> ex = express([(+, (2,3))=>[(-, 3)=> (*,(2,8))]])
:([(+, (2, 3)) => Pair{Tuple{typeof(-),Int64},Tuple{typeof(*),Tuple{Int64,Int64}}}[(-, 3)=>(*, (2, 8))]])

julia> dump(ex)
Expr
  head: Symbol vect
  args: Array{Any}((1,))
    1: Pair{Tuple{typeof(+),Tuple{Int64,Int64}},Array{Pair{Tuple{typeof(-),Int64},Tuple{typeof(*),Tuple{Int64,Int64}}},1}}
      first: Tuple{typeof(+),Tuple{Int64,Int64}}
        1: + (function of type typeof(+))
        2: Tuple{Int64,Int64}
          1: Int64 2
          2: Int64 3
      second: Array{Pair{Tuple{typeof(-),Int64},Tuple{typeof(*),Tuple{Int64,Int64}}}}((1,))
        1: Pair{Tuple{typeof(-),Int64},Tuple{typeof(*),Tuple{Int64,Int64}}}
          first: Tuple{typeof(-),Int64}
            1: - (function of type typeof(-))
            2: Int64 3
          second: Tuple{typeof(*),Tuple{Int64,Int64}}
            1: * (function of type typeof(*))
            2: Tuple{Int64,Int64}
              1: Int64 2
              2: Int64 8
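A likely reason this works, as the `dump` output suggests: `express` only wraps the outer vector in an `Expr`, and the element is stored inside it as an ordinary `Pair` value rather than as a nested expression. Converting the `Expr` to a string then prints that value with its normal `show` method, and for these types the printed form happens to be valid Julia syntax. The sketch below checks the first part of that explanation; the second part is an observation about these particular types, not a guarantee for arbitrary objects.

```julia
# Same behaviour as the single-method `express` used above.
express(x::Vector) = Expr(:vect, x...)

ex = express([(+, (2, 3)) => [(-, 3) => (*, (2, 8))]])

ex.head                 # :vect, the only expression node that was created
ex.args[1] isa Pair     # the element is embedded as a value
ex.args[1] isa Expr     # not as an expression

# The string form is produced by printing that value, and it parses back to an Expr.
parsed = Meta.parse(string(ex))
```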

jpfairbanks commented 5 years ago

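function express(x::Vector)
    ex = Expr(:vect)
    for val in x
        push!(ex.args, express(val))
    end
    return ex
end

# Fallback: leave values such as numbers and functions as they are.
function express(x::Any)
    return x
end

function express(x::Pair)
    # A pair `first => second` is a call to the `=>` function.
    ex = Expr(:call, :(=>))
    push!(ex.args, express(x.first))
    push!(ex.args, express(x.second))
    return ex
end

# Still needed: a method that applies only when x is a struct.
# `express(x::T) where T` has the same signature as the `Any` fallback above,
# so defining it would replace that fallback instead of adding a struct case.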

jpfairbanks commented 5 years ago

We need to handle structs like this

julia> struct Foo
       a::Int
       b::Int
       end

julia> dump(express([Foo(1,2)]))
Expr
  head: Symbol vect
  args: Array{Any}((1,))
    1: Foo
      a: Int64 1
      b: Int64 2

But we want to create an expression like this instead:

julia> dump(:(Foo(1,2)))
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol Foo
    2: Int64 1
    3: Int64 2
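One possible way to get from the struct value to the call expression is to build the call from the type name and the field values. This is only a sketch: `express_struct` is a hypothetical helper, and it assumes the type has its default positional constructor, has no type parameters, and is in scope under the same name when the expression is evaluated.

```julia
struct Foo
    a::Int
    b::Int
end

# Hypothetical helper: turn a struct value into a constructor-call expression.
function express_struct(x::T) where T
    # Collect the field values in declaration order.
    args = [getfield(x, name) for name in fieldnames(T)]
    # nameof(T) is the Symbol :Foo, so this builds the same Expr as :(Foo(1, 2)).
    return Expr(:call, nameof(T), args...)
end

ex = express_struct(Foo(1, 2))
ex == :(Foo(1, 2))        # same head and arguments as the target expression
eval(ex) == Foo(1, 2)     # evaluating it reconstructs the value
```

Nested structs would need the field values passed through the same conversion recursively, which this sketch leaves out.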

jpfairbanks / SemanticModels.jl

Model Validation with Tree LSTM #95