Symbol and MLN representation

hpoit commented 7 years ago

Hello @tawheeler, @mykelk, @sbromberger, and @abhijithch. Right now this is what I have to represent

Source http://homes.cs.washington.edu/~pedrod/papers/mlj05.pdf

I would appreciate some help getting started with LightGraphs.jl. Will read thoroughly about it now.

sbromberger commented 7 years ago

Frankly, this doesn't look like a straightforward application of either, LightGraphs or Graphs.jl. For LightGraphs, you'd need to store the node->variable information in a map of some sort, which is not memory efficient, and then you'd have to figure out how to label the cliques and map the potential functions to each. I don't know what the state of first-order functions is in Julia these days, but in the old days this would probably be very inefficient.

I can't speak to Graphs.jl, but my (very limited) experience with it suggests you might have similar issues with respect to the potential functions. It can handle the vertex mapping much better than LG since it doesn't really care what type of object you use to represent vertices.

hpoit commented 7 years ago

@sbromberger, I am trying to make sense between your comment and these sections of the Tuffy manual

hpoit commented 7 years ago

I also found these parts on MLN from Helene's paper useful, but not complete

tawheeler commented 7 years ago

You almost need to write a compiler! I actually wrote a very non-professional implementation of log-linear models + training for a paper. I had the very issues that @sbromberger mentioned. Feel free to check out this repo and/or read the paper.

hpoit commented 7 years ago

@tawheeler that's a kickass paper, will take it in tonight, thanks.

@sbromberger thanks for the input, could you expand on first-order logic in Julia? I couldn't find a package like http://stackoverflow.com/questions/2304726/first-order-logic-engine for Python or the Markov Logic option from Tweety in Java http://tweetyproject.org/lib/index.html Should these be present even with a graphs package? I will look into the mapping of potentials efficiency issue, thank you!

sbromberger commented 7 years ago

not first-order logic, first-order functions. The issue is (was?) that passing functions as arguments to other functions is very inefficient in Julia. This may have changed recently; I haven't been following developments. See https://github.com/timholy/FastAnonymous.jl for more details.

hpoit commented 7 years ago

Thanks @sbromberger. Sorry for the mistake.

hpoit commented 7 years ago

@tawheeler, so you efficiently mapped the potentials with a Gaussian prior over the log-linear parameters θ, using SGA and momentum to speed up learning? Are the types of objects to represent vertices an efficiency issue?

Can you expand on 'you almost need to write a compiler'? I will if that's what it takes to do it correctly.

tawheeler commented 7 years ago

Factor.jl contains the factor functions. I avoided using first-order functions and made types instead. Nevertheless, this led to issues when iterating over vectors of these types, as these vectors were necessarily abstract. See here. I ended up hacking it using explicit if statements to get it to run fast.

In the cast of MLNs you won't have a small explicit set of factors, but will want to support a large, general set. If each vertex stores a first-order function you will end up with type instability. Julia 0.4 doesn't have a way to enforce return types, so there will be memory allocation and error checking in every call to the first-order functions. There may be a workaround as of Julia 0.5.

I mention compiling because you will likely need to parse input files like in the examples you posted above. !hasWord(d, NICE) v review (d, POS) would have to be interpreted and then parsed into a representation the MLN can understand. I suppose compiling isn't the right word - really just lexing and parsing. This process builds the symbol tables they mention.

hpoit commented 7 years ago

Thanks Tim. Could I maybe work the first-order functions with if statements and typeassert to avoid type instability? Thanks for expanding on the compiling issue, it sets the work to be done. Is JuliaParser.jl the way?

Hi @johnmyleswhite, I am reading the book-long thread on return types, and since you summarized it so well, would you have any suggestions on keeping the mapping of first-order harmonic functions type-stable?

@StefanKarpinski and @JeffBezanson, your suggestions are also welcome.

hpoit commented 7 years ago

Hello @jongwook, have you read this? https://hal.inria.fr/file/index/docid/820383/filename/PapadopoulosTzanetakis_MLN_ICASSP_2013.pdf

jongwook commented 7 years ago

Hi @hpoit, unfortunately no, I'm not very experienced with MLNs.

hpoit commented 7 years ago

Hello @dpsanders, MLN can learn from all your research interests. I am taking in all I can on Julia, including your very complete and explicit Julia material, to get this repo started. I am trying to build a flight plan first to figure out how much time I will need to get it done. Suggestions are welcome. Cheers.

hpoit commented 7 years ago

@sbromberger, label inference may solve the problem for cliques, will check.

For mapping variables of parents to nodes, instead of a table specifying the node's conditional probability for every state of its parents, a simpler distribution can be learned. The most popular choice is a probabilistic version of the logical OR operation, i.e. any cause alone can provoke an effect, with each cause having a certain probability of failing to do so. Heckerman and others have learned Bayesian nets that diagnose hundreds of infectious diseases this way. Google uses this type of BN in Adsense to learn from one hundred billion text snippets and search queries.

Now I still have to research on how to solve type stability.

hpoit commented 7 years ago

When you mentioned compiling, I think it was the right word http://www.cs.cornell.edu/courses/cs412/2008sp/lectures/lec12.pdf

hpoit / MLN.jl

Symbol and MLN representation #2