[WIP] Making standard XNMT Graph interface

philip30 commented 5 years ago

These are something that I am working on now. I want to merge this to avoid further code conflict/redundancy in the future.

Basically, I am working on:

Making shift-based RNNG based parser, I already have the reader for CONLL format and working examples. Though the parser is still work in progress
Making 1 package for simultaenous translation experiment. This package has its own translator and search strategies.
Some fixes to the previous merge.

msperber commented 5 years ago

Hey @philip30 , I was wondering if it would be possible to separate this out into the code that's WIP and the code that's ready to be used and only send a PR for the latter? I think at this stage in the project it would be best to keep the master branch clean from unfinished code as much as possible because it might be confusing and difficult to maintain for others. I understand that you want to avoid merge conflicts of course. I have recently followed a strategy of keeping all my working code in separate files, and only make PRs for my changes to the files that are already in the master branch. That has worked pretty well for me and avoids most merge conflicts.

philip30 commented 5 years ago

Hey @msperber, thank you for you suggestion. But this is actually working code, and I believe @armatthews is going to implement some RNNG as well? Because I already did that, it'd be best if we make a common ground by merging the working code to master.

msperber commented 5 years ago

Ok, if this is all functional code we should work toward merging it. Based on some quick skimming, a couple of comments:

graph structure: I've already implemented a data structure in sent.Lattice which looks very similar. It would be better to merge the two, either by renaming lattice to graph in case they share the same functionality, or by making lattice a subclass of graph or something. In any case, graph should be a subclass of sent.Sentence and implement the interface accordingly.
simultaneous translation: I saw that you included both unit and integration tests which is great. But I couldn't find any description on what this feature is actually doing. Most of XNMT is documented quite well by now, could you add docstrings and type annotations similarly here? Also, it would be important to have some high-level explanation, e.g. by writing a description in the test config file, or in the module-level docstring, or some other place in the documentation.
RNNG: it would probably make sense for @armatthews to comment on this?

armatthews commented 5 years ago

I have also already implemented RNNG. See my syntax branch. My implementation is mostly done, and it's trainable, though we can't decode with it yet.

I'm happy to work towards a merge, but I agree with keeping unfinished stuff in the branches.

philip30 commented 5 years ago

@armatthews I can't see your RNNG in the syntax branch. Maybe you haven't pushed it yet? The latest commit was the part when you read in the syntax tree. It is good that we have two different types of reader, (yours reading the syntax tree while mine reading the CONLL format)

armatthews commented 5 years ago

My RNNG code is only on my own branch. This link should take you directly to the RnngDecoder implementation.

armatthews commented 5 years ago

It looks like your implementation of RNNGs is for dependency trees whereas mine uses the phrase-structure tree formulation, so they may be complimentary, though we might need to clean up the names a bit.

I like that you generalized things by allowing general (hyper-)graphs. That seems like it could unify several of these different syntax (or other!) structures together.

Aside from the RNNG stuff, I see that you added a last_model_scores member to the scorer, which concerns me a bit. In general, I think that we should try to keep objects stateless so that things don't get tricky when operating in parallel or on non-linear structures. Is it possible to move the last_model_scores into the SimultaneousTranslator class, rather than making it part of the scorer?

Having the simultaneous translator model is really nice. It seems you also put a lot of work into improving the RL training, which I hope I can put to good use soon :)

philip30 commented 5 years ago

@msperber I don't agree with what you said that graph should implement the sentence interface. Here, the graph is just an intermediate representation of the CONLL input (which is list of lines), and the form I need is just a sequence of {shift, nt, reduce} action, a much simpler version of the graph. So I'd implement RNNGActionSentence instead of the GraphSentence.

Also, my graph implementation is based on the list of edges, whereas your lattice implementation is based on direct DAG implementation from object pointing to other objects? I think they are not very compatible, unless we want to turn the lattice implementation into the list of edges implementation.

philip30 commented 5 years ago

@armatthews I take a look of your implementation and it is very good. Maybe I can try to merge the work together by pulling your branch and work it on my branch and push it here? We can, of course, talk about the way of merging it.

msperber commented 5 years ago

I understand your point @philip30, and also think that it would be good to implement RNNGActionSentence.

The main reason why I think it would be great to unify the data structures is that there are many ways to use graphs in neural models by now, and it would be cool to be able to reuse data structures, input readers, etc. across different places. For example, the currently implemented LatticeLSTM is actually just a TreeLSTM that is bidirectional and happens to support lattices as inputs. By unifying our data structures, we would immediately have support for tree2seq models in XNMT. The TreeLSTM doesn't support hypergraphs, but that could also be added in the future if needed.

It would be perfectly fine to adjust the currently implemented lattice data structure though. The only requirements are:

loopable nodes stored in topological ordering: https://github.com/neulab/xnmt/blob/master/xnmt/transducers/lattice.py#L87
access to predecessor nodes for each node: https://github.com/neulab/xnmt/blob/master/xnmt/transducers/lattice.py#L96
support to reverse the graph: https://github.com/neulab/xnmt/blob/master/xnmt/sent.py#L499
three different versions of confidence scores assigned to nodes (we would probably simply use different subclasses for a general GraphNode interface): https://github.com/neulab/xnmt/blob/master/xnmt/sent.py#L399

I don't know much about the requirements on your part, but do you think it would be possible to unify things?

A positive side effect of merging our code would also be that you could plot trees using the code I already implemented: https://github.com/neulab/xnmt/blob/master/xnmt/sent.py#L545

philip30 commented 5 years ago

Alright, I will be working on it :)

philip30 commented 5 years ago

@msperber I already merged the Lattice with the graph interface. Can you take a look at the design? Also, I still have an error in the topological sort part, in the transducers/lattice.py. I am not sure which part is wrong? I guess the error comes when it tries to topo sort the reversed lattice?

But I have made sure that the graph operations are valid (as seen in the unittest). Your help is much appreciated! Thank you

philip30 commented 5 years ago

I hacked and fixed it :), I think it's important to store the topo_sort results first before using that.

msperber commented 5 years ago

Cool, this looks very nice! So in the end we might have a CoNLLTreeReader that outputs either GraphSentence or RNNGActionSentence objects, and a LatticeReader that outputs LatticeSentence objects?

Actually, it might be possible to get rid of the special LatticeSentence class. I think the main reason for it is to be able to reverse the scores for each node whenever the graph is reversed, but we could simply implement HyperNode.reverse() that returns self by default, or returns a node with reversed probs in the case of LatticeNode. When reversing the graph, it would then simply add a call to reverse() for each node.

BTW I also have some more graph-related code which I want to merge at some point (though probably not within the next couple of weeks), so I'll also take a look at whether things can be further simplified after this PR has been merged.

philip30 commented 5 years ago

@msperber Thanks! I have a question.

Actually, it might be possible to get rid of the special LatticeSentence class. I think the main reason for it is to be able to reverse the scores for each node whenever the graph is reversed, but we could simply implement HyperNode.reverse() that returns self by default, or returns a node with reversed probs in the case of LatticeNode. When reversing the graph, it would then simply add a call to reverse() for each node.

Would this be over-simplify things? So you don't care about the direction of the edge? I think it is also a bit strange to have a reverse() operation in a graph...

msperber commented 5 years ago

Hm, no I didn't mean to get rid of the reverse operation, but it's currently implemented in GraphSentence and then overridden in LatticeSentence, and I was wondering if we could get rid of the latter class altogether.

The above was just a suggestion, but I think if you'd prefer it'd also be fine to keep both GraphSentence and LatticeSentence classes for now, as this should already be enough to support encoding trees (in the form of GraphSentence) using the LatticeLSTM transducer.

philip30 commented 5 years ago

@msperber I get rid of the classes and did revision based on your suggestion. I hoped it looks fine now? After this, I will pull the code from @armatthews repo and try to merge the RNNG works.

msperber commented 5 years ago

Yes, this looks great, thanks!

philip30 commented 5 years ago

I am merging right now. Currently, there is a bunch of undocumented code in SimultaneousTranslator, yet it is tested. I will soon follow up with documentation when things are less busy.

neulab / xnmt

[WIP] Making standard XNMT Graph interface #564