facebookresearch / gtn

Automatic differentiation with weighted finite-state transducers.
MIT License
452 stars, 40 forks

What is the difference between this work and K2? #22

Closed (by2101 closed 3 years ago)

xingchensong commented 3 years ago

I think this paper (ICLR 2021 under double-blind review) is related to this repo...

awni commented 3 years ago

Thanks for the question. There is in fact a lot of overlap. In particular, the high-level vision of seamlessly and efficiently using WFSTs with automatic differentiation is virtually the same. The two frameworks were developed simultaneously and, for most of their lives, without knowledge of one another. We only learned of K2 a couple of weeks ago :).

We started developing an earlier version of GTN in January 2020, and it looks like K2 started around the spring of 2020 (?), so the two are on a similar time frame.

However, I am sure there are and will be some differences in the API and the implementation. The details remain to be seen pending the release of K2.

The linked ICLR submission is indeed related to this framework. The preprint is also available on arXiv.

csukuangfj commented 3 years ago

@awni

I find the graph in Figure 2 (e) of the paper quite confusing. The result of composing the token graph and the label graph is an FST with letters as ilabels and words as olabels. However, the emission graph is an FSA whose labels are letters. How can you compose an alignment graph with an emission graph?

I see in the example code that you actually convert the alignment graph to an FSA before composing: https://github.com/facebookresearch/gtn/blob/f7900f41edc65843515b1d2f02d6fa40846e33a4/bindings/python/examples/word_decompositions.py#L81-L84

Are you omitting something in the paper?

awni commented 3 years ago

Oh right, thanks for catching that. It should probably be E_x \circ A \circ Y. For simplicity of notation, we use \circ to mean either composition or intersection. Also, if you have more questions about the paper specifically, please email me. Let's keep this thread focused on the topic at hand, i.e. the differences between K2 and GTN.
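
For readers following the label-matching point above, here is a minimal pure-Python sketch of why the order of composition matters. It does not use the GTN or K2 APIs: `Arc`, `Graph`, `compose`, `project_input`, and `accepting_paths` are hypothetical helpers written for this illustration, the graphs are toy examples (two letters, two placeholder word labels `W1` and `W2`), and epsilon transitions are not handled. Composition here matches the output labels of the first graph against the input labels of the second and adds the weights.

```python
from collections import namedtuple

# Toy arc: source state, destination state, input label, output label, weight.
Arc = namedtuple("Arc", "src dst ilabel olabel weight")


class Graph:
    """Minimal acyclic graph with one start state and one accept state."""

    def __init__(self, start, accept, arcs):
        self.start = start
        self.accept = accept
        self.arcs = list(arcs)


def compose(g1, g2):
    """Pair arcs where g1's output label equals g2's input label.

    Weights are added (log semiring). Epsilons are not handled.
    """
    arcs = [
        Arc((a.src, b.src), (a.dst, b.dst), a.ilabel, b.olabel, a.weight + b.weight)
        for a in g1.arcs
        for b in g2.arcs
        if a.olabel == b.ilabel
    ]
    return Graph((g1.start, g2.start), (g1.accept, g2.accept), arcs)


def project_input(g):
    """Turn a transducer into an acceptor over its input labels."""
    arcs = [Arc(a.src, a.dst, a.ilabel, a.ilabel, a.weight) for a in g.arcs]
    return Graph(g.start, g.accept, arcs)


def accepting_paths(g):
    """List (ilabels, olabels, weight) for every start-to-accept path."""
    found = []

    def walk(state, ilabels, olabels, weight):
        if state == g.accept:
            found.append((ilabels, olabels, weight))
        for arc in g.arcs:
            if arc.src == state:
                walk(arc.dst, ilabels + (arc.ilabel,),
                     olabels + (arc.olabel,), weight + arc.weight)

    walk(g.start, (), (), 0.0)
    return found


# Alignment graph A: letters on the input side, toy word labels on the output side.
A = Graph(0, 2, [Arc(0, 1, "a", "W1", 0.0), Arc(1, 2, "b", "W2", 0.0)])

# Emission graph E: an acceptor over letters for two frames (made-up scores).
E = Graph(0, 2, [
    Arc(0, 1, "a", "a", -0.25), Arc(0, 1, "b", "b", -2.0),
    Arc(1, 2, "a", "a", -1.5), Arc(1, 2, "b", "b", -0.5),
])

# A then E: A's word outputs never match E's letter inputs, so nothing survives.
bad = accepting_paths(compose(A, E))

# Projecting A onto its input labels first makes the labels line up.
fixed = accepting_paths(compose(project_input(A), E))

# E then A: E's letter outputs match A's letter inputs directly.
direct = accepting_paths(compose(E, A))

print(bad)
print(fixed)
print(direct)
```

In this toy setup, composing the alignment graph with the emission graph in that order yields no accepting path, while either projecting the alignment graph to an acceptor first or placing the emission graph on the left gives a valid result. The second ordering keeps the word labels on the output side, so a further composition with an acceptor over words would still be possible.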