kylebgorman / pynini

Read-only mirror of Pynini
http://pynini.opengrm.org
Apache License 2.0

FST Visualization #32

Closed chuikova-e closed 3 years ago

chuikova-e commented 3 years ago

Hello, I am using version 2.1.1.

How can I get the visualization of pynini.transducer("foot", "feet") to look like this?

[image]

instead of like this?

[image]

kylebgorman commented 3 years ago

If you're simply referring to how it appears in a Jupyter notebook, you need to attach a symbol table to the automaton using the Fst methods set_input_symbols and set_output_symbols. This is purely cosmetic and doesn't change the interpretation of the automaton. You can also do this as a decorator or as a context manager using default_token_type.

For publication-quality graphics the Fst method draw has many other options, and also allows you to pass symbol table arguments without attaching them.
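As a rough illustration of the symbol-table approach described above, here is a minimal sketch. The helper names byte_symbol_pairs and attach_symbols are hypothetical, the input is assumed to be non-whitespace ASCII, and reserving label 0 for "<eps>" follows the usual OpenFst convention rather than anything stated in this thread. Pynini is imported lazily so that the pure-Python part runs on its own.

```python
def byte_symbol_pairs(*strings):
    """Returns (symbol, label) pairs, sorted by label, for the bytes used.

    Assumes non-whitespace ASCII input. Label 0 is reserved for epsilon.
    """
    labels = {0: "<eps>"}
    for string in strings:
        for byte in string.encode("utf8"):
            labels[byte] = chr(byte)
    return [(symbol, label) for label, symbol in sorted(labels.items())]


def attach_symbols(fst, pairs):
    """Attaches a symbol table built from `pairs` to both sides of `fst`."""
    import pynini  # Lazy import: only needed when this helper is called.

    table = pynini.SymbolTable()
    for symbol, label in pairs:
        table.add_symbol(symbol, label)
    # Purely cosmetic: the labels on the arcs are unchanged.
    fst.set_input_symbols(table)
    fst.set_output_symbols(table)
    return fst


pairs = byte_symbol_pairs("foot", "feet")
```

With Pynini installed, a call such as attach_symbols(pynini.cross("foot", "feet"), pairs) should then display character labels in a notebook. pynini.cross is assumed here to be the name of the string-pair constructor in recent releases.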


chuikova-e commented 3 years ago

Thank you for your help!

taufique74 commented 3 years ago

Previously I was using version 2.0.2, and the Jupyter Notebook visualization looked like this, which is much easier to work with:

[image: Screen Shot 2021-04-28 at 2 48 30 PM]

Currently I'm using version 2.1.3, and I tried this approach to build a symbol table:

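import pynini

sigma_star = pynini.union("a", "b", "c", "d", "e").closure().optimize()

# Map each character to its byte value, matching the default byte labels.
table = pynini.SymbolTable()
table.add_symbol('a', 97)
table.add_symbol('b', 98)
table.add_symbol('c', 99)
table.add_symbol('d', 100)
table.add_symbol('e', 101)

# Attach the table to both sides so the labels display as characters.
sigma_star.set_input_symbols(table)
sigma_star.set_output_symbols(table)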

Is this a proper way to create a SymbolTable? And if I want to create an FST with words, like the one below, how should I create a SymbolTable for it? In general, if I want to get visualizations like those in previous versions, what modifications should I make?

pynini.union('[one][two][three]')

Any hint would be a huge help; I'm having a hard time interpreting this visualization. Sorry if my questions are too basic, I'm new to pynini.

kylebgorman commented 3 years ago

Hi there.

That's a normal way of creating a symbol table, yes. If I'm using the same set of symbols (e.g., all the printable ASCII characters) over and over I would instead store it in a text file and load it whenever I needed it. Symbol tables have a .read_text class method for exactly this purpose.
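A minimal sketch of the text-file approach mentioned above, assuming the usual OpenFst text format of one symbol and its integer label per line, separated by whitespace. The function name write_ascii_symbols and the file name ascii.syms are hypothetical, and whitespace characters are left out because they would clash with the field separator.

```python
import os
import tempfile


def write_ascii_symbols(path):
    """Writes a text symbol table for the non-whitespace printable ASCII range."""
    with open(path, "w", encoding="utf8") as sink:
        # Label 0 is conventionally reserved for epsilon.
        print("<eps>\t0", file=sink)
        for label in range(33, 127):
            print(f"{chr(label)}\t{label}", file=sink)


path = os.path.join(tempfile.mkdtemp(), "ascii.syms")
write_ascii_symbols(path)
```

The resulting file could then be loaded with the class method named above, for example table = pynini.SymbolTable.read_text(path), and reused across sessions.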

In your second example, you used the square bracket syntax to create word-sized "generated symbols". These generated symbols are stored in a global symbol table shared across the module. To access that table, use pynini.generated_symbols().

Understand that we don't really optimize for the Jupyter visualization use case. IMO state diagram visualization isn't feasible beyond toy examples: the machines become unreadable very quickly as the problem becomes more complex.

K


taufique74 commented 3 years ago

Thanks so much! Passing pynini.generated_symbols() to set_input_symbols() solved my issue.

kalvinchang commented 3 years ago

Is there a way to access the integer labels of an FST created without a specific symbol table, such as pynini.accep('abcdef')?

[image]

I understand that the default token type is byte, and that I can get the integer labels from the ASCII codes of the letters, but is there a better way?

For context, I'm trying to manually create an arc using pywrapfst.Arc(ilabel, olabel, weight, nextstate), but the input label and output label must be integers. Should I just define a symbol table and get the labels from the table, or is it possible to access the integer labels through Pynini or pywrapfst functions?

Thanks!

kylebgorman commented 3 years ago

I would define a symbol table or, in the above case, use Python's built-in chr and ord functions to convert between Unicode characters and integers. The integers used in the default byte mode are just the corresponding bytes in base 10; the integers used in utf8 mode are the corresponding Unicode code points in base 10.
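The conversion described above can be sketched in plain Python. The function names byte_labels and codepoint_labels are hypothetical and only illustrate the two labeling schemes; they are not part of Pynini.

```python
def byte_labels(string):
    """Integer labels in byte mode: one label per UTF-8 byte."""
    return list(string.encode("utf8"))


def codepoint_labels(string):
    """Integer labels in utf8 mode: one label per Unicode code point."""
    return [ord(char) for char in string]


ascii_labels = byte_labels("abcdef")
# For ASCII text the two schemes agree; they differ for non-ASCII characters.
e_acute_bytes = byte_labels("é")
e_acute_codepoints = codepoint_labels("é")
# chr inverts ord, mapping a label back to its character.
round_trip = "".join(chr(label) for label in codepoint_labels("feet"))
```

Any of these integers could then be passed as the ilabel or olabel when constructing an arc by hand.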

kalvinchang commented 3 years ago

thanks!!