kylebgorman / pynini

Read-only mirror of Pynini
http://pynini.opengrm.org
Apache License 2.0

how to create empty FST #40

Closed kalvinchang closed 3 years ago

kalvinchang commented 3 years ago

How can I create an empty FST with pynini without resorting to pywrapfst.Compiler().compile()?

I've tried pynini.difference('', ''), but this results in wacky behavior similar to that in #12:

pynini.compose('a', pynini.difference('', '').concat('a')).num_states() should be nonzero, since 'a' should be accepted, but it returns 0.

kylebgorman commented 3 years ago

Just use pynini.Fst.

kalvinchang commented 3 years ago

How can I create an empty FST that does accept the empty string?

fst = pynini.Fst()
s = fst.add_state()
fst.set_start(s)
fst.set_final(s)

fst.concat('a')

This accepts 'a' as needed, but it also accepts the empty string.

However, if I do not set s to be the final state, fst.concat('a') no longer accepts 'a', which I expected it to.

kylebgorman commented 3 years ago

Yeah, what you did there looks fine to me:

    import pynini

    def epsilon_fst():
        # One state that is both start and final: accepts only "".
        f = pynini.Fst()
        s = f.add_state()
        f.set_start(s)
        f.set_final(s)
        return f

The only alternative I see is pynini.accep("") or something like that. That should work though I kind of like the explicit approach taken above better.

kalvinchang commented 3 years ago

Thank you for your reply! I will keep that in mind. However, I miswrote, sorry!! I meant to ask: How can I create an empty FST that does NOT accept the empty string?

kylebgorman commented 3 years ago

pynini.Fst() does not accept the empty string, nor does, say, pynini.union("foo", "bar"). I am not sure I understand your question, though.

kalvinchang commented 3 years ago

I guess I wanted to create an empty FST that, when concatenated with a nonempty FST, still accepts what the nonempty FST is supposed to accept. For example,

empty = pynini.Fst()
empty.concat('a')
print(pynini.compose('a', empty).num_states()) # 0; does not accept 'a'
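
The result above may be easier to see with languages written out as plain sets of strings. This is only a minimal sketch in pure Python, not pynini code; `concat`, `EMPTY_LANGUAGE` and `EPSILON` are hypothetical names used for illustration. The assumption is that an FST with no final state corresponds to the empty set, and an FST with a single start-and-final state corresponds to the set containing only the empty string.

```python
def concat(left, right):
    """Concatenation of two finite languages modelled as sets of strings."""
    return {x + y for x in left for y in right}


# Hypothetical stand-ins, not pynini objects:
EMPTY_LANGUAGE = set()  # accepts nothing, like an FST with no final state
EPSILON = {""}          # accepts only the empty string

# The empty language absorbs anything concatenated with it.
print(concat(EMPTY_LANGUAGE, {"a"}))  # set()

# The epsilon language is neutral: the result is exactly {"a"}.
print(concat(EPSILON, {"a"}))  # {'a'}
```

In this model, the seed that leaves the other operand unchanged is the epsilon language, and the concatenation `{""} · {"a"}` does not contain the empty string even though the seed itself does.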

kylebgorman commented 3 years ago

I guess I wanted to create an empty FST that when concatenated to a nonempty FST, still accepts what it is supposed to accept.

Can I ask: why?

kalvinchang commented 3 years ago

I am trying to convert (limited types of) regex to an FSA. For example, the regex "..*" should not accept empty strings. My current approach is to iterate through each symbol and concatenate the FSTs corresponding to each symbol. The FST that I start off with should not accept empty strings.

kylebgorman commented 3 years ago

The concatenation approach makes sense to me, but you'll have to special-case the creation of the "first" FST, as far as I can see. Instead of concatenating it with a null FST, you'll have to just use it as the root FST.

On Wed, Apr 21, 2021 at 3:06 AM Kalvin Chang @.***> wrote:

I am trying to convert (limited types of) regex to an FSA. For example, the regex "..*" should not accept empty strings. My current approach is to iterate through each symbol and concatenate the FSTs corresponding to each symbol. The FST that I start off with should not accept empty strings.

kalvinchang commented 3 years ago

That makes sense. I will try that. Thank you!