Closed interrogator closed 2 years ago
Hi,
Foremost, I want to thank you for using automathon, I was really excited to know that you were using it. You are absolutely right, I am going to look into the errors you mention and also if you want to help with your solution, I am completely open to receive your help.
Thank you very much, greetings from Colombia
Hey,
Thanks for the fast reply!
I've found a working solution at: https://github.com/lewiuberg/automata/blob/lambda/automata/fa/nfa.py#L154
But this means I currently have to use both libs (that one for the epsilon removal, yours for the visualisation), which isn't much good. And this fix is in a branch of a fork, which is not ideal.
I'm not much of a graph/automaton theorist, so I'm not sure I'll be able to code a solution. But if there is something I can do I'd be happy to try if I've got time.
Thanks again for your help. We're using this code to facilitate a kind of advanced way of searching natural language text, just in case you're curious!
Thank you for the solution!
I'll be working on it tomorrow and hope I can fix it as soon as possible.
I'm very happy that the library is helping you on the research!
Have a good day Daniel, hope you are going well.
I think I just fixed the errors, you can use the new version 0.0.8!
Thank you for report me the error, if there is anything that I can do for you just tell me or email me, all my contact information is on my profile.
Hey, wow, that fix is great, thanks so much for helping so quickly.
There is one other thing I'm trying to accomplish, wondering if you can help.
For example, I generate this NFA with epsilon:
After running the removeEpsilonTransitions()
method, I get:
What I would like to generate instead is the equivalent:
Would you consider coding an option for this, or providing me with some tips so that I can adjust your code and do it myself?
Thanks so much!
It is with pleasure for me!
What you propose is very interesting, I will work on what you say, what I can see is that I must make a minimization of the automaton.
Yes, minimization, that is probably the right term for it. I think you understand what I mean! If you can find a solution, that would be fantastic. I suppose it could be accessed by automaton.removeEpsilonTransitions(minimal=bool)
or even there could be a more general method, automaton.minimize()
which removes duplicate routes where possible. Anything you like would be fine with us! And let me know if there's something I can do to help.
Yes! I was thinking on one general method. I'm looking how to make it
Hey @interrogator I think I found a way to minimize a NFA, only created this method for the NFA class because the theory says that is different when you want to reduce the number of states on DFA, for the moment the minimize
function is only available for NFA. Here is a preview
Before
After
Hope this is what you're looking, if you find any error or anything that I can make for you just let me know.
Have a good day!
Looks great! I will try it out shortly!
OK, I gave it a try and the new algorithm seems to work well!
I think I have just one more issue: When I look inside the .delta
attribute, I see that the node names are stringified lists:
{"['0']": {'AA': ["['1']"]}, "['1']": {'BB': ["['10', '13', '2']"]}
Would it be possible for this data structure to be cleaned up, so that the keys are just 0
and the value's values are either a list or set of ints? Otherwise I have to eval
the data, which is never very nice.
Thanks again for your help, it's really working great!
Great!!
Let me see if I understand what you say.
From this:
{"['0']": {'AA': ["['1']"]}, "['1']": {'BB': ["['10', '13', '2']"]}
to one of these options:
{'0': {'AA': [['1']]}, '1': {'BB': [['10', '13', '2']]}
{'0': {'AA': ['1']}, '1': {'BB': ['2']}
{0: {'AA': [1]}, 1: {'BB': [2]}
Please tell me which one of the options.
It's a pleasure to help you.
I think using sets
might be best where possible:
{'0': {'AA': {'1'}}, '1': {'BB': {'10', '13', '2'}}
Of course I would be fine if integers could be used here, as long as nothing ever gets turned into a string, then I would pass in strings to begin with and end up with:
{0: {'AA': {1}}, 1: {'BB': {10, 13, 2}}
Make sense? Thanks again!
By the way, something else I found useful in a different lib, is the ability to renumber an automaton starting from 0 after minimization. For example, with the pythomata.impl.simple.SimpleNFA
class, you can do nfa.renumbering()
and your nodes start from 0 again. So this would be really nice to have as well.
Hi @interrogator
Didn't reply because I was on an ICPC competition, so excuse me for my absence. I think that for the moment I can change on the algorithm to return the result on this way: {'0': {'AA': ['1']}, '1': {'BB': ['2']}
while I rewrite the code to use sets instead of lists, and I was already thinking on that function, to "fix" the array results on the states, so I think that we are on the same page. Let me work on the renumbering function and then publish the update, and while you work with this update I'll be rewriting the code to use the sets instead of lists.
Hope this is what you're looking, if you find any error or anything that I can make for you, just let me know.
Hey @interrogator,
The renumber method is already implemented, this method doesn't return a NFA, simply changes the labels on the automata when you call the function: automata.renumber()
Hey,
I just got an unexpected error on what should be a very simple graph. Here I catch the error and print out the automaton's __dict__
at error time for you
{'Q': {'3', '2', '1', '0'}, 'sigma': {"token.upos = 'A'", "token.upos = 'C'", "token.upos = 'B'"}, 'delta': defaultdict(<class 'dict'>, {'0': {"token.upos = 'A'": {'1'}}, '1': {"token.upos = 'B'": {'2'}}, '2': {"token.upos = 'C'": {'3'}}}), 'initialState': '0', 'F': {'3'}}
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_24489/1628032756.py in <module>
2
3 query = '[upos="A"] [upos="B"] [upos="C"]'
----> 4 data = cquery(query=query, retrieve=True, epsilon=False, minimise=True)
5 data
~/work/cquery/cquery/cquery.py in cquery(db, corpus, id_field, query, prefilter, graph, automaton, epsilon, script, as_dict, retrieve, minimise, verbose)
581 if minimise:
582 try:
--> 583 auto = auto.minimize()
584 except:
585 print(auto.__dict__)
~/venv/py3.9/lib/python3.9/site-packages/automathon/finiteAutomata/nfa.py in minimize(self)
289 def minimize(self):
290 """Minimize the automata and return the NFA result of the minimization"""
--> 291 localDFA = self.getDFA()
292 localNFA = localDFA.getNFA()
293 localNFA.renumber()
~/venv/py3.9/lib/python3.9/site-packages/automathon/finiteAutomata/nfa.py in getDFA(self)
260
261 for t in T:
--> 262 T[t].sort()
263 tmp = T[t].copy()
264 if tmp not in visited:
AttributeError: 'set' object has no attribute 'sort'
This error is because you're using sets instead of lists on the delta variable, you have defined delta like this {'0': {"token.upos = 'A'": {'1'}}, '1': {"token.upos = 'B'": {'2'}}, '2': {"token.upos = 'C'": {'3'}}}
instead of this {'0': {"token.upos = 'A'": ['1']}, '1': {"token.upos = 'B'": ['2']}, '2': {"token.upos = 'C'": ['3']}}
. Actually, I'm rewriting the code to use sets instead of lists, but for the moment the update that I made yesterday was to add the renumber function.
Ah ok, thanks for that!
Hi @interrogator,
No problem, I have changed the use of lists on the delta variable on the NFA's, so you can use it right now like you were trying {'0': {"token.upos = 'A'": {'1'}}, '1': {"token.upos = 'B'": {'2'}}, '2': {"token.upos = 'C'": {'3'}}}
.
Have a good day!
I guess I can close this issue, since all the mentioned problems are solved! Are there any other updates planned?
Hi,
Thanks for the library!
I generate an NFA with the following data and attempt to remove epsilon transitions.
The result is incorrect: in the produced DFA, it becomes possible to jump from A to D without passing through B and C.
Another issue I've found is that if there are two mirroring epsilon transitions, i.e. from 1 to 2 and from 2 to 1, attempting to remove epsilon transitions breaks with a max recursion error.
Any idea what's happening? If need be I could potentially help with a fix!
Thanks