wader / libfa

C automata library to build, determinize, minimize, translate regexp etc
Other
10 stars 3 forks source link

Speed up determinization #2

Open huhuhugo1 opened 6 years ago

huhuhugo1 commented 6 years ago

Hey, I need your help.

I have two sets of regular expressions and after union, NFA for first set have 380 states, 453 transitions and NFA for second set have 241 states, 264 transitions.

The problem is, that determinization of second NFA takes very long, much longer, than first NFA, which is bigger and REGEXs are more complex. Do you have any idea, how can I speed up the process of determinization?

Thanks.

wader commented 6 years ago

Hello, sorry for the late reply! for reason i don't see new issues in my feed.

Could you give some example of sets of regexps that causes this problem? one issue i've seen quite often is that one or more regexps are unbounded which causes their states to "spread" when determinizing. For example make sure to anchor ^...$ (if not there is a implicit .* before and after) and use {min,max} instead of * or + which are unbounded.

wader commented 6 years ago

@huhuhugo1 any progress?