[Proposal] New DFA Type

caleb531 / automata

A Python library for simulating finite automata, pushdown automata, and Turing machines

https://caleb531.github.io/automata/

MIT License

349 stars 64 forks source link

[Proposal] New DFA Type #157

Closed eliotwrobson closed 1 year ago

eliotwrobson commented 1 year ago

There has been some discussion about the memory usage of the DFA and ways of remedying this. I was glancing at the literature and saw a paper discussing a delayed DFA (or D2FA). References below:

Bypassing_Space_Explosion_in_High-Speed_Regular_Expression_Matching.pdf 2306.12771.pdf

This essentially uses a generalization of partial DFAs to reduce the number of transitions over large alphabets. It can be constructed from a DFA using a number of different algorithms (detailed in the above papers), and if we implement this as a class in this library, we can use some more space-efficient data structures than for vanilla DFAs. Memory efficient algorithms like enumeration of the language with wall-following should be implementable within this framework as well.

To clarify: This would not be for the v8 release, but after that is done and we're able to get feedback on the changes made there.

caleb531 commented 1 year ago

@eliotwrobson First thought: D2FA = Deterministic 2-Factor Authentication 😜

What are the behavioral semantics of a D2FA? Do they accept different parameters than a normal DFA? Does it read input the same way?

I ask all these questions because otherwise, it seems like the memory efficiency gain would be more of an implementation detail that could be integrated into the existing DFA class—as opposed to something that is worthy of standing as a separate entity conceptually (at least for this library).

eliotwrobson commented 1 year ago

@caleb531 the D2FA differs significantly in it's mathematical description from a DFA, in a more significant way than just a partial DFA, and has to be constructed from a DFA using a pretty technical algorithm. Input is read in a slightly different way, but it's different enough that it warrants creating a separate class rather than adding onto the DFA itself (even without more specialized memory efficient data structures).

So in essence, the proposal here is twofold: Making a D2FA class, and implementing this class using more efficient data structures at the cost of less flexibility. The second item might not be the right direction, but the first is at least interesting.

caleb531 commented 1 year ago

I understand now, thank you for the explanation! Then yes, I'd say you are welcome to try your hand at implementing a D2FA class. I probably wouldn't merge the PR until after a v8 release, so I leave that up to you regarding the timing of this work.

eliotwrobson commented 1 year ago

Going to mark this as not planned, not necessarily a bad idea or anything but I think the uses for this will be very niche, so development time is better spent elsewhere for now. Will reopen in the future if there is interest / demand.