caleb531 / automata

A Python library for simulating finite automata, pushdown automata, and Turing machines
https://caleb531.github.io/automata/
MIT License
349 stars 64 forks source link

Add `min_length` and `max_length` parameters to `successors`/`predecessors` #207

Closed Tagl closed 4 months ago

Tagl commented 9 months ago

I think the successors/predecessors functions are quite useful for enumerating words that are accepted by the DFA. Sometimes I want to limit the length of the word in an infinite language. So far I have been limiting lengths by intersecting with DFA.of_length.

It might be a good idea to add parameters for this, as it may be far more efficient to take apply that approach than to intersect with another DFA. One additional benefit of this is it will guarantee that execution of the function ends, which is not true right now for infinite languages. We might even want to require max_length or default to some reasonable upper bound based on the size of the alphabet. I'm planning on changing the implementation myself to support this in the near future, assuming it is beneficial with regards to performance.

@caleb531 @eliotwrobson do you have any thoughts on this?

eliotwrobson commented 9 months ago

This seems like a pretty reasonable change that shouldn't blow up the API too much. Go for it!

caleb531 commented 9 months ago

@Tagl This is a great idea, and seems very practical. I would suggest something like an optional max_length parameter with a default value of None (meaning no limit). And min_length with a default value of 0.

@eliotwrobson What is our convention in the codebase for length-related variable/parameter names? So we use "length" or "len"?

eliotwrobson commented 9 months ago

@caleb531 I think we usually use "length" (since "len" shadows the built-in function), but this might be inconsistent in some places.

caleb531 commented 9 months ago

@eliotwrobson Yeah, now that I'm at my personal machine, I performed a quick search in the codebase. I see many uses of "length", e.g.:

So far, the only deviation I can find is _populate_count_cache_up_to_len, but that is an internal method (which perhaps we should rename for consistency).

Regardless, I think our established convention seems pretty evident from the above. And as for max_length vs. max_word_length, the DFA.of_length() method omits "word" in the parameter names to give min_length and max_length, so I think the latter are what we want to use here.

eliotwrobson commented 5 months ago

@Tagl have you been able to take a look at this? I actually have a use case where this would be very handy, so I'd like to include it in the version we release once the current package review is complete.

Tagl commented 5 months ago

Oh I thought I'd made the PR for this already. My bad! I'll find my implementation and get a PR set up soon.