cjdrake / pyeda

Python EDA
BSD 2-Clause "Simplified" License
304 stars 54 forks source link

Less restricted variable names #132

Closed shader closed 8 years ago

shader commented 8 years ago

Right now the variable names in pyeda are restricted to those that match [a-zA-Z][a-zA-Z0-9_]*

I would like to have variables that represent additional relationships between other variables, and the nicest way to do so would be to permit operators within the variable names. It's true that this would allow variable names not immediately usable in python expressions, but I was going to automatically generate them and store them in dictionaries anyway.

Effectively, I am requesting that instead of only white-listing a limited set of variable names, you simply exclude those characters you think detract from the operation of the system, if any. That way I could use *, >, :, etc. as operators in my variable names.

johnyf commented 8 years ago

Using non-alphanumeric characters in variable names rarely has a good outcome. It is better (explicit) to store the relationships between variables in dictionaries or other suitable data structures dedicated to that purpose. Allowing strange variable names will eventually lead to some difficulty, even if a conflict does not currently exist.

For example, suppose that a new external and compiled tool is interfaced, but it takes as input only simple variable names. Then, a translation would be needed anyway, adding an extra level of complexity. Flat is better than nested, and simple better than complex (PEP 20). Such translations are easier to deal with at the surface, meaning outside PyEDA, and where the user has easier access and knows better the context.

shader commented 8 years ago

I understand the concern with using non-standard variable names, and fully agree that they should be avoided where possible. However, your arguments apply more to the current implementation than my proposal.

The change required is not an extension of the framework, but rather the removal of an unnecessary restriction. It does not change anything with respect to how PyEDA operates or how it would interact with external platforms. No additional translation would be required simply by reducing the restrictions. Any restrictions that apply to an external tool should be handled at that interface, not in PyEDA, as you also suggest. Interfaces always need error checking, and it's equally possible that an interfaced system might be more flexible and have more features rather than fewer.

I am merely proposing that the restrictions be adjusted to match the actual requirements of PyEDA, so that I am not forced to add additional, nested, complex translation just to support something the system would be able to do with less than one line of modification already.

cjdrake commented 8 years ago

The restriction seems arbitrary, but the reason for its existence is to match the behavior of the variable constructor with the boolean expression parser.

So the code from line 94 of pyeda/boolalg/boolfunc.py:

        if not re.match(r"^[^\d\W]\w*$", name, flags=re.UNICODE):
            fstr = "expected name to match (letter|'_')(letter|digit|'_')*, got {}"
            raise ValueError(fstr.format(name))

matches the code from line 187 of pyeda/parsing/boolexpr.py:

            (r"[a-zA-Z_][a-zA-Z0-9_]*", name),

(this is an entry in a lexical analysis table).

That said, it's true you could just comment out lines 94-96 of pyeda/parsing/boolexpr.py, and the engine doesn't really care what you name your variables since the internal representation only preserves a mapping from the name to the variable object.

Recently I've been hacking on a symbolic boolean algebra library in C++, boolexpr. In that library, I made different decision about how variables names work. You have to define a Context, which is sort of a local namespace for variables, and you can name them whatever you want. This wasn't because it mattered to me that people want to name variables weird things like a<b, but rather that there are an abundance of transformations (such as Tseytin) that add auxiliary variables to expressions, and there needs to be an easy way to organize them.

Anyways, kind of a long blurb. I'm okay with removing the variable name restriction, since it doesn't really harm anything. Feel free to submit a pull request.

shader commented 8 years ago

Thanks for the response, Chris. That helps explain the reasoning behind the restriction in more detail.

I'm not opposed to having some restrictions, though from my understanding of what you said the only purpose right now is to match the parsing code.

Would it make sense to have a 'strict' flag somewhere that defaults to true, but can be disabled to remove the restriction? Or would it be better to leave the restriction out altogether?

cjdrake commented 8 years ago

No need for a strict flag. It would be an unnecessary pain to know about.

Make sure to run the unit test suite, b/c I think there's a test for this behavior currently.