katydid / regex-derivatives-coq

Apache License 2.0
21 stars 7 forks source link

Simpler Logical operators for Brzozowski than nor #145

Closed awalterschulze closed 3 years ago

awalterschulze commented 3 years ago

Currently all logical operators are defined using the nor operator. This has proven to be very inconvenient and possibly overkill on my part.

I wonder if it would make more sense to define only the or, not and and operators and be happy with that as enough of an extended regular expression.

The positive is that proves for concat, that requires or, would be more natural. The negative is that we lose the possibility to handle all boolean operators.

Other idea is to make it really easy rewrite between the or regex constructor and disjunction in the property space.

This is something @Nielius and I discussed in a previous session, but didn't come to a conclusion. I think it is worth continuing this discussion.

CC @paulcadman

paulcadman commented 3 years ago

@awalterschulze - Is it possible to define or in terms of nor and use it as an abstraction? Also I'm not sure what you mean by the property space (I probably haven't caught up with this yet).

awalterschulze commented 3 years ago

Regarding property space: I just meant that we could proof a theorem to help to easily rewrite the or regex constructor to disjunction \/.

awalterschulze commented 3 years ago

We currently define or in terms of nor, but when we take the derivative of concat, it uses this or definition, which unfolds to:

Definition or (r s: regex) : regex :=
  nor (nor r s) (nor r s).

This means we are in a position where we need to untie at least two nots and start relying on decidability a lot.

I also suspect there is a way to abstract or better without including it in the inductive definition of regex. I am just not sure how yet.

awalterschulze commented 3 years ago

I suspect the work @Nielius and I started https://github.com/awalterschulze/regex-reexamined-coq/pull/144 is a start towards this better abstraction, but it feels like we are missing a puzzle piece.

awalterschulze commented 3 years ago

Thinking about it again. It is possible to represent all logic operators, using only two logic operators. or is the natural one we want, since it is used to define concat and concat is used again by star. We only need to add the not operator, which then allows us to represent nor

P | Q | P nor Q
-----------------
T | T | F
T | F | F
F | T | F
F | F | T

P | Q | not (P or Q)
-----------------
T | T | F
T | F | F
F | T | F
F | F | T

, which allows us to represent all logic operators, for example and

P | Q | P and Q
-----------------
T | T | T
T | F | F
F | T | F
F | F | F

P | Q | not ((not P) or (not Q))
-----------------
T | T | T
T | F | F
F | T | F
F | F | F

I think we should replace nor with not and or. It is still a minimal extension, but allows us to more naturally work for concat and star, which are typically the hardest proofs we have to do.

What do you think @paulcadman and @Nielius ?

Nielius commented 3 years ago

I think that makes sense. We're not introducing too many new constructors and these constructors are probably easier to work with.

I do still wonder if it would be good to add an and constructor to the regex inductive type. It feels a bit strange that we can't just say and p q (for regexes p and q), but instead have to express it as not ((not p) or (not q)), especially because it requires a difficult-to-prove theorem (the decidability of regexes) to conclude that they define the same language. And also from a sort of "philosophical" perspective: are and p q and not ((not p) or (not q)) really "the same" regexes? They denote the same language, but there are many regexes that denote the same languages, but we still don't consider them the same ("the same" as elements of the regex inductive type).

On the other hand, introducing another constructor for the inductive type does make the proofs a bit longer.

Nielius commented 3 years ago

On a related note: on the level of languages (of type str -> Prop), I think it probably makes a lot of sense to have or_lang, not_lang and and_lang, because without decidability, we can't really express and in terms of not and or. (We can postpone introducing and until we need it --- we might not ever need it.)

awalterschulze commented 3 years ago

I have attached a pull request, which replaces nor with or and not. As you say we can easily add and later, when we see we need it.

Nielius commented 3 years ago

Yes, I'm looking at the PR now :).