In the first version of the matcher Status was defined as
data Status = Inactive | Active Bool
where the Bool signified whether the regular expression is in a final state of the underlying DFA. In the next version Status is defined as
data Status = Inactive | Active [Int]
where [Int] is a list of start indices of words that are accepted in the current state of the underlying DFA. In both versions the payloads are merged with an associative operation: (||) for Bool and (++) for [Int]. The corresponding identity elements (False and []) are used when combining regular expressions in sequence, to drop the status of the first component if the second does not accept the empty word.
This calls for generalising the Status type to
data Status m = Inactive | Active m
for an arbitrary Monoid m. The type Any from Data.Monoid can be used to solve the word problem, and the type Set Int from Data.Set can be used to track indices (which replaces the manual implementation of mergeIndices with Set.union). Apart from re-implementing the existing behaviour, different monoids can be used to implement new behaviour. For example
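newtype Min = Min { getMin :: Int }
instance Semigroup Min
  where
    -- keep the smaller of the two indices
    Min a <> Min b = Min (min a b)
instance Monoid Min
  where
    -- maxBound is the identity of min on Int
    mempty = Min maxBound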
can be used to implement leftmost (longest or shortest) match.
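As a rough illustration of why a minimum monoid yields the leftmost match, here is a minimal sketch. It assumes the identity is taken to be maxBound (so that combining with mempty changes nothing) and adds a Semigroup instance, which recent versions of base require for any Monoid. The helper leftmost is hypothetical and not part of the matcher.

```haskell
-- Sketch: a monoid that keeps the smallest start index seen so far.
newtype Min = Min { getMin :: Int }
  deriving (Eq, Show)

instance Semigroup Min where
  Min a <> Min b = Min (min a b)

-- maxBound is the identity of min on Int, so mempty <> x == x.
instance Monoid Min where
  mempty = Min maxBound

-- Hypothetical helper: combine candidate start indices; the result
-- carries the leftmost one (or mempty if there are no candidates).
leftmost :: [Int] -> Min
leftmost = foldMap Min
```

For instance, leftmost [5, 2, 9] evaluates to Min 2, while leftmost [] is mempty. Whether the leftmost match is then the longest or the shortest one would depend on how the matcher uses this status, which is not shown here.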
The type RegExp a needs to be changed to RegExp m a too.
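To make the generalisation more concrete, the following sketch shows Status m instantiated with Any and with Set Int. The functions merge and keepIf are assumptions made for illustration: the first treats Inactive as neutral and combines active annotations with (<>), the second mimics dropping the status of the first component in a sequence by resetting it to mempty. The actual matcher may organise this differently.

```haskell
import Data.Monoid (Any (..))
import qualified Data.Set as Set

-- Generalised status: inactive, or active with a monoidal annotation.
data Status m = Inactive | Active m
  deriving (Eq, Show)

-- Assumed merge: Inactive is neutral, active annotations combine via (<>).
merge :: Semigroup m => Status m -> Status m -> Status m
merge Inactive s = s
merge s Inactive = s
merge (Active a) (Active b) = Active (a <> b)

-- Hypothetical helper: keep the annotation only if the following
-- component accepts the empty word, otherwise reset it to mempty.
keepIf :: Monoid m => Bool -> m -> m
keepIf True m = m
keepIf False _ = mempty

-- Word problem: Any wraps Bool and combines with (||).
acceptsAny :: Status Any
acceptsAny = merge (Active (Any False)) (Active (Any True))

-- Index tracking: sets of start indices combine with Set.union.
startIndices :: Status (Set.Set Int)
startIndices =
  merge (Active (Set.fromList [0, 3])) (Active (Set.fromList [3, 7]))
```

Here acceptsAny is Active (Any True), and startIndices holds the set {0, 3, 7}, with the duplicate index 3 merged away by the set union.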