Inconsistent scanning of properties within sets

Outside of sets, Regexp::Scanner recognizes two types of nonproperty:

S = Regexp::Scanner
S.scan /\p{ascii}/    # => [[:property,    :ascii, "\\p{ascii}", 0, 9]]
S.scan /\p{^ascii}/   # => [[:nonproperty, :ascii, "\\p{^ascii}", 0, 10]]
S.scan /\P{ascii}/    # => [[:nonproperty, :ascii, "\\P{ascii}", 0, 9]]

Within sets, only one of them is recognized as nonproperty:

S.scan /[\p{ascii}]/  # => [..., [:set, :ascii, "\\p{ascii}", 1, 10], ...]]
S.scan /[\p{^ascii}]/ # => [..., [:nonproperty, :ascii, "\\p{^ascii}", 1, 11], ...]]
S.scan /[\P{ascii}]/  # => [..., [:set, :ascii, "\\P{ascii}", 1, 10], ...]]

And I guess you would actually see it as a bug that \p{^...} does not get the type :set like everything else in a set, is that right? That would be easy to fix, and I think it would make sense from the viewpoint of consistency to change that.

However, fixing that would still make it necessary to scan the data of properties in sets for /\\(P|p\{\^)/ to detect whether they are negative. That is different from how they are handled outside of sets and different from how classes are handled, as classes have their "polarity" encoded at index 1:

S.scan /[[:ascii:]]/  # => [..., [:set, :class_ascii, "[:ascii:]", 1, 10], ...]
S.scan /[[:^ascii:]]/ # => [..., [:set, :class_nonascii, "[:^ascii:]", 1, 10], ...]

IMHO the ideal solution would be if all the information that tokens generate outside of sets was also available when they occur in a set. I am aware that this would require a substantial refactoring and would not be backwards compatible. But it would greatly help acting on specific tokens (escapes are another example) wherever they occur, without having to scan the token data.

Any thoughts?

ammar / regexp_parser

Inconsistent scanning of properties within sets #28