Closed: sroomf closed this 1 year ago
cc @johnseekins @bfossen-ce your thoughts are appreciated!
:wave: @showerst has done a lot of the heavy lifting on action classifiers & will be able to weigh in on places where the action text alone isn't enough to go on. That isn't to say this proposal doesn't move things in the right direction, just that it may need to encompass some other fields/data.
As a minor, related nit, I'd consider changing your proposed data structure from
_actions = {
    "bill action phrase": {
        "type": "string comparison or regex",
        "mappings": ["OS classification mapping"]
    },
    "read the second time": {
        "type": "compare",
        "mappings": ["reading-2"]
    },
    "^introduced": {
        "type": "regex",
        "mappings": ["introduction"]
    },
}
to
actions = [
    {"type": "compare", "value": "read the second time", "mappings": ["reading-2"]},
    {"type": "regex", "value": "^introduced", "mappings": ["introduction"]},
]
for flexibility, or even take it a step further and introduce ClassifierRule classes:
actions = [
    ClassifierRule(regex=r"^introduced", mappings=["introduction"]),
    ClassifierRule(string=r"read the second time", mappings=["reading-2"]),
]
Which would allow for flexibility down the line.
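To make the list-of-dicts form concrete, here is a minimal sketch of how such rules might be applied to an action's text. The `classify` helper is hypothetical, and it assumes that "compare" means a case-insensitive substring check and that regexes run against the lowercased text; the real semantics would be whatever the proposal settles on.

```python
import re

# Rules in the list-of-dicts form suggested above.
actions = [
    {"type": "compare", "value": "read the second time", "mappings": ["reading-2"]},
    {"type": "regex", "value": "^introduced", "mappings": ["introduction"]},
]


def classify(text, rules):
    """Hypothetical helper: collect the mappings of every rule that matches."""
    found = set()
    lowered = text.lower()  # assumption: matching is case-insensitive
    for rule in rules:
        if rule["type"] == "compare":
            # assumption: "compare" is a substring check
            matched = rule["value"] in lowered
        elif rule["type"] == "regex":
            matched = re.search(rule["value"], lowered) is not None
        else:
            raise ValueError(f"unknown rule type: {rule['type']}")
        if matched:
            found.update(rule["mappings"])
    return sorted(found)
```

Because the rules are an ordered list rather than a dict keyed by phrase, the same phrase can appear in more than one rule, and extra keys can be added to a rule later without changing the container.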
Either way, glad to see this happening!
This looks like a great idea! It would be wonderful to have all that messy old code standardized.
Edge cases to think about:
Thank you for your fantastic suggestions/comments/insights @showerst and @jamesturk ! Super helpful 👍
Tim, I can definitely see how detailed and varied these edge-case jurisdictions are. I imagine the categorize_actions() function will probably look slightly different from jurisdiction to jurisdiction, depending on how action classification needs to be handled for each one.
> For flexibility, or take it a step further even and introduce ClassifierRule classes:
>
>     actions = [
>         ClassifierRule(regex=r"^introduced", mappings=["introduction"]),
>         ClassifierRule(string=r"read the second time", mappings=["reading-2"]),
>     ]
>
> Which would allow for flexibility down the line.
@jamesturk, would you be able to elaborate on what the ClassifierRule class would look like? I haven't worked with classes in this way before!
Glad to!
What I was thinking was that there could be a very simple interface like this: (this isn't quite right, but hopefully gets the point across)
from abc import ABC, abstractmethod

class ClassifierRule(ABC):
    @abstractmethod
    def classify_action(self, action) -> set[str]:
        pass
And then:
import re

class RegexRule(ClassifierRule):
    def __init__(self, regex, mappings):
        # compile once so classify_action can call .match() on the pattern
        self.regex = re.compile(regex)
        self.mappings = set(mappings)

    def classify_action(self, action):
        if self.regex.match(action.text):
            return self.mappings
        else:
            return set()

class CustomRule(ClassifierRule):
    def classify_action(self, action):
        # let's say a state had custom numeric codes per action
        # (I believe NM?) this method could then just look those
        # codes up in a dictionary and return those
        return set()  # placeholder until the lookup is written
Then, you'd have a list of ClassifierRule subclasses. Most or all might be RegexRule (or PrefixRule or whatever), but you'd have the option to incorporate/run custom code per action.
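As a rough illustration of the other rule types mentioned here, the sketch below shows what a PrefixRule and a dictionary-backed custom rule might look like. Everything in it is an assumption made for illustration: the `Action` dataclass, its `code` field, the `CodeLookupRule` name, and the numeric codes in the usage note are invented, not taken from any real scraper.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Action:
    # Hypothetical stand-in for a scraped action; only the fields used here.
    text: str
    code: str = ""


class ClassifierRule(ABC):
    @abstractmethod
    def classify_action(self, action) -> set[str]:
        ...


class PrefixRule(ClassifierRule):
    """Matches when the action text starts with a fixed prefix (case-insensitive)."""

    def __init__(self, prefix, mappings):
        self.prefix = prefix.lower()
        self.mappings = set(mappings)

    def classify_action(self, action):
        if action.text.lower().startswith(self.prefix):
            return set(self.mappings)
        return set()


class CodeLookupRule(ClassifierRule):
    """Looks up a jurisdiction-specific action code in a dictionary."""

    def __init__(self, code_map):
        # code_map: {"<code>": ["classification", ...]}
        self.code_map = {code: set(m) for code, m in code_map.items()}

    def classify_action(self, action):
        # unknown codes yield no classification rather than an error
        return set(self.code_map.get(action.code, set()))
```

A jurisdiction could then mix both kinds in one list, for example `[PrefixRule("Introduced", ["introduction"]), CodeLookupRule({"101": ["introduction"]})]`, where `"101"` is a made-up code.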
You'd apply them something like:
for action in actions:
    classification = set()
    for rule in state.action_classifiers:
        classification |= rule.classify_action(action)
    action.classification = list(classification)
This could also be extended so that each rule can specify whether rule processing should halt once it matches.
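One possible shape for that halting behaviour, as a sketch only: the `halt` flag, the `classify_actions` helper, and the `Action` dataclass are hypothetical names, and the rule here uses `re.search` with case-insensitive matching rather than `.match`, which is also an assumption.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Action:
    # Hypothetical stand-in for a scraped action.
    text: str
    classification: list = field(default_factory=list)


class RegexRule:
    def __init__(self, regex, mappings, halt=False):
        self.regex = re.compile(regex, re.IGNORECASE)
        self.mappings = set(mappings)
        self.halt = halt  # hypothetical flag: stop processing once this rule matches

    def classify_action(self, action):
        if self.regex.search(action.text):
            return set(self.mappings)
        return set()


def classify_actions(actions, rules):
    """Apply rules in order; a matching rule with halt=True ends the loop early."""
    for action in actions:
        classification = set()
        for rule in rules:
            matched = rule.classify_action(action)
            classification |= matched
            if matched and rule.halt:
                break
        action.classification = sorted(classification)
```

With this design rule order matters: a halting rule placed early suppresses every later rule for the actions it matches, so the most specific rules would go first.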
Hope this is more clear, let me know if I can clarify anything.
Creating an Open States Enhancement Proposal for updating action classification. In summary, we would like to move all code related to action classification into its own actions.py for each jurisdiction. The EP contains details and justification for this change.