mscarey / AuthoritySpoke

Reading legal authority for the last time
https://authorityspoke.readthedocs.io

Organize text anchors imported from JSON #81

Closed (mscarey closed this issue 3 years ago)

mscarey commented 4 years ago

When a JSON file like holding_lotus.json is read, the output should be organized into three lists of NamedTuples, like this:

anchored_holdings=List[AnchoredHolding(Holding, TextPositionSet, List[TextQuoteSelector])]
anchored_enactments=List[AnchoredEnactment(Enactment, TextPositionSet, List[TextQuoteSelector])]
anchored_factors=List[AnchoredFactor(Factor, TextPositionSet, List[TextQuoteSelector])]
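As a minimal sketch of the shape described above, here is how one of the three NamedTuples could be declared. The three placeholder classes stand in for the library's real Holding, TextPositionSet, and TextQuoteSelector, and the field names `holding`, `positions`, and `quotes` are hypothetical, since the issue only gives the field types.

```python
from typing import List, NamedTuple


class Holding:
    """Placeholder standing in for the library's Holding class."""


class TextPositionSet:
    """Placeholder for a set of character ranges in the Opinion text."""


class TextQuoteSelector:
    """Placeholder for a selector that locates a passage by quoting it."""


class AnchoredHolding(NamedTuple):
    # Field names are assumptions; the issue only lists the types.
    holding: Holding
    positions: TextPositionSet
    quotes: List[TextQuoteSelector]


anchored_holdings: List[AnchoredHolding] = [
    AnchoredHolding(Holding(), TextPositionSet(), [TextQuoteSelector()])
]

# A NamedTuple can be unpacked positionally or read by field name.
holding, positions, quotes = anchored_holdings[0]
```

AnchoredEnactment and AnchoredFactor would follow the same pattern, with Enactment or Factor as the first field.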

This replaces the previous approach of using Enactments and Holdings as dict keys. Those classes are no longer frozen, so their instances aren't hashable and can't be used as keys.
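To illustrate why unfrozen objects stop working as dict keys, here is a small sketch that assumes the classes are Python dataclasses (the issue doesn't say so explicitly). `FrozenEnactment`, `MutableEnactment`, and the node path are hypothetical names used only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenEnactment:
    node: str


@dataclass
class MutableEnactment:
    node: str


# A frozen dataclass gets a generated __hash__, so it can be a dict key.
anchors = {FrozenEnactment("/us/const/amendment/IV"): ["selector"]}

# A non-frozen dataclass with the default eq=True has __hash__ set to None,
# so using an instance as a key raises TypeError.
try:
    {MutableEnactment("/us/const/amendment/IV"): ["selector"]}
    mutable_is_hashable = True
except TypeError:
    mutable_is_hashable = False
```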

The TextPositionSet and list of TextQuoteSelectors are redundant, but I don't know which will be more useful for locating quoted passages in the opinion. Any change to the Opinion text (or maybe markup) would throw off the TextPositionSet, and Opinions are much larger and harder to structure than Enactment nodes.
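The tradeoff can be sketched with plain strings rather than the library's selector classes. This is only an illustration: a (start, end) tuple stands in for a position-based anchor and a quoted substring stands in for a quote-based one, and the sample sentence is made up.

```python
original = "The court held that the statute applies."
quote = "the statute applies"

# Position-based anchor: character offsets into the original text.
start = original.find(quote)
end = start + len(quote)

# Inserting markup or text before the passage shifts every later offset.
edited = "[1] " + original

# The stored offsets now select the wrong span of the edited text...
stale = edited[start:end]

# ...while searching for the quoted text still finds the passage.
relocated = edited.find(quote)
```

The quote-based lookup survives the edit here, but it would fail if the quoted words themselves were changed, which may be part of why keeping both kinds of anchor looked attractive.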

mscarey commented 4 years ago

New option: maybe just leave the anchors on the objects where they're stored in the JSON input format, but add an Opinion.enactments method that dedupes the Enactments and collects all selectors for the same Enactment on a single copy of it. A separate method could then strip the anchors/selectors from an Enactment when it's moved off the Opinion object into a context where its text links to the Opinion are no longer relevant.
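A rough sketch of that idea follows. It is not the library's implementation: `EnactmentStub`, `dedupe_enactments`, and `without_anchors` are hypothetical names, anchors are reduced to plain strings, and two Enactments are treated as "the same" when their `node` matches, which is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EnactmentStub:
    """Simplified stand-in for an Enactment that carries its own anchors."""
    node: str
    anchors: List[str] = field(default_factory=list)


def dedupe_enactments(enactments: List[EnactmentStub]) -> List[EnactmentStub]:
    """Keep one copy per node and merge all anchors onto that copy."""
    merged: Dict[str, EnactmentStub] = {}
    for enactment in enactments:
        if enactment.node not in merged:
            # Copy so the input objects are left unmodified.
            merged[enactment.node] = EnactmentStub(enactment.node, list(enactment.anchors))
            continue
        kept = merged[enactment.node]
        for anchor in enactment.anchors:
            if anchor not in kept.anchors:
                kept.anchors.append(anchor)
    return list(merged.values())


def without_anchors(enactment: EnactmentStub) -> EnactmentStub:
    """Return a copy with no anchors, for use outside the Opinion."""
    return EnactmentStub(enactment.node)
```

For example, `dedupe_enactments([EnactmentStub("a", ["x"]), EnactmentStub("a", ["y"])])` would return a single stub for node "a" carrying both anchors.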

mscarey commented 3 years ago

My solution was to create the classes HoldingWithAnchors, TermWithAnchors, and EnactmentWithAnchors. This required changes to the JSON loading process, which are now done.