Sefaria / Sefaria-Project

New Interfaces for Jewish Texts
https://www.sefaria.org

Fix sefer hachinukh linker #1948

Open nsantacruz opened 3 weeks ago

nsantacruz commented 3 weeks ago

The linker requires a ResolvedRef to match all RefParts in the input to be considered valid. When the input contains redundant parts, we end up with two (or more) ResolvedRefs, each of which matches only some of the RefParts. However, there may be some combination of ResolvedRefs that overlap and, when combined, span all the RefParts.
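To make the coverage rule concrete, here is a minimal sketch of looking for groups of partial matches that overlap and jointly span every part. It is not the linker's actual code: `PartialMatch` and `covering_combinations` are hypothetical names, and each match is reduced to the set of input part indices it matched.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class PartialMatch:
    """Hypothetical stand-in for a ResolvedRef: a label plus matched part indices."""
    label: str
    matched_parts: frozenset


def covering_combinations(matches, num_parts):
    """Return groups of partial matches that overlap and together span all parts."""
    all_parts = frozenset(range(num_parts))
    results = []
    for size in range(2, len(matches) + 1):
        for group in combinations(matches, size):
            covered = frozenset().union(*(m.matched_parts for m in group))
            if covered != all_parts:
                continue
            # Every member must share at least one part with another member.
            if all(
                any(a.matched_parts & b.matched_parts for b in group if b is not a)
                for a in group
            ):
                results.append(group)
    return results


# Assumed part indices: 0 = title, 1 = parasha name, 2 = number.
by_number = PartialMatch("title + number", frozenset({0, 2}))
by_parasha = PartialMatch("title + parasha", frozenset({0, 1}))
groups = covering_combinations([by_number, by_parasha], num_parts=3)
```

Neither match alone covers all three parts, but the pair shares the title part and together covers everything, so it is returned as a candidate for merging.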

For example: ספר חיונוך לך לך ב. Currently the linker catches two partial matches: ספר החינוך ב and ספר החינוך לך לך. There is already logic to try to merge these. However, since ספר החינוך לך לך matches an AltStructNode that has no associated ref (this node has ArrayMapNode children but is not itself an ArrayMapNode), there was no easy way to check for overlapping matches. This PR makes that overlap check possible.
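One way to picture the overlap check for a node that has no ref of its own is to fall back on the ranges of its descendants. The sketch below is only an illustration under that assumption, not the change made in this PR: `StructNode`, `leaf_spans`, and `node_contains` are hypothetical names, and the numeric ranges are toy data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class StructNode:
    """Hypothetical alt-struct node; span is None when the node has no ref."""
    title: str
    span: Optional[Tuple[int, int]] = None  # inclusive (start, end)
    children: List["StructNode"] = field(default_factory=list)


def leaf_spans(node):
    """Collect spans from the node itself or, if it has none, from its descendants."""
    if node.span is not None:
        return [node.span]
    spans = []
    for child in node.children:
        spans.extend(leaf_spans(child))
    return spans


def node_contains(node, number):
    """True if any span under the node includes the given section number."""
    return any(start <= number <= end for start, end in leaf_spans(node))


# Toy data: a parent with no span whose children carry the actual ranges.
parent = StructNode(
    "parent",
    children=[StructNode("child a", span=(10, 12)), StructNode("child b", span=(20, 20))],
)
overlaps = node_contains(parent, 11)
misses = node_contains(parent, 15)
```

Here the parent has no span, yet the number 11 is still found to overlap it through "child a", while 15 falls between the children's ranges and does not.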