Closed Amamgbu closed 2 years ago
Looks like it's because the new regex doesn't just skip over <number>.<number>, it also skips over <number>. (a number followed by a period). The reason the test fails is that one of the sentences ends with a date, so it is not correctly split up (He died at Vienna on 21 October 1762.\n\n\n\nAigen was born in Olomouc on 8 October 1685, the son of a goldsmith). So yay, our tests worked! I think we can probably abandon this PR then -- I don't see a great solution and, like you said, the previous regex worked.
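To make the failure mode concrete, here is a rough sketch. The two patterns below are hypothetical stand-ins written for illustration, not the actual regexes from this PR or the previous version: `NARROW` only refuses to split when the period is followed by a digit (the <number>.<number> case), while `BROAD` also refuses whenever the period is preceded by a digit, which is what swallows a sentence ending in a year.

```python
import re

# Hypothetical patterns for illustration only; NOT the regexes from this PR.
# NARROW: split after a period unless a digit follows it (keeps "3.14" whole).
NARROW = re.compile(r"(?<=\.)(?!\d)\s*")
# BROAD: additionally refuse to split when a digit precedes the period,
# so a sentence ending in "1762." is never closed.
BROAD = re.compile(r"(?<=\.)(?<!\d\.)(?!\d)\s*")

TEXT = (
    "He died at Vienna on 21 October 1762.\n\n\n\n"
    "Aigen was born in Olomouc on 8 October 1685, the son of a goldsmith"
)


def split_sentences(pattern, text):
    """Split text on the pattern and drop empty pieces."""
    return [piece for piece in pattern.split(text) if piece]


narrow_result = split_sentences(NARROW, TEXT)
broad_result = split_sentences(BROAD, TEXT)

print(len(narrow_result))  # two sentences
print(len(broad_result))   # one unsplit chunk
```

With the broad pattern the two sentences stay glued together, which would plausibly show up downstream as a changed or removed sentence rather than two separate ones.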
The regex fails when testing for complicated sections. Not sure why it counts it as a sentence change/remove. The previous regex passes the test.