def-gthill / lexurgy

A high-powered sound change applier
GNU General Public License v3.0

Anchored and start at #64

neta-elad closed 2 years ago

neta-elad commented 2 years ago

Hey Graham, another small PR:

Consider the changes:

syllables: # 1
    C V+

clusters cleanup:
    C => * / $ _ C

hiatus cleanup:
    V => * / V _

syllables: # 2
    C V

rule1:
    C => C C / $ _

syllables: # 3
    C C V
    C V

clusters:
    off

another-rule:
    unchanged

rule2:
    unchanged

For the input CCVCV and startAt=rule2 the expected output (I would think) is CCV.CV. Lexurgy currently throws a syllabification error (as far as I can tell, it tries to apply both # 1 and # 2, in order). This PR fixes that, and also ensures that cleanup rules that aren't relevant when starting at rule2 are not applied.

There is still a small issue with CleanupOff rules anchored at the rule Lexurgy starts at. Without another-rule between the clusters: off and rule2, the output is CV.CV. I could try to fix that, but I'm not sure what the intended semantics are - let me know.

def-gthill commented 2 years ago

Looks good! Sorry this took so long, anything to do with anchored steps confuses me to no end.

If I understand correctly, yes, the intended semantics is that a cleanup-off rule anywhere before the start-at rule (including immediately before it) nullifies the corresponding cleanup rule so that it never runs at all.
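
One way to picture this rule is as a scan over the declarations that precede the start-at rule. The Python sketch below is only a toy model, not Lexurgy's actual code: the `(name, kind)` encoding and the `active_cleanups` helper are hypothetical, and syllable declarations are left out.

```python
def active_cleanups(steps, start_at):
    """Return the cleanup rules still in force when `start_at` is reached.

    `steps` is a list of (name, kind) pairs, where kind is "rule",
    "cleanup", or "off" (a hypothetical encoding for this sketch).
    """
    active = []
    for name, kind in steps:
        if kind == "rule" and name == start_at:
            break  # everything relevant has been declared by now
        if kind == "cleanup" and name not in active:
            active.append(name)
        elif kind == "off" and name in active:
            # A cleanup-off anywhere before start-at nullifies the cleanup,
            # even when it sits immediately before the start-at rule.
            active.remove(name)
    return active


steps = [
    ("clusters", "cleanup"),
    ("hiatus", "cleanup"),
    ("rule1", "rule"),
    ("clusters", "off"),
    ("rule2", "rule"),
]
print(active_cleanups(steps, "rule2"))  # ['hiatus']
```

Under this model, clusters never runs when starting at rule2, with or without another rule between the off and rule2.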

neta-elad commented 2 years ago

Hey Graham no problem, sorry for also opening an issue.

So right now there's still a problem: another-rule shouldn't be necessary, but it is. Without it, the cleanup-off is anchored to the start-at rule, and so it runs after the persistent effects have already run.

Changing the order will affect current behavior, and probably isn't correct in general. I can add a piece of code that, in the "just starting" case, checks which anchored rules are cleanup-off and runs them first, but that feels like a hack.

def-gthill commented 2 years ago

So I pondered this for a while, and concluded that cleanup rules should never run before the start-at rule. In your test, the word CCVCVV should be invalid as an input word when the start rule is set to rule2; the prevailing syllable rule should declare the word invalid before the cleanup rule hiatus has a chance to run (hiatus should run for the first time after rule2).

The way I think about the semantics of start-at and stop-before is that we imagine running all the sound changes, but peeking at the words right before the chosen sound change:

clusters:
    off

# Peek at the words right here!

rule2:
    unchanged

If you specify stop-before rule2, then the output words you get should be exactly the same as if you ran the entire file and just peeked at the words immediately before rule2. That means that e.g. hiatus should have run one more time after rule1.

If you specify start-at rule2, then the rest of the file should behave exactly the same as if you ran the entire file, peeked at the words immediately before rule2, and they happened to be exactly your input words. That means that hiatus should not run before rule2; in the full run, it would've already run at that point, so it shouldn't run again.

Another way to think of it is that stop-before rule2 followed by feeding those exact same words back into start-at rule2 should have exactly the same output as just running all the sound changes.
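
The round-trip property can be checked on a toy model. The Python sketch below is not Lexurgy's implementation: the `run` function, the `(name, kind, fn)` step encoding, and the three string-based rules are all made up for illustration, and it assumes a cleanup rule runs once where it is declared and then again after every later rule.

```python
import re


def run(steps, word, start_at=None, stop_before=None):
    """Apply `steps` to `word`, optionally starting or stopping at a rule."""
    started = start_at is None
    active = {}  # cleanup name -> function, in declaration order
    for name, kind, fn in steps:
        if kind == "rule" and name == stop_before:
            break  # "peek" at the word right before this rule
        if kind == "cleanup":
            active[name] = fn
            if started:
                word = fn(word)  # first run, at the point of declaration
        elif kind == "off":
            active.pop(name, None)
        else:  # ordinary rule
            if not started and name != start_at:
                continue
            started = True  # cleanups never run before the start-at rule
            word = fn(word)
            for cleanup in active.values():
                word = cleanup(word)
    return word


def delete_t(w):
    return w.replace("t", "")


def delete_k(w):
    return w.replace("k", "")


def hiatus(w):
    # Delete any vowel that directly follows another vowel.
    return re.sub(r"(?<=[aeiou])[aeiou]", "", w)


steps = [
    ("hiatus", "cleanup", hiatus),
    ("rule1", "rule", delete_t),
    ("rule2", "rule", delete_k),
]

full = run(steps, "pataka")
peeked = run(steps, "pataka", stop_before="rule2")
resumed = run(steps, peeked, start_at="rule2")
print(full, peeked, resumed)  # pa paka pa
```

Here `peeked` already has hiatus applied after rule1, and `resumed` matches `full` because hiatus is not applied again until after rule2.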

Hope that makes sense. I've made the change needed to produce this behaviour (moving the check for whether to set started = true to after the persistent cleanup rule block) in commit befd4f5be7044a4cca8f06bc4aaf1ceae2df187a. Let me know if there are any other cases where the implementation violates the above rules.