def-gthill / lexurgy

A high-powered sound change applier
GNU General Public License v3.0

How do I write two suffixes at a time? #35

Closed: nprghu closed this issue 3 years ago

nprghu commented 3 years ago

I don't mean language suffixes, but Lexurgy's suffixes like $1 and *, which are the two I'd like to use at the same time. I tried this under "shortening:":

 {@vowel, !@vowel} **$1* {@vowel, !@vowel}$2 $$ $1 {@vowel, !@vowel}$3 => $1 $2 $3

What I meant by this is something like how "it is" turns into "it's" in English. But it could happen on a larger scale, for example anko ante becoming ankote. This is definitely not natural, but this is not a conlang, just practice with Lexurgy! Maybe I read the documentation wrong, but can you help me out? Thanks! 😄

def-gthill commented 3 years ago

To answer your immediate question, you can use parentheses to break up the suffixes: ({@vowel, !@vowel}*)$1. When Lexurgy gets confused by a complex rule, parentheses are often the solution. (This will be better documented in the future, once the behaviour of these expressions is more stable.)

But the rule as written is actually fundamentally flawed in the current version of Lexurgy, because backtracking isn't implemented; Lexurgy figures out something to match ({@vowel, !@vowel}*), stores that in variable 1, and moves on, without looking at the rest of the rule first.

This means in a phrase like anko ante, Lexurgy doesn't know about the second an when it decides what to put in variable 1. As it happens, in the current implementation ({@vowel, !@vowel}*) matches nothing (i.e. it always repeats zero times), so the rule finds a match at o a (with o in variable 2 and a in variable 3) and produces ankoante.

For the rule to work as you want, Lexurgy would have to keep trying different numbers of repetitions until it finds one that matches both copies. I'm planning to implement this eventually, but for now we need a workaround.
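To make the difference concrete, here is a toy model in Python (not Lexurgy's actual code). It assumes each character is one segment, and the function names `match_committed` and `match_with_retries` are made up for this illustration. The first function commits to a repetition count before looking at the second word; the second keeps trying shorter captures until both copies agree.

```python
def match_committed(first, second, count):
    """Commit to a fixed repetition count, then check the second copy.

    Models a matcher without backtracking: the capture is chosen
    before the rest of the rule is examined and is never revisited.
    """
    capture = first[:count]
    return capture if second.startswith(capture) else None


def match_with_retries(first, second):
    """Try repetition counts from longest to shortest.

    Models backtracking: a failed attempt falls back to a shorter
    capture. A count of zero always succeeds with an empty capture.
    """
    for count in range(len(first), -1, -1):
        capture = first[:count]
        if second.startswith(capture):
            return capture


print(repr(match_committed("anko", "ante", 0)))   # '' (zero repetitions always "match")
print(repr(match_committed("anko", "ante", 4)))   # None (committed to too much)
print(repr(match_with_retries("anko", "ante")))   # 'an'
```

Committing to zero repetitions always succeeds but captures nothing, which mirrors the behaviour described above; only the retrying version finds the shared an.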

Try this:

start-marker:
 * => | / $ _

find-matching propagate:
 | []$1 ([]*)$2 $$ ([]*)$3 | $1 => $1 | $2 $$ $3 $1 |

non-matches:
 | => * / $ _

contract:
 $$ ([]*) | => *

remove-marker:
 | => *

This produces the following result:

anko ante        => ankote
butra butsi      => butrasi
samindor saminke => samindorke
jorji porji      => jorji porji

If you have any questions about how this works, let me know. It's good Lexurgy practice to work through this and understand why it works!
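For anyone working through it, here is a rough Python simulation of what each of the five rules does to a two-word phrase. This is only a sketch of my reading of the rules, not how Lexurgy runs them: it assumes one character per segment and exactly two words per phrase, and the function names are just the rule names turned into Python identifiers.

```python
MARK = "|"


def start_marker(phrase):
    # start-marker: insert a marker at the start of every word.
    return " ".join(MARK + word for word in phrase.split(" "))


def find_matching(phrase):
    # find-matching propagate: while the segment right after each marker
    # is the same in both words, move both markers one segment right.
    words = phrase.split(" ")
    if len(words) != 2:
        return phrase  # this sketch only models two-word phrases
    first, second = words
    while True:
        i, j = first.index(MARK), second.index(MARK)
        a, b = first[i + 1:i + 2], second[j + 1:j + 2]
        if not a or a != b:
            break
        first = first[:i] + a + MARK + first[i + 2:]
        second = second[:j] + b + MARK + second[j + 2:]
    return first + " " + second


def non_matches(phrase):
    # non-matches: a marker still at the start of a word means nothing
    # matched, so delete it and leave the word alone.
    return " ".join(w[1:] if w.startswith(MARK) else w for w in phrase.split(" "))


def contract(phrase):
    # contract: delete the space and the second word up to its marker.
    words = phrase.split(" ")
    if len(words) != 2 or MARK not in words[1]:
        return phrase
    first, second = words
    return first + second[second.index(MARK) + 1:]


def remove_marker(phrase):
    # remove-marker: delete any markers that are left.
    return phrase.replace(MARK, "")


def apply_workaround(phrase):
    for step in (start_marker, find_matching, non_matches, contract, remove_marker):
        phrase = step(phrase)
    return phrase


for example in ("anko ante", "butra butsi", "samindor saminke", "jorji porji"):
    print(example, "=>", apply_workaround(example))
```

Printing the phrase after each step (for example an|ko an|te after find_matching) is a quick way to see why the markers end up where they do.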

nprghu commented 3 years ago

Thank you! That helped a lot, without sarcasm! 👍