eaburns / toaq

Tools for the constructed, logical language Toaq.

"ra" particle grabs statement-with-prenex causing unintuitive parsing #18

Open acotis opened 5 years ago

acotis commented 5 years ago

"lî shỉ na ra lî dẻ na bi hỏq" should parse as roughly "[(lî shỉ na) ra (lî dẻ na)] bi hỏq" but instead it parses as "(lî [shỉ na] ra [(lî dẻ na) bı hỏq])" (simplified).

eaburns commented 5 years ago

This bug isn't specific to Miu; toaq.org/parser has it too.

I suspect that we have a tough choice to make here: either consider the current behavior as working-as-intended or disallow content clauses from containing after-thought connected statements.

eaburns commented 5 years ago

If lu had a terminator, it would alleviate this problem (the above parse would not change, but we could terminate the li^ shi/ with the lu-terminator to get what is presumably the desired parse). I vaguely recall that we wanted a lu-terminator once before too, but I don't recall why.

acotis commented 5 years ago

lu gained a terminator since you were last in the community; it's cei. However, we are still left with the decision: if we consider this to be the intended behavior, then we have a lojban-like {ku joi} problem.

Why can't we just say that connector particles can only connect statements if they lack prenexes? After all, "Jí pa hỏq na ru hảo" already parses as "Jí pa [ (hỏq na) ru (hảo) ]", not grabbing the prenex into the first phrase.

eaburns commented 5 years ago

I hadn't realized how critical the prenex was to this issue.

After you asked, I played a bit. Removing the prenex does change it to give the more intuitive (in my opinion) parse, where the ru is connecting the content clauses.

The prenex changes the behavior because, without it, li^ de? na is not a statement and thus cannot be connected as a statement to shi?.

If you add the prenex, li^ de? na bi hoq? is a complete statement, so it can be connected to the complete statement shi?, and then all of that connected stuff is wrapped in the beginning li^.

eaburns commented 5 years ago

An aside

As for ji/ pa hoq? na ru hao?, please try http://toaq.org/parser/, I think you'll find it gives a different parse. It gives a parse where the prenex is captured and pushed down the left side of the connector phrase.

[Screenshot, 2019-03-02 10:21:24 pm]

This is a difference (a fix) from the original grammar to the Miu grammar.

I don't quite recall, but I think that the original grammar could only nest prenexes inside CoPs; there was no way to have a prenex that scoped over more than a single statement. This, coupled with the fact that variables only scope over a single sentence, was very problematic, because it meant that variables in the prenex could not scope over multiple predicates (if I recall correctly). That made some constructs impossible to interpret in prenex-normal form (the most basic form, where all terms are moved to the prenex; I could expand on this later, but I am feeling lazy, since it's after my bedtime, and I'm not certain that anyone even wants it expanded).

Hoemai and I discussed this, and they agreed that I should/could switch it. But I always did get the feeling that Hoemai was really busy at the time and perhaps didn't think about it very deeply. If I recall, the response was something like "whatever you think is right," which worried me :-D. But I still do think that this change was right.

eaburns commented 5 years ago

I think you may be right (but I'm not in such a state of mind right now to make the code change and actually check it in without thinking further). I think that we can fix this by simply disallowing prenexes on the right-hand side of a statement_CoP.

The change would be very simple: On line https://github.com/eaburns/toaq/blob/master/ast/toaq.peggy#L50 change afterthought_cop<statement_2, statement> to afterthought_cop<statement_2, statement_1>

It would disallow a prenex to be on the right-hand-side of an afterthought_CoP, but I don't think that ever causes a problem, because you can always move those prenex terms to the prenex of a parent statement that contains the entire CoP. Furthermore, since this is just restricting the grammar to allow fewer statements, it should play nicely with the current interpreter, so we shouldn't need to change code beyond what I describe above.
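
To make the restriction and the workaround concrete, here is a minimal Go sketch. It assumes a simplified, hypothetical statement tree (`Pred`, `Prenexed`, `CoP`); these are not the AST types from the repository, and `allowed` and `hoist` are illustrative names only. `allowed` checks the proposed rule (no prenexed statement on the right-hand side of a CoP), and `hoist` moves such a prenex up to a parent statement that contains the whole CoP.

```go
package main

import "fmt"

// Node is a hypothetical, simplified statement tree; it is not the AST
// defined in the toaq repository.
type Node interface{ String() string }

// Pred is a bare predication, written here as an opaque label.
type Pred struct{ Text string }

// Prenexed is a statement with a prenex in front: "<terms> bı <body>".
type Prenexed struct {
	Terms string
	Body  Node
}

// CoP is an afterthought connection: "<left> <connector> <right>".
type CoP struct {
	Connector   string
	Left, Right Node
}

func (p Pred) String() string     { return p.Text }
func (p Prenexed) String() string { return fmt.Sprintf("{%s bı %s}", p.Terms, p.Body) }
func (c CoP) String() string      { return fmt.Sprintf("[%s %s %s]", c.Left, c.Connector, c.Right) }

// allowed reports whether n satisfies the proposed restriction: no CoP
// may have a prenexed statement as its right-hand side.
func allowed(n Node) bool {
	switch v := n.(type) {
	case Prenexed:
		return allowed(v.Body)
	case CoP:
		if _, bad := v.Right.(Prenexed); bad {
			return false
		}
		return allowed(v.Left) && allowed(v.Right)
	}
	return true
}

// hoist rewrites a CoP whose right side is prenexed by moving that prenex
// to a parent statement that wraps the whole CoP.
func hoist(n Node) Node {
	switch v := n.(type) {
	case Prenexed:
		return Prenexed{v.Terms, hoist(v.Body)}
	case CoP:
		left, right := hoist(v.Left), hoist(v.Right)
		if p, ok := right.(Prenexed); ok {
			return Prenexed{p.Terms, hoist(CoP{v.Connector, left, p.Body})}
		}
		return CoP{v.Connector, left, right}
	}
	return n
}

func main() {
	// A made-up shape: S1 ru {X bı S2}, with a prenex on the right.
	before := CoP{"ru", Pred{"S1"}, Prenexed{"X", Pred{"S2"}}}
	after := hoist(before)
	fmt.Println(before, allowed(before)) // [S1 ru {X bı S2}] false
	fmt.Println(after, allowed(after))   // {X bı [S1 ru S2]} true
}
```

The sketch ignores variable names entirely, so it does not check whether a hoisted prenex term would clash with a variable already used on the left-hand side; a real rewrite would have to consider that.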

acotis commented 5 years ago

That's an interesting aside. I always had the feeling that prenexes scoping over multiple statements was a good thing, but I didn't realize that having it the other way was so problematic. (Though in a pinch, you could do something like Sa dó bi, jẻo hâo dó na ru hôq dó da.) Either way, I agree with your call that the current way is the right way :)

I'm glad and not surprised that the fix is simple. I'd rather not make it myself before looking a little deeper into the code to make sure I understand it, which I'll do soon (unless someone else does it first). Thanks for looking into this!

eaburns commented 5 years ago

Is this right?

echo "lî shỉ na ra lî dẻ na bi hỏq" | ./toaq
full_text{}
Text{
    Leading: nil
    Discourse: [
        StatementSentence{
            JE: nil
            Statement: PrenexStatement{
                Prenex: Prenex{
                    Terms: [
                        CoPArgument{
                            TO0: nil
                            TO1: nil
                            RU: "ra"
                            Left: PredicateArgument{
                                Focus: nil
                                Quantifier: nil
                                Predicate: LUContent{
                                    LU: "lî"
                                    Statement: Predication{
                                        Predicate: WordPredicate("shỉ")
                                        Terms: []
                                        NA: "na"
                                    }
                                }
                                Relative: nil
                            }
                            Right: PredicateArgument{
                                Focus: nil
                                Quantifier: nil
                                Predicate: LUContent{
                                    LU: "lî"
                                    Statement: Predication{
                                        Predicate: WordPredicate("dẻ")
                                        Terms: []
                                        NA: "na"
                                    }
                                }
                                Relative: nil
                            }
                        }
                    ]
                    BI: "bı"
                }
                Statement: Predication{
                    Predicate: WordPredicate("hỏq")
                    Terms: []
                    NA: nil
                }
            }
            DA: nil
        }
    ]
}
lî shỉ na ra lî dẻ na bı hỏq

({[<lî (shỉ na)> ra <lî (dẻ na)>] bı} hỏq)
Connection{
    Connector: "∨"
    Left: Predication{
        Predicate: "hoq"
        Arguments: []
        AST: nil
    }
    Right: Predication{
        Predicate: "hoq"
        Arguments: []
        AST: nil
    }
    AST: nil
}
hoq() ∨ hoq().

Check out https://github.com/eaburns/toaq/commit/48d42cc750c1bbed5ba8d3abb465ab079b3dfcc2, and let me know what you think.
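
One way to read the final line of the output above, hoq() ∨ hoq(): the connected term sits in the prenex, and the body hỏq takes no arguments, so both sides of the connection end up as the same predication. The Go sketch below illustrates that reading. It assumes the body is simply copied once per side; the type names only mirror the field names printed in the output, and `distribute` and `connectors` are hypothetical, not functions from the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// Predication and Connection mirror only the field names printed in the
// interpreter output; the real types in the repository may differ.
type Predication struct {
	Predicate string
	Arguments []string
}

type Connection struct {
	Connector   string
	Left, Right Predication
}

// connectors maps a connector particle to a logical symbol. Only the "ra"
// entry is taken from the output (RU: "ra", Connector: "∨").
var connectors = map[string]string{"ra": "∨"}

// distribute is a hypothetical helper: it copies the body once per side
// of a connected prenex term. Nothing is substituted, because the body in
// this example does not refer to the prenex term.
func distribute(particle string, body Predication) Connection {
	return Connection{Connector: connectors[particle], Left: body, Right: body}
}

func (p Predication) String() string {
	return fmt.Sprintf("%s(%s)", p.Predicate, strings.Join(p.Arguments, ", "))
}

func (c Connection) String() string {
	return fmt.Sprintf("%s %s %s", c.Left, c.Connector, c.Right)
}

func main() {
	fmt.Println(distribute("ra", Predication{Predicate: "hoq"})) // hoq() ∨ hoq()
}
```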

acotis commented 5 years ago

Oh wow, I didn't know it could be used from the command line so easily. That parse looks right to me! I'll check out the commit and play around with it.

acotis commented 5 years ago

I have failed to install Go :(

However, given that (as you pointed out) this simply restricts the grammar to a subset of what was allowed before, and given that afterthought Co's grabbing prenexed statements is unintuitive to me, I highly doubt this change would break anything in my eyes.