JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.73k stars 5.48k forks source link

Unicodes \forall and \exists not working #19012

Closed hpoit closed 3 years ago

hpoit commented 8 years ago

julia> parse("∀x(cat(x) → mammal(x))")

LoadError: ParseError("invalid character \"∀\"") while loading In[16], in expression starting on line 2

in parse at parse.jl:180 in parse at parse.jl:190

julia> parse("∃x(sister(x,Spot) & cat(x))")

LoadError: ParseError("invalid character \"∃\"") while loading In[15], in expression starting on line 7

in parse at parse.jl:180 in parse at parse.jl:190

JeffBezanson commented 8 years ago

We do not currently parse these. I suppose we could parse them as macro calls to allow customization, or parse them as something that calls the all function. What would you propose?

JeffreySarnoff commented 8 years ago

Kevin wants this to work: parse("∃x( sister(x,Spot) & cat(x) )")

hpoit commented 8 years ago

What's the official representation of negation (~ or !)?

Sarnoff, if I follow the unicode list, it should actually be parse("∃x(sister(x,Spot) ∧ cat(x))")

A macro call with the all function seems appropriate for \forall and a macro call with logical or for \exists. Should I do it? Where is this all function defined? Does it just include all x in X?

JeffBezanson commented 8 years ago

It means they can be tab-completed to unicode characters by typing \forall<tab> in the REPL and editor modes. Many, many characters are supported in this way, including some emoji. Characters you can type are a superset of characters that can be parsed, which are in turn a superset of characters with default meanings.

Of course we can make these parseable, we just need to decide exactly what the syntax is and what it parses to.

~ and ! are functions, so they parse to function calls. An expression like ∃ x f(x) presumably can't just be a function call; in particular it might bind x.

hpoit commented 8 years ago

Okay, so why not start with parse("∀x(cat(x) → mammal(x))")? For any object x, if x is a cat then x is a mammal.

JeffBezanson commented 8 years ago

We need a proposal for the exact grammar of that form (i.e. which non-terminals are used to parse each subexpression), and the structure that it parses to in terms of Expr objects.

TotalVerb commented 8 years ago

It looks like it would suffice, and be simplest, for to parse as a unary macro (like a unary operator except as a macrocall instead of a call). I really don't think it's a good idea to make a separate parallel grammar for this logic stuff; the macro can implement those rules

That disallows ∃ x f(x) but ∃ x(f(x)) is not that much worse.

stevengj commented 8 years ago

and are not parsed because it was not clear to me (when I initially expanded the list of parsed math symbols) whether they should be identifiers, operators, macros, etcetera, and what their precedence should be. Once we make a decision, it is hard to change, so it is best not to make a choice casually here.

TotalVerb commented 8 years ago

I think parsing them as binary operators is not sensible. Eliminating that possibility eliminates the need to choose a precedence.That leaves the following options:

  1. just allow them as identifier characters
  2. parse them as standard unary operators
  3. parse them as unary macros (like if ~ were parsed like a unary operator instead of a binary one)
  4. parse them as if they were macrocalls directly, i.e. as if the token was basically equivalent to @forall.

1 seems dubiously useful to me. Chances are that these operators will only be useful within specific DSLs anyway, so 2 is acceptable for most cases even without macro parsing. 3 is strictly more flexible than 2 but is also strictly more complex. 4 is interesting but, I would venture, extremely hard to get right. Only 4 allows constructs like ∀ x f(x), but at the expense of such constructs being extremely brittle.

Note that all of 2, 3, and 4 allow ∀ x -> f(x). 3and 4 allow accessing the body and arguments of such an expression. That way seems preferable for expressing something like a universal or existential type, or a logical statement.

TotalVerb commented 8 years ago

One argument for 2 is that none of the other binary connectives (and unary negation) operators, nor any of the operators that could look like implication, are parsed as macros. So most consumers of this syntax will require @type(X -> Y) anyway, and then parsing these operators as easily as possible (as simple function calls) will be most easily useful.

stevengj commented 8 years ago

Yes, parsing them as unary operators seems the most sensible and conservative choice. However, I'm reluctant to implement anything absent a real application (as opposed to idle speculation on the mailing list).

hpoit commented 8 years ago

The application https://github.com/hpoit/Kenya.jl

hpoit commented 8 years ago

FOL in MLN http://homes.cs.washington.edu/~pedrod/papers/mlj05.pdf

KristofferC commented 8 years ago

Kenya.jl seems to only consist of empty files?

hpoit commented 8 years ago

I need FOL before anything else.

hpoit commented 8 years ago

In any case, FOL will be around for a long time, and there will probably be many others after me wondering if FOL can be used with Julia. I think ultimately I would like to see FOL being used with Julia, on the blockchain.

hpoit commented 8 years ago

There are many different representation schemes in use in AI, some of which we will discuss in later chapters. Some are theoretically equivalent to first-order logic and some are not. But first-order logic is universal in the sense that it can express anything that can be programmed. We choose to study knowledge representation and reasoning using first-order logic because it is by far the most studied and best understood scheme yet devised. Generally speaking, other proposals involving additional capabilities are still hotly debated and only partially understood. Other proposals that are a subset of first-order logic are useful only in limited domains. Despite its limitations, first-order logic will be around for a long time. - Peter Norvig, AIMA book, p. 186

stevengj commented 8 years ago

You don't need a special symbol to write code; you can always use ASCII symbols like forall. Vaporware is not a real application.

hpoit commented 8 years ago

MLN already exists http://i.stanford.edu/hazy/felix/. You're right, I don't need a special symbol to code.

stevengj commented 8 years ago

@hpoit, by real application I mean practical (non-vaporware) code in Julia with a clear reason why and how the code clarity would benefit from using , and what parsing semantics would be required to enable this.

Felix is written in Java and hence presumably does not (cannot) use , so it hardly explains why/how parsing is needed. Nor does general mathematical-logic usage of help — that is very different from explaining how should be used in a programming language.

hpoit commented 8 years ago

I just want to use the least verbose notation possible for the universal and existential quantifiers, that's all, and this value appears to be in line with Julia. Since you and Bezanson offered to help given that I propose an exact grammar in terms of Expr objects to structure a tree, I will look into it and come back to this issue when I have it, or if I'm in doubt. How does that sound?

TotalVerb commented 8 years ago

If you have a good proposal (that Jeff and Steven like too), I can implement it. (Though, I think the best implementation is just adding these operators to the unary operator list, but if you have other ideas, all the better.) I have a (barely started) FOL package that I've been meaning to work on sometime soon, and this will help with the syntax for me also.

hpoit commented 8 years ago

Magnifique

hpoit commented 8 years ago

Does the proposal have to be or would make easier if it was of Type 1 Grammar, like I understand is Julia? Apparently, MLN by default uses Type 2, probabilistic CFG, and doesn't offer an exact grammar in the sense that Bezanson mentioned, but learns the grammar from a corpus of texts.

In Markov logic, universal quantifiers are also probabilistic.

hpoit commented 8 years ago

The addition of the empty string allows the statement that the context sensitive languages are a proper superset of the context free languages, rather than having to make the weaker statement that all context free grammars with no →λ productions are also context sensitive grammars.

This seems to mean that PCFG is at times context-sensitive, like Julia, in the case when the probable parse is highly probable.

hpoit commented 8 years ago

Steven, there's some info here on parsing semantics like you asked, slide 84.

TotalVerb commented 8 years ago

@hpoit from a quick glance, that slide show contains nothing relevant. Semantic parsing is beyond the scope of the Julia parser.

JeffreySarnoff commented 8 years ago

Is this helpful see section 4?

stevengj commented 8 years ago

@JeffreySarnoff, no. We know what the symbols mean in mathematics and logic; that's separate from the question of how they are parsed.

hpoit commented 8 years ago

We can all just assume they will be parsed as unary operators, right Steven?

JeffBezanson commented 8 years ago

As I already said above, other operators parse as function calls but these might not: ∀ x p might introduce a new binding for x, so you can't just evaluate x and pass it to like you would for most operators (since x might not exist yet). In any case, moving forward with this requires considerably more detail.

TotalVerb commented 8 years ago

My suggestion was to require ∀ x -> p to introduce bindings, which allows these to be parsed as simple unary functions or macros. Is there actually a benefit in allowing the whitespace delimited form?

JeffBezanson commented 8 years ago

The whitespace-delimited form is closer to standard notation. But if is strictly a single-argument unary operator, ∀x(f(x)) can still parse since the argument is a function call x(f(x)). Syntactically that's workable, but a bit odd since the argument is likely not intended to be a function call.

hpoit commented 7 years ago

I intend to build Isabelle.jl to introduce Isabelle/HOL formal methods to Julia. Isabelle.jl should reuse Julia's compiler parsing and type-checking front-end to subsequently derive verification conditions to be solved by Isabelle.

References https://isabelle.in.tum.de/ https://arxiv.org/pdf/1607.01539.pdf (in our case from Isabelle to Julia; I don't believe from Scala to Isabelle makes any sense) https://www.isa-afp.org/entries/FOL_Harrison.shtml

With this, I'm not sure this issue is relevant any longer. The formal representation of these quantifiers should be in Isabelle, which will then be mapped to native Julia.

stevengj commented 7 years ago

We should still eventually decide what to do with these symbols in the parser.

hpoit commented 7 years ago

I will check out what Isabelle does with them and post it here.

On Feb 27, 2017, at 09:39, Steven G. Johnson notifications@github.com wrote:

We should still eventually decide what to do with these symbols in the parser.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

hpoit commented 7 years ago

In case you missed this (as I haven't heard from anyone in here), perhaps this is also a good reference to understanding what to do with quantifiers in the parser: Julia was used in the development of ACAS-X and a theorem prover was used to guarantee properties of the code. Maybe they've already figured it out. I will eventually come back to this.

StefanKarpinski commented 7 years ago

I did find myself wanting to write assertions in this style for some Pkg3 code:

@assert ∀ v ∈ inc ∃ p ∈ pairs p[1] ≲ v ≲ p[2]
@assert ∀ v ∈ exc ∄ p ∈ pairs p[1] ≲ v ≲ p[2]

Of course, what I actually ended up writing was not too terribly far away from that:

@assert all(any(p[1] ≲ v ≲ p[2] for p ∈ pairs) for v ∈ inc)
@assert all(!any(p[1] ≲ v ≲ p[2] for p ∈ pairs) for v ∈ exc)

I'm not sure if that's suggestive of a useful syntax, but that's what I wanted to write.

JeffreySarnoff commented 7 years ago

+1

tpoisot commented 6 years ago

+1 too -- I think this would allow for some very expressive syntax

pranshumalik14 commented 3 years ago

Any updates on when this could be included?

StefanKarpinski commented 3 years ago

There needs to be a definite proposal before there's a "this" to be included.

tk3369 commented 3 years ago

IMO, the following code is very readable. I'm not a fan of the other math-like version.

@assert all(any(p[1] ≲ v ≲ p[2] for p ∈ pairs) for v ∈ inc)
@assert all(!any(p[1] ≲ v ≲ p[2] for p ∈ pairs) for v ∈ exc)
StefanKarpinski commented 3 years ago

I agree, but I think we should make some syntax with and available for packages to use. The question is what.

pranshumalik14 commented 3 years ago

Yeah, that is what I wanted. Just nice syntactic sugar

StefanKarpinski commented 3 years ago

The simplest option here would be to just allow and in identifiers, including by themselves, and then one could write ∀(x, f(x)) or whatever. The originally desired syntax, which was ∀x(cat(x) → mammal(x)) would not work as that would be parsed as a function named ∀x applied to the expression cat(x) → mammal(x), which would be evaluated first, so x would be an undef error even if everything else was already defined.

0x0f0f0f commented 3 years ago

The simplest option here would be to just allow and in identifiers, including by themselves, and then one could write ∀(x, f(x)) or whatever. The originally desired syntax, which was ∀x(cat(x) → mammal(x)) would not work as that would be parsed as a function named ∀x applied to the expression cat(x) → mammal(x), which would be evaluated first, so x would be an undef error even if everything else was already defined.

julia> Symbol('∀')
Symbol("∀")

julia> Expr(:call, Symbol('∀'), :(x ∈ something), :(cat(x) => mammal(x)))
Expr(:call, Symbol('∀'), :(x ∈ something), :(cat(x) => mammal(x)))

I guess treating \forall and \exists as valid identifiers by now can be OK, the originally desired syntax could be handled with macros. For example,

@∀ (x ∈ something) cat(x) => mammal(x)
willow-ahrens commented 3 years ago

The association of and with all and any suggests that these symbols are reduction-like with operators & and |. In Julia, similar reduction-like symbols (that often have similar mathematical syntax) are parsed as identifiers, such as , , , or Π.

Even if we want to break that pattern and introduce the syntax ∀ v ∈ inc ∃ p ∈ pairs p[1] ≲ v ≲ p[2], I'm not aware of any multi-argument prefix operators in Julia other than macro calls, so it would be a special syntax case just for and .

I agree that and should be parsed as identifiers. I'll make a pull request implementing this change.

willow-ahrens commented 3 years ago

This issue is closed by #42314, I think.