GrammaticalFramework / gf-core

Grammatical Framework core: compiler, shell & runtimes
https://www.grammaticalframework.org

[Question] A multiplatform/Java, command-parsing environment #89

Closed s5bug closed 3 years ago

s5bug commented 3 years ago

Here's my situation: I'm writing something that takes natural-language commands, such as "chalkbot, set my timezone to America/Los_Angeles", its Japanese equivalent "チョークボット、タイムゾーンをアメリカのロサンゼルスに設定して", or "chalkbot remind 24h1m do stuff". I'm developing on a Windows machine, but I want to move this over to a Linux machine for the "production environment."

I'm worried about

  1. The names of the different grammars using codes like Eng instead of en_US or an equivalent: I'm making use of ICU4J for locales and timezones.
  2. JNI, moving my application from Windows to a Linux machine, and maybe wanting to use Graal Native-Image.
  3. How to maintain grammars I'll need alongside the application, for example parsing human-readable durations/dates in the remind command.

What should I do/how should I set this up? Perhaps add a backend for directly generating Java code from the grammar files? If I've done a lot of Scala and a little bit of Haskell, would that be easy? Or should I just go the JNI route, compile the Java interface myself, and give up on native-image?

inariksit commented 3 years ago
  1. The language part of a grammar's name can be anything; using the 3-letter code is just a common convention in the RGL. This example works perfectly fine:
-- Abstract (Test.gf)
abstract Test = {
  cat
    S ;
  fun
    s : S ;
}
-- Concrete 1 (in a separate file, TestEnGb.gf)
concrete TestEnGb of Test = {
  lincat
    S = Str ;
  lin
    s = "test (en_GB)" ;
}
-- Concrete 2 (in a separate file, TestEnUs.gf)
concrete TestEnUs of Test = {
  lincat
    S = Str ;
  lin
    s = "test (en_US)" ;
}

And how does it work with morphology? If you write mkV "travel", you will get the US version "travel, traveled". But if you give the past tense as a second argument, you get "travelled" instead. Like this (output from the GF shell):

> i -retain ParadigmsEng.gf
> cc -table mkV "travel"
s . VInf => travel
s . VPres => travels
s . VPPart => traveled
s . VPresPart => traveling
s . VPast => traveled

> cc -table mkV "travel" "travelled"
s . VInf => travel
s . VPres => travels
s . VPPart => travelled
s . VPresPart => travelling
s . VPast => travelled

For any other vocabulary, if you want one grammar to say "color" and the other "colour", you need to give them different linearisations. In your abstract syntax, you have a noun like fun color_N : N, and then in your concrete syntaxes, you have these:

lin color_N = mkN "color" for en_US

and

lin color_N = mkN "colour" for en_GB.

  2. If you want to develop your main application in Java, here is some documentation: http://www.grammaticalframework.org/doc/runtime-api.html#java If you're open to Haskell or Python, I'd recommend one of them, because their bindings are more mature than the Java and C# ones. If you run into problems with the Java bindings, you can try searching for the errors on the GF mailing list: https://groups.google.com/g/gf-dev

  3. I'm not quite sure what you mean by "maintain", but it might be useful to read this post: https://inariksit.github.io/gf/2019/12/12/embedding-grammars.html It only covers code examples for Python and Haskell, but the general ideas of how to use GF grammars from other applications should still be valid.

Unfortunately I have no idea about Java or how to set up Java-specific things. You might have better luck on the mailing list, either searching or browsing the topics, or posting your own question.

s5bug commented 3 years ago
  1. Cool. I'll probably write "wrappers" around the RGL that map locale codes to language codes.
  2. I'll see about writing a PGF library for Java. That looks like the best approach for what I want. I should be able to use the C and Haskell libraries to at least get started on something.
  3. By maintain, I mean "where to put/update the grammars." I think what I'll do is I'll have them in a separate project, and then "build" that separate project and use the build artifacts in my main project. I don't know how well that will work though.
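
The "wrapper" idea in point 1 could be sketched roughly as below. This is only an illustration: the class ConcreteNames and its methods are hypothetical, the concrete syntax names are the TestEnGb/TestEnUs examples from earlier in the thread, and it uses java.util.Locale from the standard library instead of ICU4J so that it stays self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: maps BCP 47 locale tags to GF concrete syntax names. */
final class ConcreteNames {
    // Registry from normalized language tag (e.g. "en-GB") to concrete name.
    private final Map<String, String> byTag = new LinkedHashMap<>();

    /** Registers a concrete syntax name for a well-formed tag such as "en-GB". */
    ConcreteNames register(String languageTag, String concreteName) {
        // Round-trip through Locale so that casing is normalized.
        byTag.put(Locale.forLanguageTag(languageTag).toLanguageTag(), concreteName);
        return this;
    }

    /** Looks up the exact tag first, then falls back to the bare language. */
    Optional<String> resolve(Locale locale) {
        String exact = byTag.get(locale.toLanguageTag());
        if (exact != null) {
            return Optional.of(exact);
        }
        return Optional.ofNullable(byTag.get(locale.getLanguage()));
    }
}
```

With "en-GB", "en-US" and a bare "en" registered, a request for en-AU would fall back to whatever is registered for "en", and an unregistered language such as ja-JP yields an empty Optional that the application has to handle itself.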

inariksit commented 3 years ago
  1. Yes, sounds like a good approach.
  2. You mean like, an API on top of JPGF that looks more like the Haskell PGF library?
  3. Ah, yes. The .gf files are source code that can be anywhere you want, and the compiled .pgf files are what the rest of your program reads.
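
The split between .gf sources and a compiled .pgf artifact could be wired into a build step along these lines. This is a minimal sketch, not a recommended setup: the class GrammarBuild is hypothetical, it assumes the gf executable is on the PATH, and it relies on gf -make compiling the listed concrete syntaxes into a single .pgf file named after the abstract syntax (Test.pgf for the example grammars above).

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical build helper for a separate grammar project. */
final class GrammarBuild {
    /** Builds the command line "gf -make <concrete files...>". */
    static List<String> makeCommand(List<Path> concreteFiles) {
        List<String> command = new ArrayList<>();
        command.add("gf");     // assumption: gf is installed and on the PATH
        command.add("-make");  // compile the listed concretes into one .pgf
        for (Path file : concreteFiles) {
            command.add(file.toString());
        }
        return command;
    }

    /** Runs the compiler inside the grammar directory and returns its exit code. */
    static int compile(Path grammarDir, List<Path> concreteFiles)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(makeCommand(concreteFiles))
                .directory(grammarDir.toFile())
                .inheritIO() // show compiler messages in the build log
                .start();
        return process.waitFor();
    }
}
```

The main project would then only depend on the resulting .pgf file, for example by copying it into its resources as part of the same build.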

Good luck with your project! If you have any questions about the actual GF grammar writing, I would be happy if people posted their GF questions on Stack Overflow. If you have questions specifically about the Java part, your best bet is to email the mailing list, which has more readers than GitHub issues.

s5bug commented 3 years ago

I mean a pure Java implementation of something that reads, loads and parses from PGF files, so that I don't have to cross the JNI boundary. I'm fairly certain I can look at the Haskell PGF library for this.

I'll definitely try the mailing list if I run into issues.

s5bug commented 3 years ago

Huh? Is the website wrong in saying that the Windows download doesn't contain a Java API?

image

What's this JAR?

johnjcamilleri commented 3 years ago

Yes, I think it's a mistake on the download page; it is in fact included.