GrammaticalFramework / gf-core

Grammatical Framework core: compiler, shell & runtimes
https://www.grammaticalframework.org

Create GF source files from a CF file #150

Open inariksit opened 1 year ago

inariksit commented 1 year ago

I want to be able to do this:

$ cat myGrammar.cf
S ::= "Hello" "World" ;

$ gf -f gf myGrammar.cf
Writing myGrammar.gf myGrammarCnc.gf

$ cat myGrammar.gf
abstract myGrammar = {
cat S ; 
fun S_Hello_World : S ;
}

$ cat myGrammarCnc.gf
concrete myGrammarCnc of myGrammar = {
lincat S = Str ; 
lin S_Hello_World = "Hello" ++ "World" ;
}

I infer from this line that this was already intended 12 years ago: https://github.com/GrammaticalFramework/gf-core/blob/85038d01750c56241d45686d14c513f72421526c/src/compiler/GF/CompilerAPI.hs#L63

However, this line shows that it has never worked:

https://github.com/GrammaticalFramework/gf-core/blob/85038d01750c56241d45686d14c513f72421526c/src/compiler/GF/CompilerAPI.hs#L89

I also see that the conversion from cf goes directly to PGF here, without going through GF source code first: https://github.com/GrammaticalFramework/gf-core/blob/85038d01750c56241d45686d14c513f72421526c/src/compiler/GF/Compiler.hs#L111-L123

Is there any way to piece together generation of GF source code from existing code, for example by reusing the functions that produce canonical GF? I have already tried the -f canonical_gf flag, but it only works with gf files as input, not with cf files.

anka-213 commented 1 year ago

It doesn't look like it's a simple case of just connecting the components. These are the functions responsible for producing the canonical GF grammar for canonical_gf:

https://github.com/GrammaticalFramework/gf-core/blob/85038d01750c56241d45686d14c513f72421526c/src/compiler/GF/Compile/GrammarToCanonical.hs#L36-L37
https://github.com/GrammaticalFramework/gf-core/blob/85038d01750c56241d45686d14c513f72421526c/src/compiler/GF/Compile/GrammarToCanonical.hs#L74-L75

and they require a GF.Grammar.Grammar.Grammar, which seems to contain a lot of details from a GF file:

https://github.com/GrammaticalFramework/gf-core/blob/85038d01750c56241d45686d14c513f72421526c/src/compiler/GF/Grammar/Grammar.hs#L85-L105

They convert it to the GF.Grammar.Canonical.Grammar format:

https://github.com/GrammaticalFramework/gf-core/blob/85038d01750c56241d45686d14c513f72421526c/src/compiler/GF/Grammar/Canonical.hs#L22-L131

which is then printed directly using render80:

https://github.com/GrammaticalFramework/gf-core/blob/85038d01750c56241d45686d14c513f72421526c/src/compiler/GF/Compiler.hs#L74-L78


So if one manages to write a function that converts a GF.Grammar.CFG.ParamCFG into either a GF.Grammar.Grammar.Grammar or a GF.Grammar.Canonical.Grammar, the rest would be simple. The question is how to do that conversion. Perhaps the cfg2pgf function can at least be a source of inspiration:

https://github.com/GrammaticalFramework/gf-core/blob/85038d01750c56241d45686d14c513f72421526c/src/compiler/GF/Compile/CFGtoPGF.hs#L20-L21
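
To make concrete what such a conversion has to produce, here is a minimal sketch written as a standalone Python script rather than inside the compiler. It does not use ParamCFG, cfg2pgf or any other gf-core code. The helper names (parse_cf, fun_name, to_gf) are hypothetical, the function naming scheme simply copies the S_Hello_World example from the issue, and the parser assumes only unlabelled rules of the form Cat ::= items ; with no semicolons or double quotes inside terminals.

```python
import re

def parse_cf(text):
    """Parse rules 'Cat ::= item ... ;' into (cat, items) pairs.
    An item is ("t", word) for a quoted terminal, ("n", name) for a category.
    Assumption: no ';' or '"' inside terminals, no rule labels."""
    rules = []
    for chunk in text.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        lhs, rhs = chunk.split("::=")
        items = [("t", m.group(1)) if m.group(1) is not None else ("n", m.group(2))
                 for m in re.finditer(r'"([^"]*)"|(\S+)', rhs)]
        rules.append((lhs.strip(), items))
    return rules

def fun_name(cat, items):
    """Hypothetical naming scheme: category plus right-hand side, joined by '_'."""
    return "_".join([cat] + [re.sub(r"\W", "_", value) for _, value in items])

def to_gf(name, rules):
    """Return (abstract, concrete) GF source text for the parsed rules."""
    cats = []
    for cat, items in rules:
        for c in [cat] + [v for kind, v in items if kind == "n"]:
            if c not in cats:
                cats.append(c)
    funs, lins = [], []
    for cat, items in rules:
        f = fun_name(cat, items)
        arg_cats = [v for kind, v in items if kind == "n"]
        funs.append(f"fun {f} : {' -> '.join(arg_cats + [cat])} ;")
        args, rhs = [], []
        for kind, v in items:
            if kind == "t":
                rhs.append(f'"{v}"')
            else:
                args.append(f"x{len(args) + 1}")  # one variable per category argument
                rhs.append(args[-1])
        body = " ++ ".join(rhs) or '""'  # empty right-hand side: empty string
        lins.append(f"lin {' '.join([f] + args)} = {body} ;")
    abstract = "\n".join([f"abstract {name} = {{"]
                         + [f"cat {c} ;" for c in cats] + funs + ["}"])
    concrete = "\n".join([f"concrete {name}Cnc of {name} = {{"]
                         + [f"lincat {c} = Str ;" for c in cats] + lins + ["}"])
    return abstract, concrete

if __name__ == "__main__":
    abstract, concrete = to_gf("myGrammar", parse_cf('S ::= "Hello" "World" ;'))
    print(abstract)
    print()
    print(concrete)
```

This only covers the purely string-based case with lincat Str for every category. Two rules that happen to get the same generated name would clash, so a real converter would need to disambiguate names the way the compiler does.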

inariksit commented 1 year ago

I see, thanks for digging into it @anka-213!

The abstract syntax is quite easy to piece together by copy and paste: open the cf file in the GF shell, run the commands pg -cats and pg -funs, copy the output into a file, and surround it with the required abstract myGrammar = { cat … fun … }. But I couldn't find anything for producing the concrete syntax.
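
The copy-and-paste step for the abstract syntax could be scripted roughly as follows. This is only a sketch: wrap_abstract is a hypothetical helper, and it assumes that the pg -cats output is a whitespace-separated list of category names and that pg -funs prints one "name : Type ;" judgement per line (possibly prefixed with fun). The exact output format may differ between GF versions.

```python
def wrap_abstract(name, cats_output, funs_output):
    """Wrap saved 'pg -cats' and 'pg -funs' output in an abstract module.
    Assumed formats: cats are whitespace-separated names; funs are one
    'name : Type ;' judgement per line, optionally starting with 'fun '."""
    cats = cats_output.split()
    funs = []
    for line in funs_output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("fun "):
            line = line[4:].strip()  # drop a leading keyword, we add our own
        if not line.endswith(";"):
            line += " ;"
        funs.append(line)
    lines = [f"abstract {name} = {{"]
    if cats:
        lines.append("cat " + " ; ".join(cats) + " ;")
    if funs:
        lines.append("fun")
        lines.extend("  " + f for f in funs)
    lines.append("}")
    return "\n".join(lines)

if __name__ == "__main__":
    print(wrap_abstract("myGrammar", "S\n", "S_Hello_World : S ;\n"))
```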

The motivation for my question is to recreate the pipeline from this work, which first produces a CF grammar automatically, then converts it to a GF grammar, and then continues by refining the rules manually: https://github.com/smucclaw/sandbox/tree/default/aarne#readme But it may well be that the intermediate string-based GF grammar is not even necessary, and one could jump straight to an RGL-based concrete syntax, which inevitably needs human effort. (Or an automated script, but that is not the GF compiler's job.)