rupesh4514 / grammatical-framework

Automatically exported from code.google.com/p/grammatical-framework

Allow Unicode characters in identifiers #27

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I think GF should allow Unicode characters in identifiers: in module names,
category names, and function names.

I want to be able to write this in my grammar:

fun
någon : Det;
grön : Adj;
på : Prep;
björn : Noun;
där : Adv;

Original issue reported on code.google.com by peter.ljunglof@heatherleaf.se on 8 Mar 2010 at 2:29

GoogleCodeExporter commented 9 years ago
Hmm... I submitted this as an enhancement, but that wasn't recognized, so I'm
changing it to an enhancement now.

Original comment by peter.ljunglof@heatherleaf.se on 8 Mar 2010 at 2:30

GoogleCodeExporter commented 9 years ago
These very examples already work if you use isolatin1, which is the default
coding flag. See e.g. lib/src/swedish/IrregSwe.gf.

I'm worried that permitting the full Unicode letter set might open a can of
worms. For instance, there could be many similar-looking identifiers with
different Unicode code points. Portability would also be worse, since Unicode
support on different platforms is still very brittle. But I don't hold this
opinion very strongly.

Original comment by aarne.ra...@gmail.com on 12 Apr 2010 at 9:40
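
To make the look-alike concern above concrete, here is a minimal Python sketch (not GF code, and not a description of how GF compares identifiers). It assumes an identifier is just a string of code points; the helper name same_identifier is hypothetical. It shows that one visible spelling can correspond to several code point sequences, and that normalization resolves some cases but not all.

```python
import unicodedata

# Two spellings of an identifier from the original request.
composed = "bj\u00f6rn"      # 'ö' as a single precomposed code point
decomposed = "bjo\u0308rn"   # 'o' followed by a combining diaeresis

# The two usually render identically, yet they are different strings.
print(composed == decomposed)            # False
print(len(composed), len(decomposed))    # 5 6


def same_identifier(a: str, b: str) -> bool:
    """Hypothetical helper: compare two identifiers after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Normalizing to NFC makes the two spellings compare equal.
print(same_identifier(composed, decomposed))   # True

# Normalization does not help with look-alikes from different scripts.
latin = "a"            # LATIN SMALL LETTER A
cyrillic = "\u0430"    # CYRILLIC SMALL LETTER A
print(same_identifier(latin, cyrillic))        # False
```

Comparing identifiers after normalization would address the composed versus decomposed case, while cross-script look-alikes would still need a separate policy.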

GoogleCodeExporter commented 9 years ago
I don't think there will be any problems: Unicode has been allowed in XML
identifiers for 10 years, and I haven't heard of any problems.

Original comment by peter.ljunglof@heatherleaf.se on 12 Apr 2010 at 9:49

GoogleCodeExporter commented 9 years ago
This is kind of possible now by using quotes. For example, 'жаба' is a valid
identifier in Cyrillic, but you have to surround it with quotes. Whether this
should be allowed without quotes is another question, which we have not decided
yet.

Original comment by kr.ange...@gmail.com on 29 Nov 2013 at 2:33