godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Add support for Unicode characters in GDScript identifiers #24785

Closed kirk-github closed 4 years ago

kirk-github commented 5 years ago

Godot version:

3.0.6

OS/device including version:

Windows

Issue description:

Steps to reproduce:

Minimal reproduction project:

timoschwarzer commented 5 years ago

Because that's the specification. This may be because legacy Python (Python 2) also only supports ASCII identifiers. (https://docs.python.org/3.7/reference/lexical_analysis.html#identifiers)

kirk-github commented 5 years ago

But Blender's Python supports Unicode in variable names.

timoschwarzer commented 5 years ago

Blender's Python is Python 3. Regardless, I don't think GDScript's purpose is to replicate or mimic the Python language. And GDScript's specification says that Unicode characters are not valid in identifiers. I don't know the reason, but I think it's for simplicity.
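To make the comparison concrete, here is a minimal Python sketch (Python rather than GDScript, so it can be run directly). The ASCII pattern is only an assumption about what an ASCII-only identifier rule amounts to; it is not taken from the GDScript specification or source. `str.isidentifier()` is Python 3's built-in check.

```python
import re

# Assumed approximation of an ASCII-only identifier rule: letters, digits and
# underscore, not starting with a digit. Not copied from the GDScript source.
ASCII_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_ascii_identifier(name: str) -> bool:
    """Return True if name matches the assumed ASCII-only rule."""
    return ASCII_IDENTIFIER.fullmatch(name) is not None


def compare(name: str) -> tuple:
    """Return (ascii_rule_accepts, python3_accepts) for a candidate name."""
    return is_ascii_identifier(name), name.isidentifier()


if __name__ == "__main__":
    for candidate in ["right_arrow", "żółw", "变量", "→", "2fast"]:
        print(candidate, compare(candidate))
```

Note that Python 3 accepts letters from other scripts but still rejects pure symbols such as `→`, because its rule is based on Unicode letter-like categories rather than on "anything non-ASCII".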

kirk-github commented 5 years ago

Fine, thank you.

timoschwarzer commented 5 years ago

This could be labeled as feature proposal though.

akien-mga commented 5 years ago

I might be biased since I'm a native speaker of a Latin language, but beyond the "oh cool" effect of being able to write code like try { (╯°□°)╯︵ ┻━┻ } catch() { ┬─┬ ノ( ゜-゜ノ) }, is it really useful and practical to use non-ASCII characters in identifiers?

As far as I know, all keyboard layouts can quite easily input Latin characters, while it's pretty difficult on most layouts to input a random Unicode character, or CJK characters if you don't have a proper IME set up, etc. Having identifiers in Simplified Chinese characters used e.g. in a FOSS library would make it very difficult for any non-Chinese user to modify this code (and of course to understand it, but that's another issue).

If there are good practical use cases for this, it could be considered, but otherwise it sounds like making the language more complex for little gain.

Zireael07 commented 5 years ago

Some languages have minimal pairs with some characters being accented or having other diacritics and others not. I believe Godot complains hugely when I try to have a ł or a ń or some other Polish character in a variable name - and given the existence of such minimal pairs in both Polish and Spanish, well, it'd help if I could have two different variable names for such pairs instead of having to look for synonyms.
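As a small illustration of why such minimal pairs collide under an ASCII-only rule, the Python sketch below strips diacritics the way one might when forced to pick an ASCII name. The example words (Spanish "año" vs "ano", Polish "sąd" vs "sad") are chosen here for illustration and are not from the thread.

```python
import unicodedata


def strip_diacritics(word: str) -> str:
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    # Each accented word collapses onto a different, unaccented word.
    for word in ["año", "sąd"]:
        print(word, "->", strip_diacritics(word))
```

Letters such as `ł` have no Unicode decomposition, so this approach leaves them unchanged and a manual mapping would be needed for them.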

timoschwarzer commented 5 years ago

In my opinion, since the language's keywords are English, it's good practice to have English identifiers, too. Allowing Unicode identifiers is a bit like asking to translate all keywords into every language. Yes, it's easier to allow Unicode identifiers than to alias all keywords, but the result is the same: you can't read GDScript anymore unless it's written in a language you understand. And English is the international language (thus the keywords are English), that's just how it is.

Zireael07 commented 5 years ago

I have seen code written by a Finnish guy with Finnish variable names, and I understood what the code does without knowing any Finnish. I have also done the same with German, Spanish and Russian, which I do know a bit of. So yes, knowing the language the variable names use helps, but it is not a requirement. (Yes, I agree that English variable names are in general good practice in open source code, but some people might want to use their own language if they do not intend the code for public consumption, which on the internet means international consumption.)

And to be clear, I'm not advocating ALL Unicode, just the accented Latin characters. CJK in variable names is a bad idea either way.
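A "Latin letters only" rule like the one suggested here could be sketched in Python as follows. This is a hypothetical rule, not anything GDScript implements, and checking whether the Unicode character name starts with "LATIN " is a heuristic chosen for brevity.

```python
import unicodedata

ASCII_DIGITS = "0123456789"


def is_latin_letter(ch: str) -> bool:
    """Heuristic: a letter whose Unicode name starts with 'LATIN '."""
    return ch.isalpha() and unicodedata.name(ch, "").startswith("LATIN ")


def is_latin_identifier(name: str) -> bool:
    """Hypothetical rule: Latin letters (accented or not), ASCII digits and
    underscore, with no leading digit."""
    # Compose first so that 'n' + combining tilde is treated as a single 'ñ'.
    name = unicodedata.normalize("NFC", name)
    if not name or name[0] in ASCII_DIGITS:
        return False
    return all(
        ch == "_" or ch in ASCII_DIGITS or is_latin_letter(ch) for ch in name
    )


if __name__ == "__main__":
    for candidate in ["żółw", "niño_2", "变量", "→", "2x"]:
        print(candidate, is_latin_identifier(candidate))
```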

mateusfccp commented 5 years ago

I agree with this. I want to write variables as var → instead of var right_arrow, especially as I'm the only one working on my code.

girng commented 5 years ago

It's GDScript... keep it to GDScript syntax only, IMO.

imdjs commented 5 years ago

Not all people in the world like using English to name variables, especially Asians, including me. Most of them are not good at English, or some of them just like to use their mother language.

adabru commented 5 years ago

My understanding so far:

usecase "cool"

"oh cool" effect of being able to write code like try { (╯°□°)╯︵ ┻━┻ } catch() { ┬─┬ ノ( ゜-゜ノ) } by akien-mga

making the language more complex for little gain. by akien-mga

usecase "symbolic characters"

I want to write variables as var → instead of var right_arrow, especially as I'm the only one working on my code. by mateusfccp

2👍, 1☺ by kirk-github, FlamyAT, KoBeWi

it's pretty difficult on most layouts to input a random Unicode character by akien-mga

usecase "internationalization"

not good at English or some of them just like to use mother language by imdjs

English is the international language (thus the keywords are English), that's just how it is by timoschwarzer

The internationalization tracker issue https://github.com/godotengine/godot/issues/3081 conveys the message that akien-mga pursued a global internationalization solution in the past and that many individuals (want to) use internationalized components of Godot.

IMO

I'd guess the internationalization issue has the most weight. In particular, people with only a little knowledge of English could be enabled to use Godot. I observed German students using German identifiers in a university JavaScript team project. That's probably not good for an international project, but for a local team or for education I guess that's valuable.

I guess introducing Unicode support for the GDScript parser is not much overhead, and it could be a requirement for solving issues like Internationalization of Project Settings or future internationalization issues. Quoting the issuer of the mentioned issue:

This will help so many Game Developers that I know here in Brazil. Thank You. PS.: I can help the translation if you let me help :) by AlmirNeeto99

apaza610 commented 5 years ago

English may be international, but I really need Unicode here in Asia; trying to keep only English characters makes coding unnecessarily confusing.

rjdgheart commented 5 years ago

Different countries, different languages, but being able to use your native language in GDScript would be so cool. Hope it will come true.

imdjs commented 5 years ago

I use these Unicode characters to name variables instead of English words (when coding in C++ and Python); it's quite visual to me:
ㄥ radian, Λ angle, Γ cross, 卜 dot, Ж collide, 囗 face, 厶 triangle, α rotate, 乛 direction,
巜 move, 十 add, 一 erase, Χ multiply, Ξ divide, 丨 or, 丿 write, 丶 and, 乚 handle, 冖 length, 丅 location, 丄 normal, 工 normalised, 匚 const, 冂 with, 凵 empty, 卩 what,
卝 slice, 厽 hierarchy, 彡 layer, 灬 clear, 巛 inverse, 噩 store, 吅 copy, 罒 find,
Σ cache, Δ function, Θ loop, Φ axis, Ψ pointer, Π global, ι local, Ω main, ω sub, ξ index, φ time, Ф radius

freekoy commented 4 years ago

You can do this:

var cc = {}
cc["→"] = 9

Or use a compile step, the way .ts is converted to .js, as in my project https://github.com/freekoy/gulp-ac
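The "compile step" idea can be sketched in Python as a tiny preprocessor that rewrites every non-ASCII character into an ASCII-safe escape before the script reaches an ASCII-only parser. This is an illustrative assumption, not how gulp-ac works; the `_uXXXX_` naming scheme and both function names are hypothetical.

```python
import re

# Any single character outside the ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7f]")


def mangle_identifier(name: str) -> str:
    """Replace each non-ASCII character with a '_u<hex code point>_' escape."""
    return NON_ASCII.sub(lambda m: f"_u{ord(m.group(0)):04x}_", name)


def mangle_source(source: str) -> str:
    """Naive whole-file rewrite. It also rewrites string literals and
    comments, which a real tool would have to leave alone."""
    return mangle_identifier(source)


if __name__ == "__main__":
    print(mangle_source("var → = 9"))
```

The trade-off is that error messages and the debugger would show the mangled names, and a hand-written identifier that happens to look like an escape could collide with a generated one.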

txj-mssl commented 4 years ago

Add support for Unicode characters in GDScript identifiers: what is the result? Will it be added or not?

clayjohn commented 4 years ago

Feature and improvement proposals for the Godot Engine are now being discussed and reviewed in a dedicated Godot Improvement Proposals (GIP) (godotengine/godot-proposals) issue tracker. The GIP tracker has a detailed issue template designed so that proposals include all the relevant information to start a productive discussion and help the community assess the validity of the proposal for the engine.

The main (godotengine/godot) tracker is now solely dedicated to bug reports and Pull Requests, enabling contributors to have a better focus on bug fixing work. Therefore, we are now closing all older feature proposals on the main issue tracker.

If you are interested in this feature proposal, please open a new proposal on the GIP tracker following the given issue template (after checking that it doesn't exist already). Be sure to reference this closed issue if it includes any relevant discussion (which you are also encouraged to summarize in the new proposal). Thanks in advance!

nobodxbodon commented 3 years ago

Just FYI, this PR allows Unicode characters, which seems close to the first solution mentioned in this proposal:

Don't really care about anything, just let any character above ASCII range to be considered part of an identifier

Hopefully it's helpful for those who want a short-term and quick solution.
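The quoted rule ("any character above ASCII range is part of an identifier") could look roughly like the Python sketch below. This is not the code from the PR; the function names are hypothetical, and the rule for the ASCII part is an assumption.

```python
ASCII_IDENT_CHARS = set(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_"
)


def is_identifier_char(ch: str) -> bool:
    """Permissive rule: ASCII letters, digits, underscore, or anything > 127."""
    return ch in ASCII_IDENT_CHARS or ord(ch) > 127


def scan_identifier(source: str, start: int = 0) -> str:
    """Return the identifier beginning at start, or '' if none begins there."""
    if start >= len(source):
        return ""
    first = source[start]
    if first in "0123456789" or not is_identifier_char(first):
        return ""
    end = start
    while end < len(source) and is_identifier_char(source[end]):
        end += 1
    return source[start:end]


if __name__ == "__main__":
    for line in ["变量 = 1", "→ = 9", "speed_2 += 1", "9lives = 0"]:
        print(repr(line), "->", repr(scan_identifier(line)))
```

Such a rule is simple, but it also accepts characters that are not letters at all, for example arrows or a non-breaking space (U+00A0), which is why language specifications usually restrict identifiers to letter-like Unicode categories.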