Closed szhu closed 6 years ago
Questions:
- Would it be possible to add a "Save AppleScript files as MacRoman" setting?
- Would it be a good idea for this setting to be enabled by default?
If MacRoman (still) is the default encoding, I'd rather go with the second option. However, I'm a bit hesitant, since I wonder how this treats non-Latin characters.
From the Mac OS Roman page on Wikipedia:
With the release of Mac OS X, Mac OS Roman and all other "scripts" (as the Mac OS called them) were replaced by UTF-8 as the standard character encoding for the Macintosh operating system
In general, it's possible to define a default encoding for a language, and I've added this to package.json for testing purposes. I'll play around some and see whether it makes sense to keep this setting. Your thoughts on this are welcome!
I've decided to keep the default encoding setting, since it's easy to change in the settings. Mac Roman is now the default encoding in v0.14.2!
Thanks for researching and addressing this so quickly!
I also wanted to take some time to talk about this:
From the Mac OS Roman page on Wikipedia:
With the release of Mac OS X, Mac OS Roman and all other "scripts" (as the Mac OS called them) were replaced by UTF-8 as the standard character encoding for the Macintosh operating system
(First, a side note: the quoted "script" above means "encoding", not "programming language", so it's not talking about AppleScript specifically.)
AppleScript is fairly anachronistic compared to the rest of macOS. The language syntax and the use of the scpt save format (which is similar to Python's and Java's compiled .pyc/.pyo/.class formats) as the default source code format seem fairly out of place in today's ecosystem of programming languages. Here are some other ways AppleScript hasn't really been updated since Mac OS 9:
Given all of this, I'm only mildly surprised that Apple didn't update AppleScript's default text encoding, either.
In general, I like that VSCode and Script Editor use the same encoding; it makes working with the files much simpler on a daily basis.
However, the new version of the extension opens UTF-8 files as MacRoman without converting them, so I have issues with accented characters:
If I reopen the file as UTF-8, it works fine. I don't know whether VSCode could properly convert the files?
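For anyone who wants to reproduce the accented-character problem outside the editor, here is a minimal Python sketch. It assumes Python's built-in mac_roman codec matches what VSCode calls MacRoman; the sample string is made up for illustration.

```python
# Sketch: what happens when UTF-8 bytes are decoded as MacRoman.
text = 'display dialog "é"'

utf8_bytes = text.encode("utf-8")         # "é" becomes the two bytes C3 A9
misread = utf8_bytes.decode("mac_roman")  # each byte is mapped to its own MacRoman character
reread = utf8_bytes.decode("utf-8")       # decoding with the right encoding restores the text

print(misread)  # display dialog "√©"
print(reread)   # display dialog "é"
```

The bytes on disk are never changed here; only the interpretation differs, which is why reopening with UTF-8 fixes the display.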
One more issue: you can't use emojis with the MacRoman encoding. Well, I guess you can by using the Unicode equivalent, but not with the emoji itself.
And I just tried, if you use an emoji inside the Script Editor and save as text, it uses UTF-16 encoding.
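The emoji limitation can be checked with a short Python sketch. The helper name fits_mac_roman is hypothetical, and the sketch assumes Python's mac_roman codec covers the same character set as the editor's MacRoman.

```python
def fits_mac_roman(text: str) -> bool:
    """Return True if every character of text has a MacRoman byte."""
    try:
        text.encode("mac_roman")
        return True
    except UnicodeEncodeError:
        # At least one character (for example an emoji) has no MacRoman mapping.
        return False


print(fits_mac_roman('display dialog "äöüßéè€"'))  # True
print(fits_mac_roman('display dialog "😀"'))        # False
```

MacRoman is a single-byte encoding with only 256 slots, so anything outside that repertoire needs a Unicode encoding such as UTF-8 or UTF-16.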
I did some tests myself with a simple AppleScript file, which contains non-ASCII characters:
display dialog "äöüßéè€"
With the files.autoGuessEncoding setting active, Code will open the file as ISO 8859-2. I then did some further tests to determine the encoding:
# MacRoman
$ file -I macroman.applescript
macroman.applescript: text/plain; charset=unknown-8bit
$ xattr -l macroman.applescript
com.apple.FinderInfo:
00000000 54 45 58 54 54 6F 79 53 00 00 00 00 00 00 00 00 |TEXTToyS........|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000020
com.apple.TextEncoding: macintosh;0
com.apple.lastuseddate#PS:
00000000 F0 CB B1 5B 00 00 00 00 20 94 64 3A 00 00 00 00 |...[.... .d:....|
00000010
com.apple.metadata:_kMDItemUserTags:
00000000 62 70 6C 69 73 74 30 30 A0 08 00 00 00 00 00 00 |bplist00........|
00000010 01 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 |................|
00000020 00 00 00 00 00 00 00 00 00 09 |..........|
0000002a
com.apple.metadata:kMDLabel_nzfct3nddxl2ablrfgw6suoak4:
00000000 F2 E0 5F 6D 26 33 ED 13 52 56 42 EE EC 33 B8 94 |.._m&3..RVB..3..|
00000010 F1 98 3E AA 33 79 03 F2 99 74 4C D2 65 DF 75 DD |..>.3y...tL.e.u.|
00000020 0B 13 F6 EA 11 50 09 76 ED E4 0D 2F 5B 7D F7 58 |.....P.v.../[}.X|
00000030 A7 FF D7 05 2F 34 E5 43 E9 41 32 5B EB A3 03 61 |..../4.C.A2[...a|
00000040 2D 82 95 14 BB 08 C9 2B 05 6A 5B 70 C8 A7 F8 84 |-......+.j[p....|
00000050 8E BE 43 B8 AD 9B 16 B6 BA |..C......|
00000059
# UTF-8
$ file -I utf8.applescript
utf8.applescript: text/plain; charset=utf-8
$ xattr -l utf8.applescript
com.apple.metadata:_kMDItemUserTags:
00000000 62 70 6C 69 73 74 30 30 A0 08 00 00 00 00 00 00 |bplist00........|
00000010 01 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 |................|
00000020 00 00 00 00 00 00 00 00 00 09 |..........|
0000002a
This did not really help, so I've created a second AppleScript file with only ASCII characters for comparison.
display dialog "abc"
This file is indeed encoded differently:
$ file -I ascii.applescript
ascii.applescript: text/plain; charset=us-ascii
$ xattr -l ascii.applescript
com.apple.FinderInfo:
00000000 54 45 58 54 54 6F 79 53 00 00 00 00 00 00 00 00 |TEXTToyS........|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000020
com.apple.TextEncoding: us-ascii;1536
com.apple.lastuseddate#PS:
00000000 F2 CD B1 5B 00 00 00 00 FC 0B 4B 24 00 00 00 00 |...[......K$....|
00000010
com.apple.metadata:_kMDItemUserTags:
00000000 62 70 6C 69 73 74 30 30 A0 08 00 00 00 00 00 00 |bplist00........|
00000010 01 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 |................|
00000020 00 00 00 00 00 00 00 00 00 09 |..........|
0000002a
com.apple.metadata:kMDLabel_nzfct3nddxl2ablrfgw6suoak4:
00000000 F2 46 E1 33 A8 F7 49 0F 64 AC 59 96 5B 72 A4 7D |.F.3..I.d.Y.[r.}|
00000010 62 5B F9 1F 76 EE E5 EE AF 56 AC 58 2D 52 33 A7 |b[..v....V.X-R3.|
00000020 00 E0 5D E4 6F B6 08 9B 37 9A D6 04 3B E5 7B 80 |..].o...7...;.{.|
00000030 0E C4 28 6B C2 E3 8D C1 3E 67 E9 FD 7B 1A 37 44 |..(k....>g..{.7D|
00000040 16 4C 37 82 4F C9 BE 9D 07 24 C9 CB 54 CF 21 B3 |.L7.O....$..T.!.|
00000050 D7 70 5E 4A 7D 48 3D 53 05 |.p^J}H=S.|
00000059
Code will open this file as MacRoman.
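The com.apple.TextEncoding values in the xattr output above ("macintosh;0" and "us-ascii;1536") look like an encoding name followed by a numeric ID. As a rough sketch only, a tool running outside the editor could use that attribute as a hint. The helper names below are hypothetical, reading the attribute itself is left out, and the fallback default is an assumption (the UTF-8 test file above had no such attribute).

```python
from typing import Optional, Tuple


def parse_text_encoding(value: str) -> Tuple[str, int]:
    """Split a value such as 'macintosh;0' into (name, numeric id)."""
    name, _, number = value.partition(";")
    return name.strip(), int(number)


def decode_with_hint(data: bytes, attr_value: Optional[str], default: str = "utf-8") -> str:
    """Decode data using the attribute's encoding name, or the default if the attribute is absent."""
    if attr_value is None:
        return data.decode(default)
    name, _ = parse_text_encoding(attr_value)
    # Python accepts "macintosh" as an alias of mac_roman and "us-ascii" as an alias of ascii.
    return data.decode(name)


sample = 'display dialog "äöü"'.encode("mac_roman")
print(parse_text_encoding("us-ascii;1536"))     # ('us-ascii', 1536)
print(decode_with_hint(sample, "macintosh;0"))  # display dialog "äöü"
```

This does not work around the extension API limitation mentioned below; it only shows that the metadata carries enough information to pick a decoder.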
As far as I know, the Code extension API is too limited to change the encoding depending on the contents of a script. The way to restore the old behavior is described in the README.
And I just tried, if you use an emoji inside the Script Editor and save as text, it uses UTF-16 encoding.
@nicolinuxfr I wonder if it's possible to have this extension try opening files as UTF-16, using MacRoman if that fails, and try saving files as MacRoman, using UTF-16 if that fails. Then that would mirror the Script Editor behavior you describe above.
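To make the proposed fallback concrete, here is a hedged Python sketch of the idea, not the extension's code. It makes two assumptions that the thread has not confirmed: that UTF-16 files start with a byte order mark, and that little-endian is the byte order to write. The function names are hypothetical. A BOM check is used instead of "try UTF-16 and see if it fails" because arbitrary MacRoman bytes of even length often decode as UTF-16 without raising an error.

```python
import codecs

UTF16_BOMS = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def decode_script(data: bytes) -> str:
    """Open as UTF-16 when a BOM is present, otherwise as MacRoman."""
    if data.startswith(UTF16_BOMS):
        # The utf-16 codec reads the BOM to choose the byte order and strips it.
        return data.decode("utf-16")
    return data.decode("mac_roman")


def encode_script(text: str) -> bytes:
    """Save as MacRoman when possible, otherwise as UTF-16 LE with a BOM."""
    try:
        return text.encode("mac_roman")
    except UnicodeEncodeError:
        # Assumption: little-endian with a BOM; the thread leaves LE vs BE open.
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


for script in ('display dialog "äöü"', 'display dialog "😀"'):
    saved = encode_script(script)
    print(len(saved), decode_script(saved) == script)
```

Both sample scripts round-trip: the first stays in MacRoman, the second falls back to UTF-16.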
@nicolinuxfr Can you be more specific about the encoding? Is it UTF-16 LE or BE?
I think it's UTF-16 LE; this is what BBEdit says:
If it helps, here's a really small script containing an emoji, saved with Script Editor as text: emojiscript.zip
Script Editor saves .applescript text scripts in the MacRoman encoding. It's similar enough to some other encodings that it can't be auto-detected from the content itself. Example:
- Enable files.autoGuessEncoding.
- Save a file containing x ≥ 1 as MacRoman. (You can either use Script Editor or VSCode.) Close the file after saving.
- Reopen the file in VSCode. It shows x ³ 1.
Adding this in my VSCode settings informs VSCode of the proper encoding for AppleScript files:
It would be nice if this package did this automatically!
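The ≥ versus ³ mix-up in the example can be reproduced with a few lines of Python. This is only an illustration: it assumes the editor guessed a Latin-1-style encoding (Latin-1 is used here; which encoding VSCode actually picked is not stated above).

```python
original = "x ≥ 1"

saved = original.encode("mac_roman")  # "≥" is stored as the single byte B3
guessed = saved.decode("latin-1")     # in Latin-1, byte B3 is the superscript three

print(saved.hex(" "))  # 78 20 b3 20 31
print(guessed)         # x ³ 1
```

A lone byte B3 is valid in many single-byte encodings, so content-based guessing has nothing to tell them apart.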