cajhin / capsicain

Powerful low-level keyboard remapping tool for Windows

First line ignored in utf-8 ini with byte order mark #27

Closed: bitoj closed this issue 3 years ago

bitoj commented 3 years ago

I started with the default capsicain.ini from the v80 distribution and replaced all content with the following minimal configuration (originally to investigate something else).

GLOBAL ActiveConfigOnStartup 2

[CONFIG_1]
OPTION configName wrong

[CONFIG_2]
OPTION configName ok

When starting Capsicain, it produces the following output.

No ini setting for 'GLOBAL activeConfigOnStartup'. Setting default config 1

OPTIONs
off:     debug output for each key event
off:     Z <-> Y
off:     Alt <-> Win for Apple keyboards
off:     Left Control and Win block alpha key mapping ('Ctrl + C is never changed')
off:     Process only the keyboard that sent the first key

ACTIVE CONFIG: 1 = wrong

So the first line is silently ignored: no syntax error is reported, unlike for unrecognized keywords on the second and later lines.

This issue only affects the first line and does not occur after I save the ini as UTF-8 without BOM.

Obvious work-arounds:

Currently a very small issue, but handling the BOM might be more relevant when a future version of Capsicain needs to process non-ASCII configuration data.

cajhin commented 3 years ago

That's good to know, thanks.

Not sure if there is an easy fix. C++ + UTF-8 = mess.

The planned next major change is to split capsicain into a C# app that parses the ini and does GUI/Windows stuff (Windows app instead of console), and a high performance C thread with the core loop that processes the key events.

With C#, all the UTF-8 support comes by default. It's a much much nicer language for certain tasks. Still not 100% through with the architecture though, especially how to log from C to the C# console without ever blocking the C thread.

cajhin commented 3 years ago

Fixed. As before, all characters must be plain ASCII (except in comments, which are dropped). The BOM was already detected (I forgot); now the BOM bytes are stripped instead of dropping the entire first line.
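The behavior described here (strip the BOM from the first line, drop comments, expect plain ASCII in what remains) could be sketched roughly as follows. This is an illustration under assumptions, not the actual fix: the helper names are hypothetical, and `#` is assumed to be the comment marker.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: removes a leading UTF-8 BOM (EF BB BF) if present,
// keeping the rest of the line instead of discarding it.
std::string stripUtf8Bom(const std::string& line) {
    if (line.compare(0, 3, "\xEF\xBB\xBF") == 0) {
        return line.substr(3);
    }
    return line;
}

// Hypothetical helper: drops everything from the first '#' onward.
// Assumption: '#' starts a comment anywhere on the line.
std::string dropComment(const std::string& line) {
    return line.substr(0, line.find('#'));
}

// Returns true if every byte is in the 7-bit ASCII range.
bool isPlainAscii(const std::string& text) {
    return std::all_of(text.begin(), text.end(),
                       [](unsigned char c) { return c <= 0x7F; });
}
```

With helpers like these, the first line of the file would be passed through `stripUtf8Bom` once, and each line through `dropComment` before any ASCII check, so non-ASCII characters in comments do no harm.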

bitoj commented 3 years ago

Confirmed.

In some places any byte sequence seems to work, more or less, even if it is not ASCII:

GLOBAL ActiveConfigOnStartup 2
GLOBAL IniVersion Óñë

[CONFIG_2]
OPTION configName Éxämplè
INCLUDE T€st

[T€st]
COMBO E [..T.] > altchar(0128) # €

is accepted, with status report

Capsicain version: 89
ini version: Óñë
active config: 2 = Éxämplè

The Euro combo works.