keyboardio / Kaleidoscope

Firmware for Keyboardio keyboards and other keyboards with AVR or ARM MCUs.
http://keyboard.io
GNU General Public License v3.0
757 stars 258 forks source link

Using Key as case label #309

Closed noseglasses closed 4 years ago

noseglasses commented 6 years ago

I just stumbled over another thing with the Key union that might be optimized.

Currently it is not possible to use constexpr Key as case label in switch-statements. The reason is that there is no way to auto-cast to unsigned int.

switch(k) {
   case Key_A:
      ...
      break;
error: could not convert 'Key{Key_::<anonymous struct>{4u, 0u}}' from 'Key' to 'unsigned int'

An attempt to use e.g. Key_A.raw instead for a case statement that tests for key A will fail as the constexpr union is initialized via the keyCode/flags pair and not the raw member, which the compiler does not like either.

switch(k.raw) {
   case Key_A.raw:
      ...
      break;
error: accessing 'Key_::raw' member instead of initialized 'Key_::<anonymous>' member in constant expression

The only solution I found, but one that is brittle is to use

switch(k.raw) {
   case (Key_A.keyCode << 8) | (Key_A.flags: 
      ...
      break;

The problem with that is that it will fail for those Key's that are initialized directly via the raw member as the compiler will again complain.

I am not sure it this can be remedied by defining a constexpr cast operator in class Key that casts to uint16_t.

@jwakely: Do you have any ideas if this can be solved?

jwakely commented 6 years ago

Stop using unions and macros and use modern C++ features instead? :-)

If Key_A is defined as a proper constant, not a macro that creates a temporary:

constexpr Key Key_A{  HID_KEYBOARD_A_AND_A, KEY_FLAGS };

and the union is replaced by a struct with a simple constructor that does the work of combining two uint8_t values into a single uint16_t value:

struct Key {

  constexpr Key(uint16_t raw) : raw(raw) { }

  constexpr Key(uint8_t keyCode, uint8_t flags)
  : raw((flags << 8) | keyCode) { }

  constexpr uint8_t keyCode() const { return raw & 0xff; }
  constexpr uint8_t flags() const { return raw >> 8; }

  uint16_t raw;

  // comparison operators ...
};

then you have a proper constant that doesn't rely on unions and can use it in a case label:

switch (key.raw) {
case KeyA.raw:
  // ...

You could add a conversion operator to uint16_t to allow using the values directly, instead of the .raw member, but I think that's a bad idea. The minor convenience of some syntactic sugar in switch statements should not overrule type safety. A Key is not just an integer, it shouldn't convert to one implicitly.

noseglasses commented 6 years ago

That's exactly the answer I expected :-) And to be frank, it is what I thought, too. I hope that the more people opt for a replacement of that vicious union, the earlier it will be replaced ;-)

I thought that there might be some clever workaround that I overlooked and that I could use for the time being to fool the compiler to let me read the .raw member from a constexpr Key that was actually initialized through .keyCode and .flags.

This issue is strongly related to #272.

noseglasses commented 6 years ago

The minor convenience of some syntactic sugar in switch statements should not overrule type safety. A Key is not just an integer, it shouldn't convert to one implicitly.

Agreed. I would be perfectly find with writing something like

case Key_A.raw:
   ...
jwakely commented 6 years ago

I thought that there might be some clever workaround

I don't think so. Unions are not really compatible with constant expressions.

As a further refinement, the raw member could be made private (and e.g. renamed to m_raw or raw_) and a const accessor function added. If that accessor was constexpr you could still use it in switches:

  switch(key.raw()) {
  case Key_A.raw():
noseglasses commented 6 years ago

Ok, thanks.

gedankenexperimenter commented 6 years ago

It's a bit disconcerting that the compiler objects to that usage in a case label, but not elsewhere (I get no errors when using it in a if statement, for instance. I can't think of a reason to disallow that usage, and a union really does describe the Key object – even moreso if combined with bit fields. One of the members of the Key union could be:

struct {
  uint16_t keycode : 8, mods : 5, meta : 3;
} keyboard;

Having one such bit field for each Key variant that Kaleidoscope understands (Consumer Control, System Control, Mouse, Layer changes, plugins) can make the code much clearer, and possibly prevent some programming errors. But if the Key type is a class or struct, I don't see any way to replicate that clarity, especially inside the definition of the type itself. Derived classes seem like an obvious solution, but not viable because the object must be limited to 2 bytes, which means virtual functions can't be allowed.

@jwakely : is there some alternative that I'm not seeing that could replicate the simplicity and clarity of using the bitfields, but without using a union?

jwakely commented 6 years ago

It's a bit disconcerting that the compiler objects to that usage in a case label, but not elsewhere

Why? Case labels must be compile-time constants of integral type, an if is a far more general (run-time) check.

and a union really does describe the Key object

Except that now you have implicit dependencies on the endianness of the hardware. I know the Model 01 hardware is fixed, but the code can still be made to work equivalently for big endian or little endian.

But if the Key type is a class or struct, I don't see any way to replicate that clarity, especially inside the definition of the type itself.

What's wrong with accessor functions, as I showed above?

struct Key {

  constexpr Key(uint16_t raw) : raw(raw) { }

  constexpr Key(uint8_t keyCode, uint8_t flags)
  : raw((flags << 8) | keyCode) { }

  constexpr uint8_t keyCode() const { return raw & 0xff; }
  constexpr uint8_t flags() const { return raw >> 8; }

  constexpr uint8_t mods() const { return flags() >> 3; }
  constexpr uint8_t meta() const { return flags() & 0x7; }

  uint16_t raw;

  // comparison operators ...
};

Or to support different bitfield layouts, provide "view" types that allow accessing the key with different interfaces:

struct Key {
    // ...

    struct Keyboard {
        Key key;
        constexpr uint8_t keyCode() const { retrn key.keyCode(); }
        constexpr uint8_t mods() const { return key.flags() >> 3; }
        constexpr uint8_t meta() const { return key.flags() & 0x7; }
    };

    Keyboard asKeyboard() const { reutrn { *this }; }

    struct SomethingElse {
        Key key;
        constexpr uint8_t someOtherProperty() const { return key.flags() & 0xf; }
    };

    SomethingElse asSomethingElse() const { return { *this }; }
};

Derived classes seem like an obvious solution, but not viable because the object must be limited to 2 bytes, which means virtual functions can't be allowed.

Derived classes don't have to have virtual functions.

There are lots of ways to solve this that are better than unions.

noseglasses commented 6 years ago

I am working with tons of different C++ libraries and luckily very few of them rely on unions. But every time I encounter unions they just cause trouble. The problem is that they appear so useful and tempting at first glance...

gedankenexperimenter commented 6 years ago

Why? Case labels must be compile-time constants of integral type, an if is a far more general (run-time) check.

Because I'm not that familiar with all the hidden intricacies of C++. Also because even if Key_A is a constexpr compile-time constant, whether or not I can use Key_A.raw as a case label depends on which union member of Key_A was assigned to when it was created, but the compiler will accept a comparison of any of the union members in an if test. I can't think of any reason for the compiler to reject the one but accept the other.

Except that now you have implicit dependencies on the endianness of the hardware. I know the Model 01 hardware is fixed, but the code can still be made to work equivalently for big endian or little endian.

Correct me if I'm wrong, but if the union members were all uint16_t bit fields instead of one uint16_t and a two-member uint8_t struct, endianness would not be an issue. Using bit fields would be really helpful with at least one type of Key – Consumer Control:

struct {
  uint16_t keycode : 10, meta : 6;
} consumer;

What's wrong with accessor functions, as I showed above?

Nothing's wrong with those accessor functions, but they lack the simple clarity of the bit field definition. It would make it much easier to avoid errors like the one in your example:

  constexpr uint8_t mods() const { return flags() >> 3; }
  constexpr uint8_t meta() const { return flags() & 0x7; }

That's incorrect, and requires much more thought to figure out what bits correspond to what values, and how big each bit field is. It should be:

  constexpr uint8_t mods() const { return flags() & 0x1F; } // low 5 bits
  constexpr uint8_t meta() const { return flags() >> 5; } // high 3 bits

That wouldn't require careful thought in order to get it right with a bit field, and it would be immediately clear to someone unfamiliar with the code what data was being represented, and how. The mods member requires decoding to figure out that it's five bits. The meta member requires subtraction. Both of those things are easy for humans to get wrong.

Your second example might be the answer I was looking for, but I'm not sure; it seems like it would require copying the object in order to use it in the way I'm hoping for (those secondary structs would have the bit fields), so I think it would be less efficient than the union?

I'm not trying to argue that unions are great and wonderful here. Obviously, there are problems with them (though from my limited experience, those problems seem to be arbitrarily created by the compiler, rather than intrinsic). But I also haven't seen an example that solves those problems without introducing other ones.

jwakely commented 6 years ago

I can't think of any reason for the compiler to reject the one but accept the other.

How Kaleidoscope uses unions is actually undefined behaviour according to both C and C++, but allowed by GCC as an extension (see the description of type-punning at https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fstrict-aliasing for details).

According to the language standards unions are not for reinterpreting one data type as a different data type. It's for a type that can contain either one type, or the other, but not both simultaneously (as you're doing when you write a uint16_t values and read two uint8_t values). Type-punning relies on non-portable assumptions and is not allowed in constexpr evaluations.

if the union members were all uint16_t bit fields instead of one uint16_t and a two-member uint8_t struct, endianness would not be an issue

Yes, but why do you even need a union in that case? If you just want to be able to access specific bits of a uint16_t then, y'know, do that. Bit-fields are OK, but lose the union.

It would make it much easier to avoid errors like the one in your example

If it matters which bits are which, then yes, there was an error. I was assuming as long as your getters and setters are consistent about which bits are which then it's OK, but of course in this situation the rights bits need to be used for the right purpose (sorry).

The mods member requires decoding to figure out that it's five bits. The meta member requires subtraction. Both of those things are easy for humans to get wrong.

That's why you create a clear API that gets it right, you write tests to check it and keep it right, and then you use the API.

it seems like it would require copying the object in order to use it in the way I'm hoping for (those secondary structs would have the bit fields), so I think it would be less efficient than the union?

Unless you're also worried the compiler can't efficiently copy a uint16_t then you should stop worrying. As far as the compiler is concerned the types in my examples are trivially-copyable just like an integer. In other words, this:

Keyboard asKeyboard() const { return { *this }; }

has exactly the same performance as:

uint16_t get() const { return this->raw; }

The compiler generates the same code for returning a struct containing a uint16_t as for returning a uint16_t (https://godbolt.org/g/8avxSu)

I'm not trying to argue that unions are great and wonderful here. Obviously, there are problems with them (though from my limited experience, those problems seem to be arbitrarily created by the compiler, rather than intrinsic).

The compiler is letting you do more with unions here than the language standard allows. The problems are intrinsic to unions.

If you want the additional checking that comes from static assertions, compile-time evaluation, and better type-safety then stop using unions.

obra commented 6 years ago

(I feel like I should mention that the existing implementation @jwakely is fixing here was the result of me, an inexperienced C++ programmer, cargo-culting without really understanding what he was doing.)

gedankenexperimenter commented 6 years ago

The compiler is letting you do more with unions here than the language standard allows. The problems are intrinsic to unions.

My mistake. The language standard could clearly allow unions to work in the way we want them to, but has instead rendered them all but useless. The compiler has made them work almost the way we want them to, but not quite, turning them into a trap for the inexperienced.

Unless you're also worried the compiler can't efficiently copy a uint16_t then you should stop worrying. As far as the compiler is concerned the types in my examples are trivially-copyable just like an integer.

…but in order to access data from the Key structure as any of its actual types, we now need to copy it first, whereas that copy wasn't necessary before? I really can't tell, but it seems like there must be extra steps involved.

According to the language standards unions are not for reinterpreting one data type as a different data type. It's for a type that can contain either one type, or the other, but not both simultaneously (as you're doing when you write a uint16_t values and read two uint8_t values). Type-punning relies on non-portable assumptions and is not allowed in constexpr evaluations.

From a logical perspective, what I want to do with the Key object isn't type punning. I want to use certain bits to determine which variant of a Key a given object is, and treat it accordingly. (Those bits would be meta bits in my examples, and are different lengths based on the different variants, which is why the order of the members is important in those examples). Ideally, this would be done with derived classes that re-define the base class data member raw as a variant-specific bit field, but as far as I can tell, C++ doesn't provide a way to do that except by explicit bit-twiddling, which I'd prefer not to use because it's confusing and difficult to maintain (compared to actual bit fields).

It seems that unions can't be used (by strict language standard) to do this without recording something extra for every instance to let us know which variant the union is storing, which makes them useless on an embedded device with very limited memory, the whole point being to store information in a space-efficient way. If we rely on the non-standard gcc extension to determine the Key variant, it should still be possible to use a union (even with case labels, because we know which variant Key_A is, so we can use the correct accessor, but only after we check the variant). But that's still dangerous because of the "undefined behaviour", and not 100% portable. Have I got that right?

The alternative is to do all the bit-twiddling that bit fields handle for us in a set of classes, one for each Key variant. We still need to store Key objects in an array of that type, however, and we can't use virtual functions because that would increase the size. If I understand correctly, all of the options have serious drawbacks, and it's a matter of opinion which drawbacks are the worst.

jwakely commented 6 years ago

but in order to access data from the Key structure as any of its actual types, we now need to copy it first, whereas that copy wasn't necessary before? I really can't tell, but it seems like there must be extra steps involved.

The copy can be avoided, but doing so won't improve clarity or performance.

gedankenexperimenter commented 6 years ago

The copy can be avoided, but doing so won't improve clarity or performance.

Whether or not it will improve clarity is subjective, and we may reasonable disagree on that point.

I believe your claim that it won't improve performance to avoid copying the data before reading it, but I can't see how that could be true. I'd rather understand it than simply trusting the expert.

noseglasses commented 6 years ago

Guys, your discussion about whether or not a copy is acceptable is pointless.

If the accessors are inlined like in jwakely's example, the compiler can optimize the apparent copy away.

I tried three different versions of a simple access test (see below). I had to slightly modify the original example to get it to compile and I generated another version that is close to the original in terms of having a Key as as member of a class Keyboard2 that is not a inner class of Key.

All three version (compiled with g++ for x86_64) yield the exact same executable (compared with diff). I also added the disassembly of the main(...) function generated.

Given the abilities of todays compilers to optimize stuff, readability and simplicity of user code remain the most important targets. That's why an implementation without unions is preferable. And views are great to wrap the different data representations of actual keys. The view implementation uses composition which is a quite transparent approach. Derived key classes in contrast would IMHO be an abuse of a language feature and less readable and straight forward.

#include <cstdint>

class Key {

   private:

      uint8_t keyCode_;
      uint8_t flags_;

   public:

      constexpr Key(uint8_t keyCode, uint8_t flags) : keyCode_(keyCode), flags_(flags) {}

    struct Keyboard {
        // Member key cannot be of incomplete type Key. Therefore, I made it const Key &key.
        const Key &key;
        constexpr uint8_t keyCode() const { return key.keyCode(); }
        constexpr uint8_t mods() const { return key.flags() >> 3; }
        constexpr uint8_t meta() const { return key.flags() & 0x7; }
    };

    Keyboard asKeyboard() const { return { *this }; }

    constexpr uint8_t keyCode() const { return keyCode_; }
    constexpr uint8_t flags() const { return flags_; }

};

struct Keyboard2 {
   Key key; // Here we can use Key directly
   constexpr uint8_t keyCode() const { return key.keyCode(); }
   constexpr uint8_t mods() const { return key.flags() >> 3; }
   constexpr uint8_t meta() const { return key.flags() & 0x7; }
};

constexpr inline
Keyboard2 asKeyboard(const Key &k) { return { k }; }

int main(int argc, char **argv) {

   Key aKey(argc%255, (argc + 13)%255);

   // Version 1 - external view class
   // return asKeyboard(aKey).keyCode() + asKeyboard(aKey).mods();

   // Version 2 - inner view class
   // return aKey.asKeyboard().keyCode() + aKey.asKeyboard().mods();

   // Version 3 - no view
   // return aKey.keyCode() + (aKey.flags() >> 3);

   // Compiled with
   //    g++ (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
   //
   // g++ -std=c++11 -o facade_test facade_test.cpp

   // Version 1
   // 8496 bytes

// 0000000000000560 <main>:
//  560:   8d 47 0d                lea    0xd(%rdi),%eax
//  563:   48 63 d0                movslq %eax,%rdx
//  566:   89 c1                   mov    %eax,%ecx
//  568:   48 69 d2 81 80 80 80    imul   $0xffffffff80808081,%rdx,%rdx
//  56f:   c1 f9 1f                sar    $0x1f,%ecx
//  572:   48 c1 ea 20             shr    $0x20,%rdx
//  576:   01 c2                   add    %eax,%edx
//  578:   c1 fa 07                sar    $0x7,%edx
//  57b:   29 ca                   sub    %ecx,%edx
//  57d:   89 d1                   mov    %edx,%ecx
//  57f:   c1 e1 08                shl    $0x8,%ecx
//  582:   29 d1                   sub    %edx,%ecx
//  584:   48 63 d7                movslq %edi,%rdx
//  587:   48 69 d2 81 80 80 80    imul   $0xffffffff80808081,%rdx,%rdx
//  58e:   29 c8                   sub    %ecx,%eax
//  590:   89 f9                   mov    %edi,%ecx
//  592:   c1 f9 1f                sar    $0x1f,%ecx
//  595:   c1 f8 03                sar    $0x3,%eax
//  598:   83 e0 1f                and    $0x1f,%eax
//  59b:   48 c1 ea 20             shr    $0x20,%rdx
//  59f:   01 fa                   add    %edi,%edx
//  5a1:   c1 fa 07                sar    $0x7,%edx
//  5a4:   29 ca                   sub    %ecx,%edx
//  5a6:   01 d7                   add    %edx,%edi
//  5a8:   40 0f b6 ff             movzbl %dil,%edi
//  5ac:   01 f8                   add    %edi,%eax
//  5ae:   c3                      retq   
//  5af:   90                      nop

//    return asKeyboard(aKey).keyCode() + asKeyboard(aKey).mods();

   // Version 2
   // 8496 bytes

// 0000000000000560 <main>:
//  560:   8d 47 0d                lea    0xd(%rdi),%eax
//  563:   48 63 d0                movslq %eax,%rdx
//  566:   89 c1                   mov    %eax,%ecx
//  568:   48 69 d2 81 80 80 80    imul   $0xffffffff80808081,%rdx,%rdx
//  56f:   c1 f9 1f                sar    $0x1f,%ecx
//  572:   48 c1 ea 20             shr    $0x20,%rdx
//  576:   01 c2                   add    %eax,%edx
//  578:   c1 fa 07                sar    $0x7,%edx
//  57b:   29 ca                   sub    %ecx,%edx
//  57d:   89 d1                   mov    %edx,%ecx
//  57f:   c1 e1 08                shl    $0x8,%ecx
//  582:   29 d1                   sub    %edx,%ecx
//  584:   48 63 d7                movslq %edi,%rdx
//  587:   48 69 d2 81 80 80 80    imul   $0xffffffff80808081,%rdx,%rdx
//  58e:   29 c8                   sub    %ecx,%eax
//  590:   89 f9                   mov    %edi,%ecx
//  592:   c1 f9 1f                sar    $0x1f,%ecx
//  595:   c1 f8 03                sar    $0x3,%eax
//  598:   83 e0 1f                and    $0x1f,%eax
//  59b:   48 c1 ea 20             shr    $0x20,%rdx
//  59f:   01 fa                   add    %edi,%edx
//  5a1:   c1 fa 07                sar    $0x7,%edx
//  5a4:   29 ca                   sub    %ecx,%edx
//  5a6:   01 d7                   add    %edx,%edi
//  5a8:   40 0f b6 ff             movzbl %dil,%edi
//  5ac:   01 f8                   add    %edi,%eax
//  5ae:   c3                      retq   
//  5af:   90    
//  
//    return aKey.asKeyboard().keyCode() + aKey.asKeyboard().mods();

   // Version 3
   // 8496 bytes

//    0000000000000560 <main>:
//  560:   8d 47 0d                lea    0xd(%rdi),%eax
//  563:   48 63 d0                movslq %eax,%rdx
//  566:   89 c1                   mov    %eax,%ecx
//  568:   48 69 d2 81 80 80 80    imul   $0xffffffff80808081,%rdx,%rdx
//  56f:   c1 f9 1f                sar    $0x1f,%ecx
//  572:   48 c1 ea 20             shr    $0x20,%rdx
//  576:   01 c2                   add    %eax,%edx
//  578:   c1 fa 07                sar    $0x7,%edx
//  57b:   29 ca                   sub    %ecx,%edx
//  57d:   89 d1                   mov    %edx,%ecx
//  57f:   c1 e1 08                shl    $0x8,%ecx
//  582:   29 d1                   sub    %edx,%ecx
//  584:   48 63 d7                movslq %edi,%rdx
//  587:   48 69 d2 81 80 80 80    imul   $0xffffffff80808081,%rdx,%rdx
//  58e:   29 c8                   sub    %ecx,%eax
//  590:   89 f9                   mov    %edi,%ecx
//  592:   c1 f9 1f                sar    $0x1f,%ecx
//  595:   c1 f8 03                sar    $0x3,%eax
//  598:   83 e0 1f                and    $0x1f,%eax
//  59b:   48 c1 ea 20             shr    $0x20,%rdx
//  59f:   01 fa                   add    %edi,%edx
//  5a1:   c1 fa 07                sar    $0x7,%edx
//  5a4:   29 ca                   sub    %ecx,%edx
//  5a6:   01 d7                   add    %edx,%edi
//  5a8:   40 0f b6 ff             movzbl %dil,%edi
//  5ac:   01 f8                   add    %edi,%eax
//  5ae:   c3                      retq   
//  5af:   90                      nop

//    return aKey.keyCode() + (aKey.flags() >> 3);
}
noseglasses commented 6 years ago

Just noticed that the compile flags I wrote in the last post are wrong. They were actually

g++ -std=c++11 -O3 -o facade_test facade_test.cpp

The -O3 is important. Without it the results differ.

jwakely commented 6 years ago

-O1 or -Os should work too.

obra commented 6 years ago

I'm not comfortable making decisions about this stuff based on the output of the latest version of gcc for x86 or x86-64.

I would -hope- that the results are the same, but I've been bitten before.

As painful as it is, avr-gcc 4.9.2 compiled with -Os is the reference compiler for the only production platform for Kaleidoscope at this point. I do hope that that changes, as compilers have gotten noticeably better since gcc 4.9.x was released four years ago.

jwakely commented 6 years ago

AVR GCC 4.6.4 -Os -std=c++0x (and defining constexpr to an empty string, because GCC 4.6 doesn't support it completely) shows the code is the same for each versions 1 and 2, but not version 3: https://godbolt.org/g/JhbCZC

gedankenexperimenter commented 6 years ago

Given the abilities of todays compilers to optimize stuff, readability and simplicity of user code remain the most important targets.

I absolutely agree with this statement, but the problem is that readability and simplicity are subjective measures that depend on the human beings who are looking at that code, and what's readable and simple to someone who has spent decades writing and thinking about C++ code is not necessarily readable and simple to someone who hasn't.

gedankenexperimenter commented 6 years ago

Would it be possible to use reinterpret_cast to treat a Key object with just one uint16_t data member to a struct like the following?

struct KeyboardKey {
  uint16_t keycode : 8, mods : 5, variant : 3;
};
obra commented 6 years ago

I brought up a local instance of Compiler Explorer and pointed it at the reference compiler.

Using the compilation options defined by Arduino:

-c -g -O -std=gnu++11 -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -MMD

f1, f2 and f3 all generate the exact same assembly:

f3(Key const&): mov r30,r24 mov r31,r25 ld r18,Z ldd r24,Z+1 ldi r25,0 ldi r19,3 1: asr r25 ror r24 dec r19 brne 1b add r24,r18 adc r25,__zero_reg__ ret

jwakely commented 6 years ago

Would it be possible to use reinterpret_cast to treat a Key object with just one uint16_t data member to a struct like the following?

No. That would violate "strict aliasing".

The right way is to define conversions between the types (ideally explicit conversions, so you know when they're happening) and rely on the compiler to make it efficient.

gedankenexperimenter commented 6 years ago

The right way is to define conversions between the types (ideally explicit conversions, so you know when they're happening) and rely on the compiler to make it efficient.

In other words, the "right" way requires obtuse bit-twiddling code. Disappointing.

obra commented 6 years ago

In other words, the "right" way requires obtuse bit-twiddling code. Disappointing.

There's a lot about any programming language or compiler that's disappointing. If we encapsulate and document it, we should be ok.

gedankenexperimenter commented 6 years ago

There's a lot about any programming language or compiler that's disappointing. If we encapsulate and document it, we should be ok.

Oh, certainly. If C++ didn't have unions or bit fields, I wouldn't be feeling disappointed that the standard arbitrarily renders them useless by refusing to allow usage that it so obviously could allow.

jwakely commented 6 years ago

What's useless about bit-fields? Why can't you combine the "view" approach with bit-fields?

I'm arguing against unions for type-punning and reinterpret_cast, because both are ways of bypassing the language rules to say "I know what I'm doing and I want to break the rules". If you want to write in assembly then do that, don't complain that a higher level language isn't assembly.

gedankenexperimenter commented 6 years ago

The reason I say that the bit fields are useless is that it's still necessary to write the hard-to-follow, easy-to-get-wrong bit twiddling code in order to convert the generic integer-based type to the type with bit fields (and back). The whole point of having the bit fields is to avoid having to do that, so what good are they?

What I want isn't a lower level language — quite the opposite. I'm disappointed in the choices made by the writers of the rules, because those rules could have been written so that we would have a clear, reliable solution to this problem, but instead just laid a trap for us to fall into.

noseglasses commented 6 years ago

Why can't you combine the "view" approach with bit-fields?

@jwakely: Could you provide an example how that would look like?

noseglasses commented 6 years ago

I'm not comfortable making decisions about this stuff based on the output of the latest version of gcc for x86 or x86-64. I would -hope- that the results are the same, but I've been bitten before.

I just came up with the test to show what compilers are generally capable off. You are perfectly right that this is better tested on the target platform. If I had been aware of Compiler-Explorer, I would have done so. Wow, this tool looks incredibly useful!

jwakely commented 6 years ago

Could you provide an example how that would look like?

Maybe something like:

struct Keyboard2 {
    uint8_t keyCode_;
    uint8_t mods_ : 5;
    uint8_t meta_ : 3;
    constexpr Keyboard2(Key key)
    : keyCode_(key.keyCode()), mods_(key.flags() >> 3), meta_(key.flags() & 0x7)
    { }
    constexpr uint8_t keyCode() const { return keyCode_; }
    constexpr uint8_t mods() const { return mods_; }
    constexpr uint8_t meta() const { return meta_; }
};

(Or get rid of the accessors and refer to the bit-field members directly, according to preference)

Yes, you have to do some bit-twiddling, but it's in one place. Write it, test it, use it and forget about the implementation details. It's not exactly mind-bending complicated stuff anyway.

You can even make some tests run at compile-time:

static_assert( Keyboard2(Key(13, 59)).keyCode() == 13, "" );
static_assert( Keyboard2(Key(13, 59)).mods() == 7, "" );
static_assert( Keyboard2(Key(13, 59)).meta() == 3, "" );
gedankenexperimenter commented 6 years ago

Once I got a chance to actually sit down and try writing it up, I discovered I had been a little bit more pessimistic than I should have been. I came up with something similar last night:

struct Key {
  uint16_t raw;
  static constexpr uint8_t keyboard_id { 0b000    };
  static constexpr uint8_t consumer_id { 0b001000 };
};

struct KeyboardKey {
  uint16_t keycode : 8, mods : 5, id : 3;

  constexpr
  KeyboardKey() : keycode(0), mods(0), id(Key::keyboard_id) {}

  constexpr
  KeyboardKey(uint8_t _keycode, uint8_t _mods)
    : keycode(_keycode), mods(_mods), id(Key::keyboard_id) {}

  explicit
  KeyboardKey(Key key) : keycode (key.raw           ),
                         mods    (key.raw >>  8     ),
                         id      (key.raw >> (8 + 5))  {
    assert(id == Key::keyboard_id);
  }

  constexpr
  operator Key() const {
    return { static_cast<uint16_t>(keycode            |
                                   mods    <<  8      |
                                   id      << (8 + 5)   ) };
  }
};

struct ConsumerKey {
  uint16_t keycode : 10, id : 6;

  constexpr
  ConsumerKey() : keycode(0), id(Key::consumer_id) {}

  constexpr
  ConsumerKey(uint16_t _keycode) : keycode(_keycode), id(Key::consumer_id) {}

  explicit
  ConsumerKey(Key key) : keycode (key.raw      ),
                         id      (key.raw >> 10)  {
    assert(id == Key::keyboard_id);
  }

  constexpr
  operator Key() const {
    return { static_cast<uint16_t>(keycode       |
                                   id      << 10   ) };
  }
};

In general, I think we want to define the constants like this:

constexpr KeyboardKey Key_A{ 0x04, 0};

The implicit cast from KeyboardKey, et al to Key makes it possible to define the keymap array, and the run-time constructor with the Key type parameter is useful once the correct type has been determined. I think this might also be useful for plugin-specific Key variants.

I'm still not thrilled about it, but using the bit fields this way means simply doing bit shifts, using the same numbers that are used in defining those bit fields, and if there are more than three fields (I've got a few that would do this), the shift could be (key.raw << (8 + 3 + 2)) to make it really clear.

gedankenexperimenter commented 6 years ago

I'm also decidedly leaning toward a preference for defining Key types as classes, not structs, and using accessor functions. That would protect those id members from meddling, and make implementation changes in the future easier.

algernon commented 6 years ago

So... if I understand correctly, with different classes for different kinds of keys, on the keymap, I'd still end up with uint16_ts, and it would be up for the code to figure out which class to use? I'm not sure I like that... not unless we have a SomeSpecialKey::isInstanceOf() or somesuch, that takes an uint16_t, and returns true if the code describes a SomeSpecialKey.

However, this is just a first "feeling". I haven't tried to use the proposed classes yet - doing so may very well change my mind.

gedankenexperimenter commented 6 years ago

That's basically what I've done in my experimental fork, though I had the variant test function take a Key object as its argument, not a raw uint16_t. I also implemented a Key::type() function that returns a KeyType enum class value, so it's possible to write this, which I thought looked nicer:

Key key;
switch (key.type()) {
  case KeyType::keyboard:
    Key::Keyboard keyboard_key{key};
    // keyboard_key stuff…
  case KeyType::layer:
    // et cetera

…but that turned out to produce a significantly larger binary, and run less efficiently than using if blocks and each type's instance-test function directly.

I'm not thrilled with the end result, but it does avoid the trouble with unions not being allowed to be used the way we want.

noseglasses commented 6 years ago

I'm not sure I like that... not unless we have a SomeSpecialKey::isInstanceOf() or somesuch

Only possible if you use more than two bytes per key or reserve bits within the two bytes for the type information. The latter would be a poorly limited approach.

I wouldn't do that either. This discussion originally was mostly about handling the existing bit information safely.

algernon commented 5 years ago

Mkay, I'm getting back to this issue, and I think I agree with @jwakely and @obra, that we do not want to use Key in the case label directly. Key.raw or Key.raw() is fine, but Key? Nah. I'm also not sold on the idea of having many derived classes from Key.

On the other hand there's been plenty of very good discussion in here about unions and their problems. We'll do something about that. Though #527, and the PR I'm building on top of it that converts all brace-initialisations to use a constructor goes a long way already (and remains backwards compatible).

I'll go and re-read all of the comments again to see how to proceed here.

noseglasses commented 5 years ago

The original problem remains (see here). Already the attempt to instantiate a static constexpr Key instance makes the compiler crash. That's likely related to the wrapped union.

See here for a slightly modified version of the Key class that works as expected, does away with the union and makes the compiler happy.

The modified version introduces accessors .keyCode(), .flags() and .raw(). This means breaking changes. Those are, however, easy to fix by adding () behind any .keyCode, .flags and .raw if applied to a keycode, something that can pretty easily be done using automatic text replacement.

I volunteer for providing a PR.

obra commented 5 years ago

@noseglasses:

Is there anything we can do to either make the old stuff warn nicely, at least? I'm a little worried about breaking user code.

noseglasses commented 5 years ago

Is there anything we can do to either make the old stuff warn nicely, at least? I'm a little worried about breaking user code.

I'm not sure how we could warn. As a union-free Key-implementation would not have those original data members anymore, there's nothing like deprecation warnings that we could use.

One could come up with the idea of a horrible hack and define macros raw, keyCode and flags that translate to members of the new Key class that generate warnings, like

#define raw rawWithDeprecationWarning()

class Key {
   ...
   DEPREACATED(...)
   uint16_t rawWithDeprecationWarning() { return raw_; }
};

But that is a no-go because everything that is named raw would be replaced by the macro.

So there's really nothing we can do. Either we keep on living the union or we fix the Key class and accept those breaking changes.

obra commented 5 years ago

...or we come up with new names instead of keyCode, raw and flags? ᐧ

On Thu, Mar 28, 2019 at 4:56 PM noseglasses notifications@github.com wrote:

Is there anything we can do to either make the old stuff warn nicely, at least? I'm a little worried about breaking user code.

I'm not sure how we could warn. As a union-free Key-implementation would not have those original data members anymore, there's nothing like deprecation warnings that we could use.

One could come up with the idea of a horrible hack and define macros raw, keyCode and flags that translate to members of the new Key class that generate warnings, like

define raw rawWithDeprecationWarning()

class Key { ... DEPREACATED(...) uint16t rawWithDeprecationWarning() { return raw; } };

But that is a no-go because everything that is named raw would be replaced by the macro.

So there's really nothing we can do. Either we keep on living the union or we fix the Key class and accept those breaking changes.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/keyboardio/Kaleidoscope/issues/309#issuecomment-477813493, or mute the thread https://github.com/notifications/unsubscribe-auth/AACxaPnTex-PcB8e5eN8a3C121L3p01nks5vbVa3gaJpZM4Sv43s .

noseglasses commented 5 years ago

But that still means a breaking change as there will be no class anymore that supports the old names once the union is gone. Or did I misunderstand your idea with the new names?

noseglasses commented 4 years ago

Closing this as Key has recently been converted to a proper C++ class.