real-logic / simple-binary-encoding

Simple Binary Encoding (SBE) - High Performance Message Codec
Apache License 2.0
3.07k stars 520 forks source link

Interface hierarchy for generated types #349

Closed benalexau closed 8 years ago

benalexau commented 8 years ago

SBE Tool currently generates MessageEncoder and MessageDecoder interfaces in the sbe:messageSchema-defined package. The tool also ensures that flyweights generated for each sbe:message implement either MessageEncoder or MessageDecoder.

I am currently developing a library for storing SBE data in the LMDB embedded key-value store (via LMDBJNI), but have encountered several difficulties in evolving this into a general-purpose library of value to others:

To explore what might be feasible, I have produce a proof of concept project. It shows:

I am not advocating for this exact solution, but rather seeking discussion on whether pull requests that modify SBE Tool and Agrona to achieve something along these lines would be reasonable. Of course I’m happy to prepare PRs, but preferred some initial discussion first.

As an aside, I’ll be happy to release the resulting LMDB solution if we can resolve the third-party referencing challenges mentioned above. I expect it will be useful for SBE users seeking fast, embedded, key-ordered storage for their SBE types. The introduction of interfaces such as these would also enable others to build similar SBE-aware libraries.

mjpt777 commented 8 years ago

Would a simple solution be to optionally generate the schema specific interface?

benalexau commented 8 years ago

@mjpt777, do you mean SBE Tool having an option to generate the org.agrona.sbe package and associated interfaces in the user's sbe.output.dir? It would work, but it feels unusual to generate static .java files versus just putting them in a versioned JAR dependency. Or do you mean something else?

mjpt777 commented 8 years ago

It could be as simple as an option to not generate any interfaces?

benalexau commented 8 years ago

Adding an sbe.generate.interfaces (?) option to SBE Tool has some advantages:

If you are happy with this, I can produce a PR. Or if you have other comments on the interface hierarchy etc, please feel free.

mjpt777 commented 8 years ago

Yes. I think it is the best way to begin with. We can then have time to discuss what makes the most sense longer term.

BTW I think it is great that SBE gets used with LMDB :-)

benalexau commented 8 years ago

Issue links:

mjpt777 commented 8 years ago

Adding the empty Structure interfaces adds a lot of complexity to this and the JavaDoc does not help explain why. Could you either update the JavaDoc to include an example of why these are need to consider dropping this interfaces and just using the Flyweights/Encoders/Decoders interfaces?

benalexau commented 8 years ago

I'm happy to provide a PR with more JavaDocs, but if you are finding Structure adding a lot complexity it's perhaps appropriate to consider whether it's worth keeping at all.

My Java annotation processor generates (at compile time) concrete DAO implementations that persist SBE types to LMDB. Here is an example:

import baseline.CarStructure;
import baseline.EngineStructure;

@GenerateDao(maxMessageLength = 2048, key = EngineStructure.class, value = CarStructure.class)
public interface CarDao {
}

As shown, @GenerateDao relies on Structure so it can express the key and value types, with the former constrained to a CompositeStructure and the latter to a MessageStructure (we don't want the SBE message header for a key, as it would impact LMDB key sort order).

In reality I am having to use string concatenation to arrive at the encoder and decoder types anyway within the annotation processor:

    String key = info.keyStruct.getQualifiedName().toString();
    String val = info.valStruct.getQualifiedName().toString();
    assert key.endsWith(STRUCTURE);
    assert val.endsWith(STRUCTURE);
    key = key.substring(0, key.length() - STRUCTURE.length());
    val = val.substring(0, val.length() - STRUCTURE.length());
    .....
    map.put("keyStruct", info.keyStruct.getQualifiedName().toString());
    map.put("keyEncoder", key + ENCODER);
    map.put("keyDecoder", key + DECODER);
    map.put("valStruct", info.valStruct.getQualifiedName().toString());
    map.put("valEncoder", val + ENCODER);
    map.put("valDecoder", val + DECODER);
    map.put("interface", info.iface.getQualifiedName().toString());

As such, it's not an issue in my use case to abolish Structure and change the annotation to require the key to be a CompositeDecoderFlyweight and the value a MessageDecoderFlyweight. It's a small hack in that we are referring to the decoder when we really want to refer to the type, but it's not major issue if you feel Structure adds complexity to the interface hierarchy or output package directory.

If you like I can submit a PR with JavaDocs for Structure interfaces, or a PR which removes Structure, or you are welcome to remove it if you prefer. I don't mind; just let me know.

mjpt777 commented 8 years ago

Thanks for the clarification. This helps a lot.

I think it would be best to drop the Structure interfaces and keep the Flyweights. This will reduce the complexity and avoid the need for overlapping interfaces.

As a separate point have you considering using the id field as your key to flyweight lookup in the map? It would be more efficient and you could use primitive maps.

If you could do a PR to remove that Structure interfaces then that would be much appreciated. I can then move the remaining interfaces to Agrona to simplify further.

benalexau commented 8 years ago

I'll submit a PR removing it early next week (weekend here now).

Thanks for the map ID suggestion. I usually use primitives everywhere, but in this case the map is an input to the Mustache template library which goes off and writes the resulting Java source code. The map keys are needed in the template file, such as private {{keyEncoder}} keyEncoder;. I started writing the source code out directly within the annotation processor, but moving it to a template made it much easier to incrementally refactor the generated code.

mjpt777 commented 8 years ago

New interfaces are now in Agrona.

benalexau commented 8 years ago

Thanks!