Client libraries for other languages (e.g. Java) support automatic (de)serialization of message data based on a schema (https://pulsar.apache.org/docs/3.1.x/schema-overview/). The Node client does not have this feature yet (https://github.com/apache/pulsar-client-node/issues/242).
In theory it should be possible to deserialize the message data manually, but some serialization formats like Avro require knowing the exact schema that was used for serialization (https://github.com/mtth/avsc/issues/447). To deal with schema evolution, the client needs to know its own compatible schema AND the schema used for serialization.
As far as I understand, the automatic (de)serialization feature of Pulsar solves this problem by keeping a schema registry and tagging each message with the schema version that was used.
If I understand correctly, the Node client provides neither a method to get the schema used for serialization nor a method to get the schema version of a message.
Assuming schema evolution is needed, this makes it impossible to reliably deserialize messages written by a Java client library using an Avro schema.
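To illustrate the problem, here is a minimal sketch. It assumes a toy codec that mimics Avro's binary encoding for flat records with `long` and `string` fields (zigzag varints, length-prefixed UTF-8, fields written in schema order with no names or tags). It is neither avsc nor the Pulsar client; `encode`, `decode`, `resolve`, and both schemas are made up for this example, and `resolve` is only a much simplified version of Avro's schema resolution.

```javascript
// Toy Avro-style codec (assumption: flat records, "long" and "string" fields only).
function encodeLong(n) {
  let z = n >= 0 ? n * 2 : -n * 2 - 1; // zigzag; fine for small integers
  const bytes = [];
  while (z >= 0x80) {
    bytes.push((z % 0x80) | 0x80); // low 7 bits plus continuation bit
    z = Math.floor(z / 0x80);
  }
  bytes.push(z);
  return Buffer.from(bytes);
}

function decodeLong(buf, pos) {
  let z = 0, shift = 1, b;
  do {
    if (pos >= buf.length) throw new Error('unexpected end of data');
    b = buf[pos++];
    z += (b & 0x7f) * shift;
    shift *= 0x80;
  } while (b & 0x80);
  return [z % 2 === 0 ? z / 2 : -(z + 1) / 2, pos];
}

// Fields are written back to back in schema order.
function encode(schema, record) {
  return Buffer.concat(schema.fields.map((f) => {
    if (f.type === 'long') return encodeLong(record[f.name]);
    const utf8 = Buffer.from(record[f.name], 'utf8');
    return Buffer.concat([encodeLong(utf8.length), utf8]);
  }));
}

// Decoding blindly trusts the given schema to describe the byte layout.
function decode(schema, buf) {
  const record = {};
  let pos = 0;
  for (const f of schema.fields) {
    let value;
    [value, pos] = decodeLong(buf, pos);
    if (f.type === 'string') {
      const len = value;
      if (len < 0 || pos + len > buf.length) throw new Error('invalid string length');
      value = buf.toString('utf8', pos, pos + len);
      pos += len;
    }
    record[f.name] = value;
  }
  if (pos !== buf.length) throw new Error('trailing bytes');
  return record;
}

// Simplified resolution: decode with the writer schema, then map onto the
// reader schema by field name, using defaults for fields the writer lacked.
function resolve(writerSchema, readerSchema, buf) {
  const written = decode(writerSchema, buf);
  const record = {};
  for (const f of readerSchema.fields) {
    if (f.name in written) record[f.name] = written[f.name];
    else if ('default' in f) record[f.name] = f.default;
    else throw new Error(`no value or default for field ${f.name}`);
  }
  return record;
}

// Hypothetical schemas: the reader's version added an "email" field.
const writerV1 = { fields: [{ name: 'id', type: 'long' }, { name: 'name', type: 'string' }] };
const readerV2 = { fields: [
  { name: 'id', type: 'long' },
  { name: 'email', type: 'string', default: '' },
  { name: 'name', type: 'string' },
] };

const bytes = encode(writerV1, { id: 7, name: 'Ada' });

try {
  decode(readerV2, bytes); // reader schema alone: "Ada" is read as email, then data runs out
} catch (err) {
  console.log('reader schema only:', err.message);
}
console.log('writer + reader schema:', resolve(writerV1, readerV2, bytes));
```

With only its own schema, the reader misreads the bytes. Here that shows up as an error, but with other schema changes (e.g. two swapped string fields) it would silently produce wrong values. Only when the writer schema is known can the data be mapped correctly.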
I'm new to Pulsar and Avro, so please forgive (and correct) me if my understanding is wrong.
If my understanding is right, I wonder how difficult it would be to add a method to look up the serialization schema for a message.
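To make the request more concrete, this is roughly how I imagine a consumer could use such a lookup. It is only a sketch: `WriterSchemaCache`, `fetchWriterSchema`, and the `schemaVersion` property are hypothetical names that do not exist in pulsar-client-node, and the registry is an in-memory stand-in.

```javascript
// Hypothetical helper: caches writer schemas by schema version so that each
// version is looked up only once, however many messages carry it.
class WriterSchemaCache {
  constructor(fetchWriterSchema) {
    this.fetchWriterSchema = fetchWriterSchema; // assumed: async (version) => schema
    this.cache = new Map();
  }

  // Returns a promise; concurrent calls for the same version share one lookup.
  get(version) {
    if (!this.cache.has(version)) {
      this.cache.set(version, this.fetchWriterSchema(version));
    }
    return this.cache.get(version);
  }
}

// In-memory stand-in for the schema registry (not a real Pulsar API).
const registry = new Map([
  ['v1', { fields: ['id', 'name'] }],
  ['v2', { fields: ['id', 'email', 'name'] }],
]);

let lookups = 0;
const cache = new WriterSchemaCache(async (version) => {
  lookups += 1;
  return registry.get(version);
});

// Stand-in messages; a real message would have to expose its schema version.
const messages = [
  { schemaVersion: 'v1' },
  { schemaVersion: 'v2' },
  { schemaVersion: 'v1' },
];

async function main() {
  for (const msg of messages) {
    const writerSchema = await cache.get(msg.schemaVersion);
    console.log(msg.schemaVersion, 'fields:', writerSchema.fields.join(', '));
  }
  console.log('registry lookups:', lookups);
}

main();
```

The writer schema obtained this way could then be handed to an Avro library together with the consumer's own schema.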