Motivation
With the Pulsar JSON schema, the validity of the data cannot be verified when it is serialized. As a result, a topic can contain data that is incompatible with its schema.
When this connector processes such messages, the call Object value = record.getField(field); may return null for value:
https://github.com/streamnative/pulsar-io-cloud-storage/blob/b2b28ddc60c83b69421bd8e03fe9524c61deb2b4/src/main/java/org/apache/pulsar/io/jcloud/format/JsonFormat.java#L134-L148
For the JSON format in cloud storage, however, schema compatibility is not required, so the original JSON data can be sent to cloud storage directly.
Modifications
Read JSON directly from the original data when formatType=json.
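To make the difference concrete, here is a minimal sketch of the two read paths. It is not the connector's code: the class JsonPassthroughSketch and the helpers readBySchema and writeRaw are hypothetical names, the schema-driven lookup is reduced to a plain Map lookup, and the newline-delimited output is an assumption about the target layout. It uses only the Java standard library.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the connector.
final class JsonPassthroughSketch {

    // Schema-driven path, simplified: every field named by the schema is looked up
    // in the decoded record. A field the payload does not carry comes back as null.
    static Map<String, Object> readBySchema(List<String> schemaFields, Map<String, Object> decoded) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (String field : schemaFields) {
            row.put(field, decoded.get(field)); // null when data and schema disagree
        }
        return row;
    }

    // Pass-through path: ignore the schema and emit each payload unchanged,
    // one JSON document per line (assumed output layout).
    static byte[] writeRaw(List<byte[]> payloads) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] payload : payloads) {
            out.write(payload, 0, payload.length);
            out.write('\n');
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The schema declares "name", but the message only carries "id".
        Map<String, Object> decoded = new LinkedHashMap<>();
        decoded.put("id", 1);
        System.out.println(readBySchema(List.of("id", "name"), decoded));

        byte[] payload = "{\"id\":1}".getBytes(StandardCharsets.UTF_8);
        System.out.print(new String(writeRaw(List.of(payload)), StandardCharsets.UTF_8));
    }
}
```

The pass-through path never consults the schema, so a mismatch between data and schema cannot produce null fields in the output; the trade-off is that malformed payloads are written as they are.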
Verifying this change
Add testJsonIgnoreSchemaRead to cover it.
Documentation
Check the box below.
Need to update docs?
[ ] doc-required
(If you need help on updating docs, create a doc issue)
[ ] no-need-doc
(Please explain why)
[x] doc
(If this PR contains doc changes)