Closed tuteng closed 4 years ago
@codelipenghui @congbobo184
When you find the cause of the bug, could you please open a pull request to fix it? Thank you.
What is the producer's schema?
"{\n" +
" \"type\": \"record\",\n" +
" \"name\": \"Test\",\n" +
" \"fields\": [\n" +
" {\n" +
" \"name\": \"id\",\n" +
" \"type\": [\n" +
" \"null\",\n" +
" \"int\"\n" +
" ]\n" +
" },\n" +
" {\n" +
" \"name\": \"name\",\n" +
" \"type\": [\n" +
" \"null\",\n" +
" \"string\"\n" +
" ]\n" +
" }\n" +
" ]\n" +
" }"
You should change the name to Foo2 and add the namespace, so that the definition matches the output of
ReflectData.get().getSchema(Foo2.class).toString().
The namespace should be the one shown in that output.
@congbobo184 the question is: if a user specifies a POJO class in the SchemaDefinition, it should return a POJO, not a GenericData$Record. This is something I don't quite understand when looking into this issue. Since you introduced SchemaDefinition, would you mind taking a look at this?
@tuteng The AvroSchema is defined with withJsonDef, so the JSON definition is not null. The current logic generates the schema from the withJsonDef definition first. Because that definition has no namespace and its name is not Foo2, decoding produces a GenericData$Record. I think we can change this logic to prefer withPojo, or allow only one of withJsonDef and withPojo to be used.
I'm getting the same error with a pulsar function where the input is a POJO from an Avro encoded topic. The function works fine if the input topic is JSON encoded.
The problem is that we need to specify the correct namespace and name in the JSON definition. I have written a demo in Java:
package org.apache.pulsar.client.impl;

import org.apache.pulsar.client.api.schema.SchemaDefinition;
import org.apache.pulsar.client.impl.schema.AvroSchema;
import org.testng.Assert;
import org.testng.annotations.Test;
import java.util.Objects;

public class SchemaTest {
    @Test
    public void test() {
        SchemaDefinition<User> sd = SchemaDefinition.<User>builder().withJsonDef("{\n" +
                " \"type\": \"record\",\n" +
                " \"name\": \"User\",\n" +
                " \"namespace\": \"org.apache.pulsar.client.impl.SchemaTest\",\n" +
                " \"fields\": [\n" +
                " {\n" +
                " \"name\": \"id\",\n" +
                " \"type\": [\n" +
                " \"null\",\n" +
                " \"int\"\n" +
                " ]\n" +
                " },\n" +
                " {\n" +
                " \"name\": \"name\",\n" +
                " \"type\": [\n" +
                " \"null\",\n" +
                " \"string\"\n" +
                " ]\n" +
                " }\n" +
                " ]\n" +
                " }").build();
        AvroSchema<User> schema = AvroSchema.of(sd);
        User user = new User();
        user.setId(1);
        user.setName("penghui");
        byte[] encoded = schema.encode(user);
        User decoded = schema.decode(encoded);
        Assert.assertEquals(user, decoded);
    }

    private static class User {
        private Integer id;
        private String name;

        public Integer getId() {
            return id;
        }

        public void setId(Integer id) {
            this.id = id;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            User user = (User) o;
            return Objects.equals(id, user.id) &&
                    Objects.equals(name, user.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, name);
        }
    }
}
Avro uses the full name (namespace + "." + name) to look up the Class used for decoding data. If the class is not found, it falls back to GenericData. Here is the source code of SpecificData in Avro. SpecificData is a subclass of GenericData, so if the class is not found it calls super.newRecord, which produces a GenericRecord.
public Object newRecord(Object old, Schema schema) {
    Class c = getClass(schema);
    if (c == null)
        return super.newRecord(old, schema); // punt to generic
    return (c.isInstance(old) ? old : newInstance(c, schema));
}
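The fallback described above can be illustrated without Avro. The following is only a minimal sketch using the plain JDK, not Avro's implementation: RecordLookupSketch, its nested User class, and the "org.example.missing" namespace are hypothetical names made up for illustration, and a Map stands in for GenericRecord. It resolves a class from namespace + "." + name and returns a generic container when the lookup fails.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RecordLookupSketch {

    /** Hypothetical POJO standing in for a user class such as Foo2. */
    public static class User {
        public Integer id;
        public String name;
    }

    /** Builds the full name as namespace + "." + name (just the name if there is no namespace). */
    public static String fullName(String namespace, String name) {
        return (namespace == null || namespace.isEmpty()) ? name : namespace + "." + name;
    }

    /** Returns an instance of the named class, or a generic Map if the class cannot be resolved. */
    public static Object newRecord(String namespace, String name) {
        try {
            Class<?> c = Class.forName(fullName(namespace, name));
            return c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Class not found (or not instantiable): fall back to a generic record.
            return new LinkedHashMap<String, Object>();
        }
    }

    public static void main(String[] args) {
        // Split the real binary name of User into namespace and name.
        String binaryName = User.class.getName();
        int dot = binaryName.lastIndexOf('.');
        String namespace = dot < 0 ? "" : binaryName.substring(0, dot);
        String name = binaryName.substring(dot + 1);

        // Matching namespace and name: the lookup yields the POJO type.
        System.out.println(newRecord(namespace, name) instanceof User);
        // A name that does not resolve to a class: the lookup yields the generic fallback.
        System.out.println(newRecord("org.example.missing", "Test") instanceof Map);
    }
}
```

In this sketch, a schema named Test resolves to no class and yields the generic container, which mirrors why the consumer receives a GenericData$Record instead of the POJO.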
@codelipenghui - I think this issue has been seen when an Avro POJO is used in Pulsar Functions. It might be worth checking why Pulsar Functions would encounter this issue.
@sijie Ok, I will take a look.
@codelipenghui @sijie
For testing, I have created a source connector that produces dummy users every second and a function that only appends "!!!" to the name of each user: https://github.com/Antti-Kaikkonen/PulsarPojoTest. When I run both the source connector and the function, I can observe the error in the log file /tmp/functions/public/default/pojo-function/pojo-function-0.log. I'm running Pulsar 2.5.0 standalone with OpenJDK 11.
Based on @codelipenghui's comment, I think it might be a class loader issue.
/cc @gaoran10 please also help take a look at @Antti-Kaikkonen's comment.
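To make the class loader hypothesis concrete, here is a minimal JDK-only sketch. It is not taken from Pulsar or Avro: ClassLoaderVisibilitySketch, its User class, and canSee are hypothetical names. It shows that a class visible to the loader that defined it can be invisible to another loader. If the decoder resolved the POJO through a loader that cannot see the function's classes, the lookup would fail and fall back to a generic record in the same way.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClassLoaderVisibilitySketch {

    /** Hypothetical POJO standing in for the function's input type. */
    public static class User { }

    /** Returns true if the given loader can resolve the class name. */
    public static boolean canSee(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String name = User.class.getName();
        ClassLoader own = User.class.getClassLoader();
        // A loader with no URLs and a null parent only sees bootstrap (JDK) classes.
        try (URLClassLoader isolated = new URLClassLoader(new URL[0], null)) {
            System.out.println("own loader sees User: " + canSee(own, name));
            System.out.println("isolated loader sees User: " + canSee(isolated, name));
        }
    }
}
```

Whether this is what actually happens inside the function runtime is an assumption that would need to be confirmed against the Pulsar Functions code.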