hazelcast / hazelcast-csharp-client

Hazelcast .NET Client
https://hazelcast.com/clients/dotnet/
Apache License 2.0

Compact serialization reading object put by Java throwing "Unknown schema" [API-1801] #764

Closed ph33rtehgd closed 1 year ago

ph33rtehgd commented 1 year ago

I might be jumping the gun on compact serialization, so apologies in advance if it's not ready yet. I've taken the latest develop branch and built the code. I'm attempting to read an object put into a Hazelcast map by a Java process (using compact serialization) from a test .NET client (also using compact serialization). When the .NET client attempts to read the object, it throws an exception stating that the schema is unknown. I've attached a snippet of the exception below. I had assumed that when my .NET client connected to the cluster, it would get whatever schema it needed to read the object in question from the cluster, but it doesn't seem to be doing that.

[image]

The Type information in debugging shows the following: {Name = "Object" FullName = "System.Object"}

I'm unsure why it's trying to deserialize something of type Object.

Here is the snippet of the relevant C# code: [image]

The object written into the map by Java uses only Int32 and String fields. The Java side runs Hazelcast 5.2.1.

Here is my Java serializer that I'm adding to the Java side's compact configuration: [image]

github-actions[bot] commented 1 year ago

Internal Jira issue: API-1801

zpqrtbnk commented 1 year ago

Hey - thanks for the detailed report. It is totally OK to experiment with Compact Serialization now, as we are nearing release time. I am now going to try and reproduce the issue, and will come back to you.

zpqrtbnk commented 1 year ago

I have tried to reproduce the issue on top of the latest master branch and could not reproduce it.

You will find my effort in my issues/issue-764 branch, more specifically in the Issue764.cs file.

I think I have reproduced your case correctly but don't hesitate to correct me if you see a difference. Could it be that you are not running on the latest master branch? Could you try to pull the latest code and run your tests again?

Let me know how we can assist you - getting Compact Serialization right is important to us!

g-donev commented 1 year ago

Hi @ph33rtehgd,

Thanks for getting in touch. Could you please provide more details about your use case?

We would be very interested to learn how you use the client.

Regards, Georgi

ph33rtehgd commented 1 year ago

> Hi @ph33rtehgd,
>
> Thanks for getting in touch. Could you please provide more details about your use case?
>
> We would be very interested to learn how you use the client.
>
> Regards, Georgi

I'm currently in an experimentation phase. Our primary use case is loading data into a cache/map from Java processes to make it available to a .NET UI. The UI will then use continuous-query-style functionality to stay up to date with any new data put into the cache/map. Today we primarily use Apache Geode for this; however, Geode development is being discontinued, so I'm evaluating alternatives.

ph33rtehgd commented 1 year ago

> I have tried to reproduce the issue on top of the latest master branch and could not reproduce it.
>
> You will find my effort in my issues/issue-764 branch, more specifically in the Issue764.cs file.
>
> I think I have reproduced your case correctly but don't hesitate to correct me if you see a difference. Could it be that you are not running on the latest master branch? Could you try to pull the latest code and run your tests again?
>
> Let me know how we can assist you - getting Compact Serialization right is important to us!

Thanks for taking the time to try and replicate the issue I'm facing. I can confirm that I have the latest commits from master (I had pulled again just in case) and I have rebuilt the code after doing so. I feel like maybe I'm missing something fundamental in the configuration somewhere. When testing, I do the following:

  1. Run hz-start.bat (I have Hazelcast Slim 5.2.1 downloaded locally) to spin up the cluster
  2. Run the Java test program to place the Trade entity into the Map (I'm also able to fetch the Trade entity out of the map by the Predicate as well as by a key lookup and it deserializes through the Compact Serializer with no issues)
  3. Run the C# test program to try and retrieve the same Trade entity, and this is where I get tripped up.

Do I need to change any settings on the Hazelcast cluster to enable compact serialization (or schema distribution)? I tried making the classes for the model object and serializer available to the hazelcast.xml config (which would be undesirable if required, but trying to rule things out) and that didn't make a difference.

Is there a way to see what schemas the C# client has received from the cluster just to verify that it is receiving the schema that would have been placed there by the Java client?

zpqrtbnk commented 1 year ago

A bit of background: when the Java test program places the Trade entity into the map, it uses your custom serializer to serialize the Trade object and to generate the associated schema, with an identifier. When the C# test program retrieves the serialized Trade object, it comes with a schema identifier. The C# code detects that the schema is unknown (yet) and retrieves it from the server. That should all work transparently. Obviously, not in your case.

So back to more testing to figure things out. Stay tuned.
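As a rough illustration of the lookup described in this comment, the sketch below caches schemas by identifier and asks the cluster only on a local miss. Everything in it is hypothetical: `SchemaCacheSketch`, `resolve`, and the `fetchFromCluster` function are invented for this example and are not the client's actual types, and a schema is reduced to a plain string.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of "fetch the schema the first time its id is seen".
// None of these names come from the Hazelcast client code.
final class SchemaCacheSketch {
    private final Map<Long, String> known = new ConcurrentHashMap<>();
    private final Function<Long, Optional<String>> fetchFromCluster;

    SchemaCacheSketch(Function<Long, Optional<String>> fetchFromCluster) {
        this.fetchFromCluster = fetchFromCluster;
    }

    // Returns the schema for the id, asking the cluster only on a local miss.
    String resolve(long schemaId) {
        String local = known.get(schemaId);
        if (local != null) {
            return local;
        }
        String fetched = fetchFromCluster.apply(schemaId)
                .orElseThrow(() -> new IllegalStateException("Unknown schema " + schemaId));
        known.put(schemaId, fetched);
        return fetched;
    }

    public static void main(String[] args) {
        // Stand-in for the cluster: it only knows the schema registered under id 42.
        Map<Long, String> cluster = Map.of(42L, "Trade{ID, amount, symbol}");
        SchemaCacheSketch cache =
                new SchemaCacheSketch(id -> Optional.ofNullable(cluster.get(id)));

        System.out.println(cache.resolve(42L)); // fetched once, then served from the cache
        try {
            cache.resolve(7L);                  // an id the cluster has never seen
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, an "Unknown schema" failure means the identifier carried by the data does not match any identifier the reader can resolve, which is why the way identifiers are derived on each side matters.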

ph33rtehgd commented 1 year ago

> A bit of background: when the Java test program places the Trade entity into the map, it uses your custom serializer to serialize the Trade object and to generate the associated schema, with an identifier. When the C# test program retrieves the serialized Trade object, it comes with a schema identifier. The C# code detects that the schema is unknown (yet) and retrieves it from the server. That should all work transparently. Obviously, not in your case.
>
> So back to more testing to figure things out. Stay tuned.

Got it. That's what I had figured was supposed to happen, but was just trying to think of other possibilities. I tried to place this entity into the map using C# (which works), however the Java client is then unable to read the object (same kind of unknown schema ID exception). Based on this it seems like the two clients are identifying the objects with different schema IDs (or at least, when it receives a schema from the other language it doesn't believe that it's a match for its copy of the object). Are there any other attributes that the client uses to match the class to a schema aside from the Type Name?

ph33rtehgd commented 1 year ago

@zpqrtbnk Mystery solved. It does not like me using "ID" as the field name. If I make it all lower case or use "iD" it works fine. In your test case you had used "id" all lower case, which is likely why you didn't see this issue. I saw a comment inside the Schema.cs file that states the following:

// the sorted set of fields, which will be used for fingerprinting - needs to be ordered
// exactly in the same way as Java, which uses Comparator.naturalOrder() i.e. "natural
// order", and good luck finding a definition for this, apart from it being case-sensitive,
// so we're going with whatever seems best in C# and hope it works.

I have a feeling that the issue I faced was related to this. If I change the StringComparer from InvariantCulture to Ordinal on line 146 of this class it continues to work in my case. Obviously I don't know what other implications this might have, but maybe something to consider?
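To make the ordering difference concrete, here is a small Java sketch. It sorts a set of field names once with `Comparator.naturalOrder()` (a case-sensitive comparison of UTF-16 code units) and once with a root-locale `Collator`, which is used here only as a stand-in for a culture-aware comparer such as .NET's InvariantCulture. The field names `amount` and `symbol` are assumptions for illustration; only `ID` comes from the report.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

final class FieldOrderSketch {
    // Java's natural String order: compares UTF-16 code units, so 'I' (73) sorts before 'a' (97).
    static List<String> naturalOrder(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    // Linguistic order via a root-locale Collator (stand-in for a culture-aware comparer).
    static List<String> linguisticOrder(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Collator.getInstance(Locale.ROOT));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> upper = List.of("symbol", "ID", "amount");
        System.out.println(naturalOrder(upper));    // [ID, amount, symbol]
        System.out.println(linguisticOrder(upper)); // [amount, ID, symbol]

        // With an all-lowercase name the two orders agree, which would hide the problem.
        List<String> lower = List.of("symbol", "id", "amount");
        System.out.println(naturalOrder(lower));    // [amount, id, symbol]
        System.out.println(linguisticOrder(lower)); // [amount, id, symbol]
    }
}
```

This matches the observation that `id` and `iD` work while `ID` does not: the two orders only diverge when a name starts with an uppercase letter that the code-unit comparison places ahead of the lowercase names.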

zpqrtbnk commented 1 year ago

Update: the issue has been reproduced and may indeed have to do with the comment you highlighted. Basically, I have reached the same conclusion you have. Will figure it out and then commit a fix. Stay tuned.

zpqrtbnk commented 1 year ago

Update: you were right in your analysis, many thanks for the hints. I have attached a PR which fixes the issue by using Ordinal instead of InvariantCulture, and I have added more tests to compare the ordering between .NET and Java and ensure we end up ordering fields the same way. If you can test your code with the PR applied, that would be great.

Also don't hesitate to share more about your work with @g-donev: he may have a broader vision of Hazelcast's solutions and may be able to help you architect your project.
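The sketch below illustrates why field order feeds into the schema identifier, and why both sides must agree on it. It is not the client's fingerprinting code: CRC32 is swapped in as a simple stand-in hash, and `SchemaIdSketch`, `schemaId`, and the field names `amount` and `symbol` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.zip.CRC32;

final class SchemaIdSketch {
    // Derives an id from the type name plus the field names in the given order.
    // CRC32 is only a stand-in for a real fingerprint function.
    static long schemaId(String typeName, List<String> fieldNames,
                         Comparator<? super String> order) {
        List<String> sorted = new ArrayList<>(fieldNames);
        sorted.sort(order);
        CRC32 crc = new CRC32();
        crc.update((typeName + ":" + String.join(",", sorted))
                .getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        List<String> fields = List.of("symbol", "ID", "amount");
        Comparator<String> ordinal = Comparator.naturalOrder();

        long writer = schemaId("Trade", fields, ordinal);
        // Reader using a linguistic order (stand-in for a culture-aware comparer).
        long readerCulture = schemaId("Trade", fields, Collator.getInstance(Locale.ROOT));
        // Reader using the same code-unit order as the writer.
        long readerOrdinal = schemaId("Trade", fields, ordinal);

        System.out.println("culture-aware reader agrees: " + (writer == readerCulture));
        System.out.println("ordinal reader agrees:       " + (writer == readerOrdinal));
    }
}
```

A cross-language test in this spirit would feed the same field names to both implementations and check that the sorted lists, and therefore the derived identifiers, are identical.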

zpqrtbnk commented 1 year ago

Hello. As we have merged the PR, this issue has been closed by GitHub.

It means that the fix is now in the master branch.

Of course if this was closed too fast and you see anything wrong, please reach out to us!

ph33rtehgd commented 1 year ago

@zpqrtbnk I've taken the latest master that includes the PR changes and noticed some strange behavior. There were no errors when reading objects; however, I'd get the wrong data when reading a particular field. For example, I might ReadString("ID") but I'd get the value of a completely unrelated field. After some digging I've found one other set of places where StringComparer.Ordinal is needed. On lines 168, 180 and 200 of Schema.cs we need to pass StringComparer.Ordinal, as in the attached screenshot; otherwise the sort is culture-based, since that is the default behavior of the CompareTo used there.

After I've made this change things seem to be working smoothly! Hope this helps.

[image]
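The "wrong field" symptom can be reproduced in a toy model where each field's slot is its position in the sorted field list. This is a simplified sketch, not the client's actual record layout: `FieldSlotSketch`, its methods, the field names `amount` and `symbol`, and all values are hypothetical, and a root-locale `Collator` stands in for a culture-aware comparison.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class FieldSlotSketch {
    // Assigns each field a slot equal to its position in the sorted field list.
    static Map<String, Integer> slots(List<String> fieldNames,
                                      Comparator<? super String> order) {
        List<String> sorted = new ArrayList<>(fieldNames);
        sorted.sort(order);
        Map<String, Integer> slots = new HashMap<>();
        for (int i = 0; i < sorted.size(); i++) {
            slots.put(sorted.get(i), i);
        }
        return slots;
    }

    // Lays the values out in slot order, as a writer would.
    static String[] write(Map<String, String> values, Map<String, Integer> slots) {
        String[] record = new String[slots.size()];
        values.forEach((field, value) -> record[slots.get(field)] = value);
        return record;
    }

    // Looks a field up through the reader's own slot table.
    static String read(String[] record, Map<String, Integer> slots, String field) {
        return record[slots.get(field)];
    }

    public static void main(String[] args) {
        List<String> fields = List.of("ID", "amount", "symbol");
        Map<String, String> trade = Map.of("ID", "T-1", "amount", "100", "symbol", "HZ");

        Map<String, Integer> writerSlots = slots(fields, Comparator.naturalOrder());
        Map<String, Integer> cultureSlots = slots(fields, Collator.getInstance(Locale.ROOT));

        String[] record = write(trade, writerSlots);
        System.out.println(read(record, writerSlots, "ID"));  // prints T-1
        System.out.println(read(record, cultureSlots, "ID")); // prints 100, the amount value
    }
}
```

No exception is raised in the mismatched case, which is consistent with the report: the read succeeds but returns a neighbouring field's value.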

zpqrtbnk commented 1 year ago

Investigating.

zpqrtbnk commented 1 year ago

Reproduced.

zpqrtbnk commented 1 year ago

Confirmed, fixed in PR - many thanks for your precious help!

ph33rtehgd commented 1 year ago

Glad I was able to assist!