milvus-io / milvus-sdk-java

Java SDK for Milvus.
https://milvus.io
Apache License 2.0

Dimension Mismatch Error in BinaryVector Insertion in Milvus SDK v2.4.x #1147

Closed Yiling1f closed 3 weeks ago

Yiling1f commented 4 weeks ago

Description

In Milvus SDK v2.4.x (Java), inserting BinaryVector data of the correct dimension (e.g., 8) causes a dimension mismatch error, even when the ByteBuffer length matches the schema's dimension setting. The error message returned is:

Incorrect dimension for field 'vector': the no.0 vector's dimension: 0 is not equal to field's dimension: 8

Debugging shows that this issue stems from the SDK's use of ByteBuffer.position() in the calculateBinVectorDim method, which incorrectly retrieves the current byte position instead of the full byte length. This causes real_dim to be computed as 0, leading to an unnecessary dimension mismatch exception.

Steps to Reproduce

1. Create a collection with a BinaryVector field, setting the field's dimension to 8.
2. Attempt to insert the following data:

    // milvusClient is an already-connected client instance.
    // The (byte) cast is required: 255 does not fit in a signed Java byte literal.
    ByteBuffer buffer = ByteBuffer.wrap(new byte[]{(byte) 255});
    List<InsertParam.Field> fields = new ArrayList<>();
    fields.add(new InsertParam.Field("vector", Collections.singletonList(buffer)));
    InsertParam insertParam = InsertParam.newBuilder()
            .withCollectionName("test")
            .withFields(fields)
            .build();
    R<MutationResult> insertResult = milvusClient.insert(insertParam);

3. Observe the dimension mismatch error returned by the SDK.

Suggested Fix

Pass v.limit() instead of v.position() to calculateBinVectorDim when determining the byte length of the ByteBuffer, so that the dimension is calculated correctly. For example:

    int real_dim = calculateBinVectorDim(dataType, v.limit());
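To illustrate why the choice between position() and limit() matters here, the following is a minimal standalone sketch. It does not use the SDK: dimFromByteCount is a hypothetical stand-in for the binary-vector dimension calculation, assuming only that a binary vector stores 8 dimensions per byte.

```java
import java.nio.ByteBuffer;

public class BinaryDimSketch {
    // Hypothetical helper (not the SDK's calculateBinVectorDim):
    // assumes a binary vector packs 8 dimensions into each byte.
    static int dimFromByteCount(int byteCount) {
        return byteCount * 8;
    }

    public static void main(String[] args) {
        // Same construction as in the reproduction steps: one byte, wrapped.
        ByteBuffer v = ByteBuffer.wrap(new byte[]{(byte) 255});

        // A freshly wrapped buffer has position 0, so this yields 0.
        System.out.println("dim via position(): " + dimFromByteCount(v.position()));

        // limit() is the array length for a wrapped buffer, so this yields 8.
        System.out.println("dim via limit():    " + dimFromByteCount(v.limit()));
    }
}
```

With the one-byte buffer above, the position()-based value is 0 and the limit()-based value is 8, which matches the numbers in the reported error message.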

Thank you to the Milvus team for your attention to this issue.

xiaofan-luan commented 4 weeks ago

@Yiling1f

Good catch. Could you help fix that?

Yiling1f commented 4 weeks ago

Thank you for the response! I’d be happy to help with the fix.

yhmo commented 3 weeks ago

@Yiling1f Thank you for pointing out the bug and for the PR. I double-checked every place that calls the position() method, replaced those calls with limit() in this PR, and added more test code to cover this case: https://github.com/milvus-io/milvus-sdk-java/pull/1154

For ByteBuffer objects created with ByteBuffer.wrap(), position() is zero. The previous test cases created their ByteBuffer objects with ByteBuffer.allocate() followed by ByteBuffer.put(), so position() was always equal to limit() and the tests did not expose the problem.
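The difference between the two ways of building a buffer can be checked with plain java.nio, independent of the SDK. This is a small sketch; the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;

public class BufferStateSketch {
    // Wrapping an existing array: nothing has been written through the buffer,
    // so position stays at 0 while limit equals the array length.
    static ByteBuffer wrapped() {
        return ByteBuffer.wrap(new byte[]{(byte) 0xFF});
    }

    // Allocating and then calling put(): each relative put advances position
    // by one byte, so after filling the buffer position equals limit.
    static ByteBuffer filled() {
        ByteBuffer b = ByteBuffer.allocate(1);
        b.put((byte) 0xFF);
        return b;
    }

    public static void main(String[] args) {
        ByteBuffer w = wrapped();
        ByteBuffer f = filled();
        System.out.println("wrap:         position=" + w.position() + " limit=" + w.limit());
        System.out.println("allocate+put: position=" + f.position() + " limit=" + f.limit());
    }
}
```

For the wrapped buffer this reports position=0 and limit=1. For the allocate-and-put buffer both values are 1, which is why code that reads position() appears to work when buffers are built that way.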

I will close your PR since there are more lines to change. Thank you for your effort!

Yiling1f commented 3 weeks ago

Very glad that identifying the issue could help the project! It's a bit of a pity that I couldn't contribute this time (I missed out on some company performance QAQ), but I hope there will be more opportunities to make a contribution in the future. Congratulations!