Closed larry88 closed 6 months ago
/assign @binbinlv please help to try with pymilvus to check whether it is a sdk issue or not?
/unassign
/assign @binbinlv please help to try with pymilvus to check whether it is a sdk issue or not?
/unassign
Ok, working.
Using pymilvus 2.1.3, milvus 2.1.4, the result is right:
>>>
>>> from pymilvus import CollectionSchema, FieldSchema
>>> from pymilvus import Collection
>>> from pymilvus import connections
>>> from pymilvus import DataType
>>> from pymilvus import Partition
>>> from pymilvus import utility
>>> connections.connect()
>>> int64_field = FieldSchema(name="int64", dtype=DataType.INT64, is_primary=True)
>>> int64_field_1 = FieldSchema(name="int641", dtype=DataType.INT64)
>>> dim=64
>>>
>>> float_vector = FieldSchema(name="float_vector", dtype=DataType.FLOAT_VECTOR, dim=dim)
>>> schema = CollectionSchema(fields=[int64_field, int64_field_1, float_vector])
>>> collection = Collection("test_search_collection_binbin_0", schema=schema)
>>> import numpy as np
>>> res = collection.insert([[9999999999999999], [5625300123280813090], [[0.08090433, 0.19154754, 0.16858263, 0.027101958, 0.07229418, 0.15223257, -0.024227709, 0.13302892, 0.05951315, 0.03572949, -0.015721956, -0.21992287, 0.08134472, 0.18640009, -0.09814235, -0.11117617, 0.10464557, -0.092037976, -0.19489805, -0.069008306, -0.039415136, -0.17841195, 0.076126315, 0.031378396, 0.22680397, 0.045089707, 0.12307317, 0.06711619, 0.15067382, -0.213569, 0.066602595, -0.021743167, -0.2727193, -0.112709574, 0.09504322, 0.02386695, 0.04574049, -0.055642836, -0.16812482, -0.051256, -0.11399734, 0.29519975, 0.109542266, 0.18452083, 0.05543076, -0.064969495, -0.14457555, -0.034600936, 0.045484997, -0.15677887, -0.12983392, 0.20921704, -0.049788076, 0.050687622, -0.23369887, -0.022488454, 0.06089106, 0.14699098, -0.08140416, -0.008949298, -0.14867777, 0.07415456, -0.0027948048, 0.0060837376]] ])
>>>
>>>
>>> res
(insert count: 1, delete count: 0, upsert count: 0, timestamp: 437246367618301954, success count: 1, err count: 0)
>>> res.primary_keys
[9999999999999999]
Using milvus 2.1.4, restAPI, this issue is reproduced: id from 9999999999999999 to 10000000000000000
connect
curl ***:9091/api/v1/health
{"status":"ok"}
create collection:
curl -X 'POST' \
'http://***:9091/api/v1/collection' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"collection_name": "seller_tag",
"schema": {
"autoID": false,
"description": "Test book search",
"fields": [
{
"name": "kid",
"description": "book id",
"is_primary_key": true,
"autoID": false,
"data_type": 5
},
{
"name": "seller_id",
"description": "count of words",
"is_primary_key": false,
"data_type": 5
},
{
"name": "vector",
"description": "embedded vector of book introduction",
"data_type": 101,
"is_primary_key": false,
"type_params": [
{
"key": "dim",
"value": "64"
}
]
}
],
"name": "seller_tag"
}
}'
{}
insert:
curl -X 'POST' 'http://***:9091/api/v1/entities' -H 'accept: application/json' -H 'Content-Type: application/json' \
-d '{
"collection_name": "seller_tag",
"fields_data": [{ "field_name":"kid","type":5,"field":[9999999999999999] },{ "field_name":"seller_id","type":5,"field":[5625300123280813090] },{ "field_name":"vector","type":101,"field":[[0.08090433, 0.19154754, 0.16858263, 0.027101958, 0.07229418, 0.15223257, -0.024227709, 0.13302892, 0.05951315, 0.03572949, -0.015721956, -0.21992287, 0.08134472, 0.18640009, -0.09814235, -0.11117617, 0.10464557, -0.092037976, -0.19489805, -0.069008306, -0.039415136, -0.17841195, 0.076126315, 0.031378396, 0.22680397, 0.045089707, 0.12307317, 0.06711619, 0.15067382, -0.213569, 0.066602595, -0.021743167, -0.2727193, -0.112709574, 0.09504322, 0.02386695, 0.04574049, -0.055642836, -0.16812482, -0.051256, -0.11399734, 0.29519975, 0.109542266, 0.18452083, 0.05543076, -0.064969495, -0.14457555, -0.034600936, 0.045484997, -0.15677887, -0.12983392, 0.20921704, -0.049788076, 0.050687622, -0.23369887, -0.022488454, 0.06089106, 0.14699098, -0.08140416, -0.008949298, -0.14867777, 0.07415456, -0.0027948048, 0.0060837376]] }],
"num_rows": 1
}'
{"status":{},"IDs":{"IdField":{"IntId":{"data":[10000000000000000]}}},"succ_index":[0],"insert_cnt":1,"timestamp":437249762235777026}
Reproduced only in restAPI, so maybe the sdk issue, @haorenfsa could you help to have a look please? Thanks.
/assign @haorenfsa /unassign
It's a bug due to careless golang interface conversion in parsing request. It's not an overflow but a precision lost when converting from float64 to int64.
here is a simple proof test of it:
func Test_convert(t *testing.T) {
type val struct {
A interface{} `json:"a"`
}
raw := `{"a": 9999999999999999}`
var v val
err := json.Unmarshal([]byte(raw), &v)
assert.Nil(t, err)
assert.Equal(t, int64(9999999999999999), int64(v.A.(float64)))
}
which outputs
Error: Not equal:
expected: 9999999999999999
actual : 10000000000000000
Test: Test_convert
Same problem may also apply to other sdks, whose language uses float64 for number types. eg: node sdk with typescript. Can you help to test & prove it?
Same problem may also apply to other sdks, whose language uses float64 for number types. eg: node sdk with typescript. Can you help to test & prove it?
Unmarshal
So json unmarshal treat all data types as float?
So json unmarshal treat all data types as float?
all number
data, yes, as well as many other major json libs
FYI It's stated in the RFC standard of JSON: https://datatracker.ietf.org/doc/html/rfc8259
This specification allows implementations to set limits on the range
and precision of numbers accepted. Since software that implements
IEEE 754 binary64 (double precision) numbers [[IEEE754](https://datatracker.ietf.org/doc/html/rfc8259#ref-IEEE754)] is generally
available and widely used, good interoperability can be achieved by
implementations that expect no more precision or range than these
provide, in the sense that implementations will approximate JSON
numbers within the expected precision. A JSON number such as 1E400
or 3.141592653589793238462643383279 may indicate potential
interoperability problems, since it suggests that the software that
created it expects receiving software to have greater capabilities
for numeric magnitude and precision than is widely available.
Note that when such software is used, numbers that are integers and
are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
sense that implementations will agree exactly on their numeric
values.
@larry88 thanks again for this feedback. It really saved us from many troubles to come.
Same problem may also apply to other sdks, whose language uses float64 for number types. eg: node sdk with typescript. Can you help to test & prove it?
@nameczz Yes, same problem using Node SDK. insert: (id) 9999999999999999 return: (id) 10000000000000000 node.js:
const {MilvusClient} = require("@zilliz/milvus2-sdk-node")
const address = "***:19530";
const milvusClient = new MilvusClient(address);
const params = {
collection_name: "book",
description: "Test book search",
fields: [
{
name: "book_id",
data_type: 5, //DataType.Int64
is_primary_key: true,
description: "",
},
{
name: "book_id_1",
data_type: 5, //DataType.Int64
description: "",
},
{
name: "book_intro",
description: "",
data_type: 101, // DataType.FloatVector
type_params: {
dim: "64",
},
},
],
};
(async () => {
await milvusClient.collectionManager.createCollection(params);
const data = [{"book_id":9999999999999999,"book_id_1":5625300123280813090, "book_intro":[0.08090433, 0.19154754, 0.16858263, 0.027101958, 0.07229418, 0.15223257, -0.024227709, 0.13302892, 0.05951315, 0.03572949, -0.015721956, -0.21992287, 0.08134472, 0.18640009, -0.09814235, -0.11117617, 0.10464557, -0.092037976, -0.19489805, -0.069008306, -0.039415136, -0.17841195, 0.076126315, 0.031378396, 0.22680397, 0.045089707, 0.12307317, 0.06711619, 0.15067382, -0.213569, 0.066602595, -0.021743167, -0.2727193, -0.112709574, 0.09504322, 0.02386695, 0.04574049, -0.055642836, -0.16812482, -0.051256, -0.11399734, 0.29519975, 0.109542266, 0.18452083, 0.05543076, -0.064969495, -0.14457555, -0.034600936, 0.045484997, -0.15677887, -0.12983392, 0.20921704, -0.049788076, 0.050687622, -0.23369887, -0.022488454, 0.06089106, 0.14699098, -0.08140416, -0.008949298, -0.14867777, 0.07415456, -0.0027948048, 0.0060837376]}]
const mq = await milvusClient.dataManager.insert({
collection_name: "book",
fields_data: data,
});
console.log(mq);
await milvusClient.collectionManager.dropCollection({ collection_name: "book",});
console.log("Dropped collction");
console.log(mq.IDs.int_id.data[0]);
})();
Bulkinsert also has the same problem. I have submitted a commit for bulkinsert: https://github.com/milvus-io/milvus/pull/20506/files
The encoding/json lib uses float64 for all numeric values by default. We can ask the json lib to return string value instead of float64 value in this way:
dec := json.NewDecoder(r)
dec.UseNumber()
The json decoder will return a string type (json.Number) for us, then we can convert the string type value to int64 value in this way:
value, err := strconv.ParseInt(string(obj.(json.Number)), 10, 64)
/assign @haorenfsa would you like to fix it?
/assign @haorenfsa would you like to fix it?
Fix in #20468, need review
/reopen
solved for:
not solved for:
@haorenfsa: Reopened this issue.
@haorenfsa any updates?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen
.
keep it active
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen
.
/reopen
solved for:
- [x] bulkinsert
- [x] restapi
not solved for:
- [ ] node sdk
- [ ] attu
@yanliang567 still same as above
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen
.
/reopen
@haorenfsa: Reopened this issue.
The issue with the primary key precision in Milvus is identified as a bug in the REST API and Node SDK due to the handling of int64 fields. As noted, the problem arises from the JSON unmarshaling process, which treats numeric values as float64, leading to precision loss.
For immediate action:
A permanent fix is in progress, and the relevant pull requests and updates will address this issue in upcoming releases. Please keep an eye on the release notes for version 2.2.13 for the resolution.
Thank you for your patience and contributions to improving Milvus.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen
.
@haorenfsa was it fixed?
nope
related to #27856
/assign @PowderLi /assign @shanghaikid
@yanliang567 current status:
solved for:
not solved for:
/assign @binbinlv
It works well in restful v2 api
method: post,
url: http://10.104.26.106:19530/v2/vectordb/entities/insert,
cost time: 0.1798231601715088,
header: {'Content-Type': 'application/json', 'Authorization': 'Bearer root:Milvus', 'Accept-Type-Allow-Int64': 'true', 'RequestId': 'dbc24f8a-d6cf-11ee-9db0-b29c4acc8bb7'},
payload: {"collectionName": "test_collection_2024_02_29_14_57_51_418020wtbZttvu", "data": [{"book_id": 9999999999999999, "user_id": 0, "word_count": 0, "book_describe": "book_0", "text_emb": [0.634495385666520...61772868946]}, {"book_id": 10000000000000000, "user_id": 1, "word_count": 1, "book_describe": "book_1", "text_emb": [0.4500510428322396, 0.27717461703918467, 0.06460117570971208, 0.846436635910638]}]},
response: {"code":200,"data":{"insertCount":2,"insertIds":[9999999999999999,10000000000000000]}}
Using milvus "master-20240229-dcfc3531-amd64" restAPI V1, this issue is fixed too:
result: insert 9999999999999999:
{}{"status":{},"IDs":{"IdField":{"IntId":{"data":[9999999999999999]}}},"succ_index":[0],"insert_cnt":1,"timestamp":448054154404823044}
insert 10000000000000000:
{}{"status":{},"IDs":{"IdField":{"IntId":{"data":[10000000000000000]}}},"succ_index":[0],"insert_cnt":1,"timestamp":448054226416041987}
@yanliang567 current status:
solved for:
not solved for:
https://github.com/milvus-io/milvus-sdk-node/pull/279
@haorenfsa node system doesn't support int64 natively, I've added one solution which is using long.js to insert or resolve this data-type.
/unassign
https://github.com/milvus-io/milvus-sdk-node/pull/279 @haorenfsa node system doesn't support int64 natively, I've added one solution which is using long.js to insert or resolve this data-type.
@binbinlv please help verify
@haorenfsa node system doesn't support int64 natively, I've added one solution which is using long.js to insert or resolve this data-type.
I think there is no need to test node-sdk now, node system doesn't support int64, so there will be issues for the following code, insert "9999999999999999" using nodesdk, and it returns 10000000000000000. And if tests the solution, I found that @shanghaikid has already add the test case in the repo.
(async () => {
await milvusClient.collectionManager.createCollection(params);
const data = [{"book_id":9999999999999999,"book_id_1":5625300123280813090, "book_intro":[0.08090433, 0.19154754, 0.16858263, 0.027101958, 0.07229418, 0.15223257, -0.024227709, 0.13302892, 0.05951315, 0.03572949, -0.015721956, -0.21992287, 0.08134472, 0.18640009, -0.09814235, -0.11117617, 0.10464557, -0.092037976, -0.19489805, -0.069008306, -0.039415136, -0.17841195, 0.076126315, 0.031378396, 0.22680397, 0.045089707, 0.12307317, 0.06711619, 0.15067382, -0.213569, 0.066602595, -0.021743167, -0.2727193, -0.112709574, 0.09504322, 0.02386695, 0.04574049, -0.055642836, -0.16812482, -0.051256, -0.11399734, 0.29519975, 0.109542266, 0.18452083, 0.05543076, -0.064969495, -0.14457555, -0.034600936, 0.045484997, -0.15677887, -0.12983392, 0.20921704, -0.049788076, 0.050687622, -0.23369887, -0.022488454, 0.06089106, 0.14699098, -0.08140416, -0.008949298, -0.14867777, 0.07415456, -0.0027948048, 0.0060837376]}]
const mq = await milvusClient.dataManager.insert({
collection_name: "book",
fields_data: data,
});
console.log(mq);
await milvusClient.collectionManager.dropCollection({ collection_name: "book",});
console.log("Dropped collction");
console.log(mq.IDs.int_id.data[0]);
})();
@shanghaikid by the way, is attu's code updated to use the loaderOptions ?
Attu is not updated. it doesn't have a display issue, by default, all int64 data will be converted to string type.
Nothing is Impossible to Willing Heart!!!
On Fri, Mar 8, 2024 at 11:12 AM shaoyue @.***> wrote:
@shanghaikid https://github.com/shanghaikid by the way, is attu's code updated to use the loaderOptions ?
— Reply to this email directly, view it on GitHub https://github.com/milvus-io/milvus/issues/20415#issuecomment-1984968624, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABNFWZ3KQNNBZFVJRHKHG3YXEUBZAVCNFSM6AAAAAAR2KMYCGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSOBUHE3DQNRSGQ . You are receiving this because you were mentioned.Message ID: @.***>
@yanliang567 ok to close, all solved
/assign @yanliang567
/assign @zhuwenxing could you please help to verify the fixes /unassign
Is there an existing issue for this?
Environment
Current Behavior
when I insert data, the key number is less than 64 bits, but it will overflow
Insert number: 9999999999999999 Max Int64: 9223372036854775807
Expected Behavior
The inserted primary key number is equal with the returned result
Steps To Reproduce
Milvus Log
no
Anything else?
no