Open sunby opened 3 days ago
insert strings longer than 65530
Is this referring to the length of the entire string or the length of a single word?
It's the constant MAX_TOKEN_LEN in tantivy, so I think it refers to a single token (word), not the entire string.
I will run some tests to check how this affects queries that use the inverted index.
/assign
How can a token be that long?
We do need to tune the max varchar length. Is there anything stopping us from increasing varchar to 256K or 1M?
/assign @sunby /unassign
We tested the inverted index with a string of length 65535 and this warning occurred.
I wrote a unit test to verify it. Strings longer than 65530 cannot be searched because tantivy drops them.
We use the "raw" tokenizer, which means tantivy does no tokenization, so the entire string is indexed as a single token.
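The behavior described above can be sketched roughly as follows. This is a minimal Python model, not tantivy's actual code: `raw_tokenize` and `index_tokens` are hypothetical helper names, the limit of 65530 is the value quoted in this issue, and measuring the length in UTF-8 bytes is an assumption.

```python
from typing import List

# Limit quoted in this issue; assumed here to be counted in UTF-8 bytes.
MAX_TOKEN_LEN = 65530


def raw_tokenize(text: str) -> List[str]:
    """Model of a "raw" tokenizer: no splitting, the whole string is one token."""
    return [text]


def index_tokens(text: str, max_token_len: int = MAX_TOKEN_LEN) -> List[str]:
    """Return the tokens that would be kept; over-long tokens are silently skipped."""
    kept = []
    for token in raw_tokenize(text):
        if len(token.encode("utf-8")) > max_token_len:
            # In the scenario from this thread, only a warning is logged
            # and the token never reaches the inverted index.
            continue
        kept.append(token)
    return kept


if __name__ == "__main__":
    print(len(index_tokens("a" * 65530)))  # value at the limit is kept
    print(len(index_tokens("a" * 65535)))  # value over the limit is dropped
```

Under these assumptions, a 65535-character value produces no indexed token at all, which matches the observation that such strings cannot be found through the inverted index.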
This seems to be a non-blocking issue.
Is there anything blocking us from growing the size of varchar to 256K? For example, do we store a size somewhere using only a small number of bits?
Current Behavior
If you insert strings longer than 65530, Milvus does not return any warning or error, but tantivy prints a warning in its log.