manticoresoftware / manticoresearch-buddy

Manticore Buddy is a Manticore Search sidecar that helps it with various tasks
GNU General Public License v3.0

No float_vector support in Auto schema #377

Open PavelShilin89 opened 1 month ago

PavelShilin89 commented 1 month ago

Bug Description:

Data with vectors is inserted into Manticore without errors. However, when the data is queried, the vectors are returned as 0,0,0,0 instead of the inserted values. The cause appears to be that auto schema creates the vector column as mva (see the describe output in the MRE) rather than as float_vector.

MRE

docker run -it -e EXTRA=1 --name manticore -d ghcr.io/manticoresoftware/manticoresearch:test-kit-latest bash
docker exec -it manticore bash
root@24022aa2ed30:/# searchd
root@24022aa2ed30:/# mysql -h0 -P9306 -e "INSERT INTO test(id, vector) VALUES (1, (0.1, 0.2, 0.3, 0.4)), (2, (0.5, -0.3, 0.9, -0.1));"
root@24022aa2ed30:/# mysql -h0 -P9306 -e "select * from test"
+------+---------+
| id   | vector  |
+------+---------+
|    1 | 0,0,0,0 |
|    2 | 0,0,0,0 |
+------+---------+
root@24022aa2ed30:/# mysql -h0 -P9306 -e "describe test"
+--------+--------+------------+
| Field  | Type   | Properties |
+--------+--------+------------+
| id     | bigint |            |
| vector | mva    |            |
+--------+--------+------------+

Manticore Search Version:

Latest dev version

Operating System Version:

Ubuntu Jammy

Have you tried the latest development version?

Yes

Internal Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

- [x] Implementation completed
- [x] Tests developed
- [x] Documentation updated
- [ ] Documentation reviewed
- [ ] Changelog updated

Nick-S-2018 commented 3 weeks ago

We need to check that the CLT tests pass now that we have added support for float vector values to auto schema. A branch with the fix is here: https://github.com/manticoresoftware/manticoresearch-buddy/tree/refs/heads/fix/autoschema_float_vector

Nick-S-2018 commented 1 week ago

In the end, we decided to forbid the existing float vector insert syntax in auto schema. The JSON insert syntax must be used instead, e.g.: INSERT INTO test(id, vector) VALUES (1, '[0.1, 0.2, 0.3, 0.4]'), (2, '[0.5, -0.3, 0.9, -0.1]');
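
For anyone generating such statements from client code, here is a minimal sketch of how the JSON-string form could be built. The helper name `build_vector_insert` is hypothetical and not part of Manticore or Buddy; the table and column names are taken from the MRE above, and the string interpolation is for illustration only, not safe for untrusted input.

```python
import json


def build_vector_insert(table, rows):
    """Build an INSERT statement that passes each vector as a JSON string literal.

    Hypothetical helper for illustration; `rows` is a list of (id, vector) pairs.
    """
    values = []
    for doc_id, vector in rows:
        # json.dumps renders the list as a JSON array, e.g. [0.1, 0.2, 0.3, 0.4]
        literal = json.dumps([float(x) for x in vector])
        # Wrap the JSON array in single quotes so it is sent as a string literal
        values.append(f"({int(doc_id)}, '{literal}')")
    return f"INSERT INTO {table}(id, vector) VALUES {', '.join(values)};"


stmt = build_vector_insert(
    "test",
    [(1, [0.1, 0.2, 0.3, 0.4]), (2, [0.5, -0.3, 0.9, -0.1])],
)
print(stmt)
```

The printed statement can then be passed to `mysql -h0 -P9306 -e "..."` in the same way as in the MRE.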

@PavelShilin89 please update the respective CLT tests to align with this change.

PavelShilin89 commented 1 week ago

@Nick-S-2018 I found an issue: the vector field is displayed as TEXT, although the expected type is JSON. This should be fixed so that the table structure reflects the vector data type correctly.

Nick-S-2018 commented 1 week ago

Added the fix in https://github.com/manticoresoftware/manticoresearch-buddy/pull/387/commits/43b347c05721f23c969150ea50a8e034450cb2f5

PavelShilin89 commented 5 days ago

@Nick-S-2018 Testing is done in this PR: https://github.com/manticoresoftware/manticoresearch/pull/2762, please check test-auto-schema.rec and merge to master

PavelShilin89 commented 5 days ago

@Nick-S-2018 I also haven't found any documentation updates on this topic.

Nick-S-2018 commented 4 days ago

Updated documentation in https://github.com/manticoresoftware/manticoresearch/pull/2762/commits/6630d0dbb418d996a66501aa057e9f100dbf5b74