ArcadeData / arcadedb

ArcadeDB Multi-Model Database, one DBMS that supports SQL, Cypher, Gremlin, HTTP/JSON, MongoDB and Redis. ArcadeDB is a conceptual fork of OrientDB, the first Multi-Model DBMS. ArcadeDB supports Vector Embeddings.
https://arcadedb.com
Apache License 2.0

HTTP-API: "limit" request property ignored for non-SQL #1661

Closed gramian closed 1 month ago

gramian commented 1 month ago

ArcadeDB Version:

ArcadeDB Server v24.6.1-SNAPSHOT (build d21f2f50234cec7f64bf4ab1a1118b2e76c5f0a8/1721165296536/main)

OS and JDK Version:

Running on Mac OS X 12.7.5 - OpenJDK 64-Bit Server VM 17.0.12 (Homebrew)

Expected behavior

The limit property in a query or command request should cap the number of returned items at the given value.

Actual behavior

The limit property seems to work only for SQL. For Cypher and Gremlin it fails and returns all query results. I assume it also fails for MQL and GraphQL.

Steps to reproduce

Prepare a dataset in a database named test:

CREATE VERTEX TYPE vec;
INSERT INTO vec;
INSERT INTO vec;
INSERT INTO vec;
INSERT INTO vec;
INSERT INTO vec;

SQL limit works:

wget -qO- http://localhost:2480/api/v1/query/test --post-data='{"language":"sql","command":"SELECT FROM vec","limit":3}' --user=root --password=arcadedb

Cypher limit fails, returns all vertices:

wget -qO- http://localhost:2480/api/v1/query/test --post-data='{"language":"cypher","command":"MATCH (m:vec) RETURN m","limit":3}' --user=root --password=arcadedb

Gremlin limit fails, returns all vertices:

wget -qO- http://localhost:2480/api/v1/query/test --post-data='{"language":"gremlin","command":"g.V().hasLabel(\"vec\")","limit":3}' --user=root --password=arcadedb
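The three requests above differ only in the language and command fields, so they can also be sent from one small program. The following is a rough sketch using the JDK's built-in `java.net.http.HttpClient`, assuming the same server, database and credentials as the wget calls; the class name `LimitRepro` and its helper methods are made up for illustration, and the JSON is built by hand only to keep the example free of dependencies.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical reproduction helper, not part of ArcadeDB.
final class LimitRepro {
  // Builds the JSON request body; escapes backslashes and double quotes in the command.
  static String payload(String language, String command, int limit) {
    String escaped = command.replace("\\", "\\\\").replace("\"", "\\\"");
    return "{\"language\":\"" + language + "\",\"command\":\"" + escaped + "\",\"limit\":" + limit + "}";
  }

  // Assumption: same endpoint and root/arcadedb credentials as in the wget calls.
  static HttpRequest request(String body) {
    String auth = Base64.getEncoder()
        .encodeToString("root:arcadedb".getBytes(StandardCharsets.UTF_8));
    return HttpRequest.newBuilder(URI.create("http://localhost:2480/api/v1/query/test"))
        .header("Authorization", "Basic " + auth)
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();
  }

  public static void main(String[] args) throws InterruptedException {
    HttpClient client = HttpClient.newHttpClient();
    String[][] cases = {
        {"sql", "SELECT FROM vec"},
        {"cypher", "MATCH (m:vec) RETURN m"},
        {"gremlin", "g.V().hasLabel(\"vec\")"}
    };
    for (String[] c : cases) {
      try {
        HttpResponse<String> response = client.send(
            request(payload(c[0], c[1], 3)), HttpResponse.BodyHandlers.ofString());
        // Print the raw response so the number of returned records can be compared by eye.
        System.out.println(c[0] + ": " + response.body());
      } catch (IOException e) {
        System.out.println(c[0] + ": request failed: " + e.getMessage());
      }
    }
  }
}
```

With the behavior described in this report, the SQL response would contain three records, while the Cypher and Gremlin responses would contain all five.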

gramian commented 1 month ago

Here is the problem: https://github.com/ArcadeData/arcadedb/blob/main/server/src/main/java/com/arcadedb/server/http/handler/PostCommandHandler.java#L78

gramian commented 1 month ago

Cypher also has a LIMIT keyword: https://neo4j.com/docs/cypher-manual/current/clauses/limit/
Gremlin will be somewhat harder, since it uses the .limit(x) method: https://kelvinlawrence.net/book/Gremlin-Graph-Guide.html#limit
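To make the idea concrete, here is a deliberately naive sketch of pushing the limit into the query text per language. It is not how ArcadeDB handles the limit property (the maintainer's reply further down describes applying it when the result is serialized instead); the class `QueryLimitRewriteSketch` and the method `withLimit` are hypothetical, and a negative limit is assumed to mean "no limit".

```java
// Hypothetical illustration only: appends a limit to the query text without parsing it.
final class QueryLimitRewriteSketch {
  static String withLimit(String language, String command, int limit) {
    if (limit < 0) {
      return command; // assumption: negative means unlimited
    }
    switch (language) {
      case "cypher":
        // Cypher: LIMIT clause at the end of the query.
        return command + " LIMIT " + limit;
      case "gremlin":
        // Gremlin: limit() step appended to the traversal.
        return command + ".limit(" + limit + ")";
      default:
        // Other languages are left untouched in this sketch.
        return command;
    }
  }

  public static void main(String[] args) {
    System.out.println(withLimit("cypher", "MATCH (m:vec) RETURN m", 3));
    System.out.println(withLimit("gremlin", "g.V().hasLabel(\"vec\")", 3));
  }
}
```

Plain string appending like this breaks easily, for example when the query already ends in a LIMIT clause or a terminal step, which is part of why Gremlin looks harder to handle this way.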

lvca commented 1 month ago

The limit in the /query request is used to cut the result set, not to change the query itself. So it should work with any language, because it is applied when the result set is serialized. Checking the code, I see that the only place where the limit is not applied is the graph serializer, which I assume is the one you're using. Fixing it right now.
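As a language-independent illustration of cutting the result during serialization, the following minimal sketch copies at most `limit` entries from a result iterator. It is not the code in PostCommandHandler or in any ArcadeDB serializer; `LimitSketch` and `takeWithLimit` are hypothetical names, the strings stand in for records, and a negative limit is assumed to mean "no limit".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of applying a limit while consuming a result set.
final class LimitSketch {
  // Copies at most `limit` entries; assumption: a negative limit means unlimited.
  static <T> List<T> takeWithLimit(Iterator<T> results, int limit) {
    List<T> out = new ArrayList<>();
    // The size check happens before each record is consumed, so exactly
    // `limit` entries are kept (no off-by-one).
    while (results.hasNext() && (limit < 0 || out.size() < limit)) {
      out.add(results.next());
    }
    return out;
  }

  public static void main(String[] args) {
    // Placeholder strings standing in for the five vertices of type vec.
    List<String> vertices = List.of("#1:0", "#1:1", "#1:2", "#1:3", "#1:4");
    System.out.println(takeWithLimit(vertices.iterator(), 3));         // prints [#1:0, #1:1, #1:2]
    System.out.println(takeWithLimit(vertices.iterator(), -1).size()); // prints 5
  }
}
```

Because the cut happens on the already computed result, the same helper would apply whether the records came from SQL, Cypher or Gremlin.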

gramian commented 1 month ago

Your fix seems to work, but it is off by one, meaning one element is missing: a limit of 2 returns only 1 vertex. For limits 2 and 3, only 1 is returned. From limit 4 on, it seems to work as expected.

The limit still seems to work only for SQL and SQLscript. For all languages, the fix only has an effect for limits 1, 2 and 3, where only one vertex is returned.

Thanks for the quick fix so far

lvca commented 1 month ago

I'm not sure I understand whether it's fixed or not, but I've pushed another fix that also considers the limit for vertices and edges across all the serializers.

gramian commented 1 month ago

Sorry, no, it still does not fix it. The problem remains:

gramian commented 1 month ago

Fixed by https://github.com/ArcadeData/arcadedb/commit/df5e8d727c4820e7e8a2f52e681830d4137a7610 .