[OEP 5] Query and ResultSet API

tglman commented 8 years ago

Summary: A simple query API that supports streaming results and complex results such as graph portions, as well as simple record fetching.

Goals:

Streaming-like: allows iterating over query results and interrupting the iteration.
Not bound to records like ODocument: a result does not have to be a record.
Uniform: supports any result set returned by queries, commands, and scripts.
Simplified for the core query language (SQL).

Non-Goals:

Compatibility with third-party query engines (TinkerPop stack).
Prefetching and network handling will be discussed in the future.

Motivation:

The current API is not streaming.
The ODocument-based result set does not cover projection-like results well and adds unnecessary overhead when iterating over results.

Description: New database API for direct queries without wrapping objects:

ResultSet query(String query, Map<String,Object> args) // for select query
ResultSet command(String command, Map<String,Object> args) // for data manipulation
ResultSet script(String lang, String script, Map<String,Object> args) // script execution

New database API for preparing statements:

Prepared prepareQuery(String query) // for select query
Prepared prepareCommand(String query) // for data manipulation
Prepared prepareScript(String lang,String script) // script execution

Prepared execution:

ResultSet execute(Map<String,Object> args)
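The intent of the prepare/execute split can be sketched as follows. This is only an illustration under stated assumptions: `PreparedSketch`, `SketchPrepared`, the `Person` class in the query text, and the in-memory rows are hypothetical, and the sketch ignores the query text and hard-codes the filter where a real implementation would parse the statement once.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Demonstrates "prepare once, execute many times with different arguments".
class PreparedSketch {
    static int prepareCount = 0;   // counts how often the (pretend) parsing step runs

    // Hypothetical prepare step. The query text is NOT parsed in this sketch:
    // the filter "age >= :minAge" is hard-coded to keep the example small.
    static SketchPrepared prepareQuery(String query, List<Map<String, Object>> data) {
        prepareCount++;
        return args -> data.stream()
            .filter(row -> (Integer) row.get("age") >= (Integer) args.get("minAge"))
            .map(row -> (String) row.get("name"))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> data = List.of(
            Map.<String, Object>of("name", "a", "age", 20),
            Map.<String, Object>of("name", "b", "age", 50));

        SketchPrepared prepared =
            prepareQuery("select name from Person where age >= :minAge", data);

        // The same prepared handle is reused with different argument maps.
        System.out.println(prepared.execute(Map.<String, Object>of("minAge", 18)));
        System.out.println(prepared.execute(Map.<String, Object>of("minAge", 40)));
        System.out.println("prepared " + prepareCount + " time(s)");
    }
}

// Hypothetical stand-in for the proposed Prepared type; it returns plain names
// instead of a ResultSet to keep the sketch self-contained.
interface SketchPrepared {
    List<String> execute(Map<String, Object> args);
}
```

The point of the split is that the preparation cost is paid once, while each `execute` call only binds a new argument map.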

ResultSet access:

boolean hasNext() // iterator like hasNext
OResult next() // iterator like next
boolean hasPrevious() // backward iteration
OResult previous() // backward iteration
void close() // important: stops the streaming API
int count() // number of records in the result set
boolean fixedKeys() // checks whether the result set has a fixed set of keys, as with explicit projections
Set<String> keys() // the set of keys, for a result set with fixed keys
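A minimal sketch of how a caller might consume such a streaming result set and interrupt it early. `SketchResultSet` is a hypothetical in-memory stand-in (rows are plain maps, not OResult), and making it `AutoCloseable` is an assumption of this sketch, not something the proposal states.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Reads rows until a condition is met, then stops and closes the result set.
class StreamingSketch {
    // Returns the names read before the first row whose "age" exceeds the limit.
    static List<String> namesUntilAgeAbove(SketchResultSet rs, int limit) {
        List<String> names = new ArrayList<>();
        try (SketchResultSet results = rs) {      // close() runs even on early exit
            while (results.hasNext()) {
                Map<String, Object> row = results.next();
                if ((Integer) row.get("age") > limit) {
                    break;                        // interrupt the iteration
                }
                names.add((String) row.get("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = List.of(
            Map.<String, Object>of("name", "a", "age", 20),
            Map.<String, Object>of("name", "b", "age", 50),
            Map.<String, Object>of("name", "c", "age", 30));
        SketchResultSet rs = new SketchResultSet(rows.iterator());
        System.out.println(namesUntilAgeAbove(rs, 40));
        System.out.println("closed: " + rs.isClosed());
    }
}

// Hypothetical stand-in for the proposed ResultSet, backed by an in-memory iterator.
class SketchResultSet implements AutoCloseable {
    private final Iterator<Map<String, Object>> source;
    private boolean closed;

    SketchResultSet(Iterator<Map<String, Object>> source) {
        this.source = source;
    }

    boolean hasNext() {
        return !closed && source.hasNext();
    }

    Map<String, Object> next() {
        if (closed) {
            throw new IllegalStateException("result set is closed");
        }
        return source.next();
    }

    @Override
    public void close() {
        closed = true;   // a real implementation would release the underlying stream here
    }

    boolean isClosed() {
        return closed;
    }
}
```

The third row is never read: the loop stops at the second row, and the try-with-resources block closes the result set on the way out.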

OResult value access will follow the ODocument/OVertex/OEdge API. An OResult is not necessarily a record, so it will have an API to check this and to access the underlying record when there is one:

boolean isProjection()
ODocument getRecord()

The OResult also has to allow traversal of projected relations, following the base API.
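A rough sketch of the "result that may or may not be a record" idea. `SketchResult` and `SketchRecord` are hypothetical stand-ins, and returning an `Optional` from `getRecord()` is an assumption of this sketch (the signature above returns ODocument directly).

```java
import java.util.Map;
import java.util.Optional;

// Handles a result without assuming it is backed by a stored record.
class ResultKindSketch {
    static String describe(SketchResult r) {
        return r.isProjection()
            ? "projection " + r.getProperty("name")
            : "record " + r.getRecord()
                  .map(rec -> rec.id)
                  .orElseThrow(IllegalStateException::new);
    }

    public static void main(String[] args) {
        SketchResult projected =
            SketchResult.projection(Map.<String, Object>of("name", "x"));
        SketchResult stored = SketchResult.ofRecord(
            new SketchRecord("#12:0", Map.<String, Object>of("name", "y")));
        System.out.println(describe(projected));
        System.out.println(describe(stored));
    }
}

// Hypothetical stand-in for a stored record; "id" plays the role of a record id.
class SketchRecord {
    final String id;
    final Map<String, Object> fields;

    SketchRecord(String id, Map<String, Object> fields) {
        this.id = id;
        this.fields = fields;
    }
}

// Hypothetical stand-in for the proposed OResult: read-only values plus an optional record.
class SketchResult {
    private final Map<String, Object> values;
    private final SketchRecord record;   // null when the result is a pure projection

    private SketchResult(Map<String, Object> values, SketchRecord record) {
        this.values = values;
        this.record = record;
    }

    static SketchResult projection(Map<String, Object> values) {
        return new SketchResult(values, null);
    }

    static SketchResult ofRecord(SketchRecord record) {
        return new SketchResult(record.fields, record);
    }

    boolean isProjection() {
        return record == null;
    }

    Optional<SketchRecord> getRecord() {
        return Optional.ofNullable(record);
    }

    Object getProperty(String name) {
        return values.get(name);
    }
}
```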

Early-stage examples: https://gist.github.com/tglman/c3a3ceb6a9891ceae3eb4701a69ba0e3

Alternatives:

None known yet

Risks and assumptions:

Breaking API: we need to include a compatibility layer with the previous API for early versions.

Impact matrix

[ ] Storage engine
[x] SQL
[ ] Protocols
[ ] Indexes
[ ] Console
[x] Java API
[ ] Geospatial
[ ] Lucene
[ ] Security
[ ] Hooks
[ ] EE

luigidellaquila commented 8 years ago

+1 to the API.

About the OResult hierarchy, I'm not so convinced. What's the advantage compared to having ODocument? The overhead on the server can be completely avoided (no instantiation of ODocument, just write on the stream) and on the client it should not be a problem...

luigidellaquila commented 8 years ago

To be more explicit, are you proposing the following?

OElement interface with get/setProperty()
ODocument implements OElement, OIdentifiable
OVertexImpl implements OElement, OIdentifiable, OVertex
OEdgeImpl implements OElement, OIdentifiable, OEdge
OResult implements OElement (no OIdentifiable)
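To make the question concrete, the hierarchy above could be sketched in Java as below. All types here are local stand-ins declared only for illustration, not the real OrientDB classes; the string identity and the choice to have `OResult.setProperty` throw are assumptions of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Shows that a save method typed on OIdentifiable rejects OResult at compile time.
class HierarchySketch {
    static String save(OIdentifiable target) {
        return "saved " + target.getIdentity();
    }

    public static void main(String[] args) {
        ODocument doc = new ODocument("#12:0");
        doc.setProperty("foo", "bar");
        System.out.println(save(doc));

        OElement projection = new OResult(Map.<String, Object>of("total", 3));
        System.out.println(projection.getProperty("total"));
        // save(projection);   // would not compile: OResult is not OIdentifiable
    }
}

// Local stand-in: anything with readable and writable properties.
interface OElement {
    Object getProperty(String name);
    void setProperty(String name, Object value);
}

// Local stand-in: anything with an identity (a plain string in this sketch).
interface OIdentifiable {
    String getIdentity();
}

class ODocument implements OElement, OIdentifiable {
    private final String identity;
    private final Map<String, Object> fields = new HashMap<>();

    ODocument(String identity) {
        this.identity = identity;
    }

    @Override
    public String getIdentity() {
        return identity;
    }

    @Override
    public Object getProperty(String name) {
        return fields.get(name);
    }

    @Override
    public void setProperty(String name, Object value) {
        fields.put(name, value);
    }
}

// OElement without OIdentifiable. Rejecting writes is an assumption of this sketch.
class OResult implements OElement {
    private final Map<String, Object> values;

    OResult(Map<String, Object> values) {
        this.values = values;
    }

    @Override
    public Object getProperty(String name) {
        return values.get(name);
    }

    @Override
    public void setProperty(String name, Object value) {
        throw new UnsupportedOperationException("results are read-only");
    }
}
```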

tglman commented 8 years ago

@luigidellaquila

Yes, I don't want the result to be an Identifiable, nor to carry the cost of all the document-tracking structures, which are not needed just to get a value.

Exactly: OResult is OElement + Set<OResult> getVertices(Direction direction, String label) for projected relations. We could maybe move some of the API to the top-level structure.

We will probably need something like getEdges as well.

luigidellaquila commented 8 years ago

So it's mainly for a performance concern, and only on the client side...?

tglman commented 8 years ago

It should not be an Identifiable because then we would need to provide an id every time, which is confusing in the projection case (see today's #-2:n). In embedded mode as well, a projection query should not return documents, for both consistency and performance reasons.

lvca commented 8 years ago

@tglman could you write an example of executing a query that returns 1 record and updates it?

luigidellaquila commented 8 years ago

I think the idea is this (@tglman correct me if I'm wrong)

db.query("select from V")
  .stream()
  .map(r -> r.asDocument())
  .forEach(doc -> {
    doc.setProperty("foo", "bar");
    db.save(doc);
  });

lvca commented 8 years ago

The fact that OrientDB v3 will require Java 8 doesn't mean that users are forced to use the lambda syntax. Without streams it should be:

OResultSet resultset = db.query("select from V");
for (OResult r : resultset) {
  ODocument doc = r.asDocument();
  doc.setProperty("foo", "bar");
  db.save(doc);
}

While I see the benefits of having a result set instead of a List, I don't see the need to have an OResult and then retrieve the document with .asDocument(). It complicates the current API. If the only reason is to prevent a projection from being saved back, just don't allow saving any document whose RID marks it as a projection.

luigidellaquila commented 8 years ago

Yep, that was my point as well. @tglman's point is that an OResult can be a projection (without an OIdentity), so we can prevent users from saving it at compile time...

tglman commented 8 years ago

Well, the point is that a result does not need to be a document. There are plenty of cases where it is not: projections, aggregation functions, nested projections, match statements with multiple returns, command operation returns, batch script returns. Forcing everything to be a document seems like overkill just for the case where you extract records.

There are also cases where the behaviour can be confusing, like: select *,"name" as name from V. The result has to have the field name and a valid identity, but you are not allowed to save it. The 'asDocument()' API in this case will give you the document without the field name, ready to be manipulated.

As a last detail, the implementation of an OResult is not going to be a document in most cases, so it avoids overhead like tracking, etc. In some cases it will be more similar to a list of arrays.

luigidellaquila commented 8 years ago

For the case of SELECT *, foo FROM..., imho the result should not have an identity, and all the fields of @this should be at the same level as foo, so it could be considered a normal projection.

lvca commented 8 years ago

I agree that in the @luigidellaquila example it shouldn't have an identity. About all the things you don't need from ODocument, I agree: we could keep super-light result sets in RAM and I can see the benefit of it. So in the case of select from V, ODocument instances are returned, right? Or do you create the ODocument on the fly at the asDocument() call?

tglman commented 8 years ago

@lvca not sure yet whether select from V creates the ODocument directly or later. I would say directly in the first implementation, but it could be optimized in the long run. Consider as well that we could provide Iterable<ODocument> db.query("select from v ").asDocuments() or something like this.
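Something like asDocuments() could be a thin lazy adapter over the result iterator, roughly as sketched here. `SketchRow` and `SketchDoc` are hypothetical stand-ins, and throwing on a projection row is an assumption of this sketch (skipping such rows would be another option).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Lazily adapts an iterator of results into an Iterable of their backing documents.
class AsDocumentsSketch {
    // Note: the returned Iterable is single-use, because it wraps one iterator.
    static Iterable<SketchDoc> asDocuments(Iterator<SketchRow> results) {
        return () -> new Iterator<SketchDoc>() {
            @Override
            public boolean hasNext() {
                return results.hasNext();
            }

            @Override
            public SketchDoc next() {
                SketchRow row = results.next();
                if (row.doc == null) {
                    throw new IllegalStateException("projection row has no document");
                }
                return row.doc;   // each document is handed out only when requested
            }
        };
    }

    public static void main(String[] args) {
        List<SketchRow> rows = Arrays.asList(
            new SketchRow(new SketchDoc("#9:0")),
            new SketchRow(new SketchDoc("#9:1")));
        for (SketchDoc doc : asDocuments(rows.iterator())) {
            System.out.println(doc.id);
        }
    }
}

// Hypothetical stand-in for a document; "id" plays the role of a record id.
class SketchDoc {
    final String id;

    SketchDoc(String id) {
        this.id = id;
    }
}

// Hypothetical stand-in for a result row; doc is null for a pure projection.
class SketchRow {
    final SketchDoc doc;

    SketchRow(SketchDoc doc) {
        this.doc = doc;
    }
}
```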

tglman commented 8 years ago

By the way, OResult should not have any public 'set' methods. Speaking about hierarchies, OResult would actually fit well on top of OElement, given the methods that it provides.

luigidellaquila commented 7 years ago

Little addition to the ResultSet API:

class ResultSet{
...
  Optional<OExecutionPlan> getExecutionPlan();

  Map<String, Object> getQueryStats();

}
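A caller would probably treat both additions as optional diagnostics. A small sketch, assuming the plan is rendered as a string and that the stats map carries an `elapsedMs` key (both are hypothetical choices for this example, not part of the proposal):

```java
import java.util.Map;
import java.util.Optional;

// Formats optional diagnostics without failing when they are absent.
class PlanSketch {
    static String summarize(Optional<String> plan, Map<String, Object> stats) {
        String planText = plan.orElse("(no plan)");   // a plan may not be available
        return planText + " | elapsedMs=" + stats.getOrDefault("elapsedMs", "n/a");
    }

    public static void main(String[] args) {
        System.out.println(summarize(Optional.empty(), Map.<String, Object>of()));
        System.out.println(summarize(
            Optional.of("FETCH FROM CLASS V"),
            Map.<String, Object>of("elapsedMs", 12)));
    }
}
```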

luigidellaquila commented 6 years ago

Implemented, closing

orientechnologies / orientdb-labs

[OEP 5] Query and ResultSet API #5