Closed metametametameta closed 2 years ago
Hi @metametametameta
For now there is no support for "dynamic" edge labels in the new Multi-model API (and I'd say it's too late to have it in the first GA).
We can consider adding it later, though.
What you can do, in the short term, is to manually add a "label" field to edge records and then filter based on that attribute when querying. I understand it's a bit trickier in terms of query syntax and API usage.
I'm flagging this as an enhancement for now
Thanks
Luigi
Hi Luigi,
Thanks for your response and we hope this enhancement will be implemented post first GA. We can pursue two alternate solutions in the meantime - see below:
1) Store label field on edge as you have suggested. In this case (using the syntax outlined at http://orientdb.com/orientdb-improved-sql-filtering/), I have something like
select expand(out()[label='myLabel']) from MyClazz
How would this query perform relative to the "standard" way below? A given vertex can have say 6 to 7 different outgoing edge labels - but often we just want to traverse one edge label.
select expand(out('myLabel')) from MyClazz
2) Another option would be to create a huge number of edge classes - would there be any problem doing this? Is there any limit to the number of edge classes used (other than disk space)? We would probably have 2 clusters per unique edge class - but we can likely go over 32K distinct edge classes.
Thanks, Harish.
Also, a few design suggestions for dynamic edge labels when it gets implemented.
1) In Tinkerpop 2.x, it looks like if we choose dynamic edge labels, then it's no longer possible to use class-based edge labels (i.e. it's one or the other). Ideally, the multi-model API should allow both cases to co-exist, i.e. if an edge class exists, use that otherwise use dynamic labels (i.e. the default "E" class).
2) There is a pattern that seems to appear over and over with dynamic edge labels. For faster traversal, we often create multiple variations of an edge label depending on the "path" we want to traverse. Assuming that the edge classes are not related via subclassing (as is the case with dynamic edge labeling), having a wildcard match on the edge label would be quite helpful.
So for example, if we have several dynamic edge labels that all have a common prefix 'Friend', then something like outE('Friend*') would be quite useful since there is no polymorphism possible on a common base edge class.
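To make the wildcard suggestion concrete, here is a minimal Python sketch of prefix matching over label names. It is only an illustration of the intended semantics: outE('Friend*') is the proposal above and is not an existing OrientDB feature, and the helper name and sample labels are hypothetical.

```python
from fnmatch import fnmatchcase

def match_edge_labels(pattern, labels):
    """Return the labels matching a shell-style pattern such as 'Friend*'.

    Hypothetical helper: it models the proposed wildcard semantics only
    and does not call any OrientDB API.
    """
    return sorted(label for label in labels if fnmatchcase(label, pattern))

# Sample labels, made up for illustration.
known_labels = ["FriendOf", "FriendClose", "Colleague", "Friend"]
matched = match_edge_labels("Friend*", known_labels)
print(matched)
```

An application could expand the pattern this way and then name the matched labels explicitly in its query.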
Hi @metametametameta
the actual query would be something like
select expand(outE()[label='myLabel'].inV()) from MyClazz
You can also use OR conditions to filter based on multiple labels.
If you already know that you will have tens of thousands of edge labels, I'd go with option 1
Thanks
Luigi
Ok, got it. My remaining question is whether
select expand(outE()[label='myLabel'].inV()) from MyClazz
will have different performance from
select expand(out('myLabel')) from MyClazz
when multiple labels are present on the vertex, i.e. does the first query above traverse all the other labels and discard all but one of them "later", when it applies the label filter?
Hi @metametametameta
when you use different classes for edges, the edge links are also stored in different collections on the vertex records. E.g. if you have two edge classes Foo and Bar, you will have two distinct out properties on the vertices, i.e. out_Foo and out_Bar.
In this situation, when you traverse out("Foo"), the engine will only inspect the out_Foo collection and will ignore out_Bar.
If you represent edge types as labels, you will have only a single out field on the vertex, so when you traverse out("Foo") the engine will have to fetch all the edges (both Foo and Bar), filter by label and then fetch the corresponding vertices.
As you can understand, you will pay a small price in terms of performance, but if you don't have too many edges per vertex it is typically not an issue.
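The difference can be sketched with a toy in-memory model in Python. This is not OrientDB's actual storage format or API: the Vertex class, the helper functions and the dictionary-based edges are all assumptions made for illustration, and the only point is how many edges each approach has to inspect.

```python
from collections import defaultdict

class Vertex:
    """Toy vertex record: field name -> list of outgoing edges (illustrative only)."""
    def __init__(self, name):
        self.name = name
        self.fields = defaultdict(list)

def add_class_edge(src, dst, edge_class):
    # One collection per edge class, e.g. out_Foo and out_Bar.
    src.fields["out_" + edge_class].append({"label": edge_class, "in": dst})

def add_labelled_edge(src, dst, label):
    # A single collection; the edge type is a "label" attribute on the edge.
    src.fields["out"].append({"label": label, "in": dst})

def out_by_class(v, edge_class):
    # Only the collection for the requested class is inspected.
    edges = v.fields.get("out_" + edge_class, [])
    return [e["in"].name for e in edges], len(edges)

def out_by_label(v, label):
    # Every outgoing edge is inspected, then filtered by label.
    edges = v.fields.get("out", [])
    return [e["in"].name for e in edges if e["label"] == label], len(edges)

def build(add_edge):
    # One Foo edge and two Bar edges leaving vertex "a".
    a = Vertex("a")
    add_edge(a, Vertex("b"), "Foo")
    add_edge(a, Vertex("c"), "Bar")
    add_edge(a, Vertex("d"), "Bar")
    return a

# Each call returns (matching vertex names, number of edges inspected).
print(out_by_class(build(add_class_edge), "Foo"))
print(out_by_label(build(add_labelled_edge), "Foo"))
```

Both calls return the same vertex, but the label-based version inspects all three edges while the class-based version inspects one. The gap grows with the number of edges per vertex that carry other labels.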
Thanks
Luigi
That might be an issue for us as we have many edges per vertex (with different labels) - so the partitioning by distinct properties is quite important for performance reasons. We may have to go with Option 2 to avoid the performance hit. Do you see any problems with using a large number of edge classes/clusters? I believe 3.0 now uses an Integer for cluster ids - so the older 32K limit no longer applies?
There is no particular issue with having a large number of clusters, but the limit of 32k still applies, unfortunately.
Thanks
Luigi
Hi Luigi,
Thanks for your quick response. We might be able to get away with the 32K limit right now. Any plans to extend the 32K limit in the future?
Harish.
Hi @metametametameta
We had it on the roadmap, but for now it's on hold; we don't have a specific schedule for it.
Thanks
Luigi
OrientDB Version: 3.0.0 RC1
Java Version: Java 8
OS: Windows 8.1
Expected behavior
We're currently migrating to Orient 3.x and are rewriting our code to use the multi-model API. Previously, we mainly used the Tinkerpop 2.x Orient integration. We'd like to exclusively use the Orient Native APIs now. We also use native Orient (graph) queries heavily.
We're running into a major issue with "edge classes" in the multi-model API.
Our application tends to have a huge number of possible edge labels - so it's not feasible to create an edge class for each such label. In Tinkerpop 2.x, we were able to use dynamic labels quite easily using the following configuration:
ALTER DATABASE custom useClassForEdgeLabel=false
ALTER DATABASE custom useVertexFieldsForEdgeLabels=true
This way, in Tinkerpop 2.x, we only have the single default "E" class - and are able to create additional edges with whatever labels (at runtime) that do not require edge classes. Moreover, the native Orient queries using outE('some label') or the Tinkerpop 2.x Vertex/Edge APIs or similar work quite well as only the relevant edges on the vertex are looked up. We certainly don't want to look up "all" edges on a vertex and perform a post-filtering step based on the label.
Actual behavior
Using an edge label in the Multi-model API that does not have a corresponding edge class is not supported (unlike in Tinkerpop 2.x).
Steps to reproduce
Create an Edge between vertices (using the multi-model API) using a custom label but without creating an edge class for the label first. Orient requires an edge class and attempts to create a class on the fly if missing. If we're in the middle of a transaction we also get an error message. As far as I can tell, there is no corresponding mechanism in the 3.x multi-model API re. dynamic edge labels to mimic what could be easily done in the Tinkerpop 2.x integration.