Closed zachkinstner closed 11 years ago
For each Factor, the following tasks need to occur (in roughly this order):
Upon each Artifact "verify" step, if the Artifact does not exist, then all the remaining tasks in the sequence should be skipped, and the entire sequence should return an error/value that indicates the issue.
The original implementation did this:
Notes:
I'm going to invent a theoretical RexConnect feature, which might help solve this problem: conditional command execution. It would work by checking the response of a specified command for a value of false, zero, or null.
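To make the idea concrete, here is a minimal in-memory sketch of how conditional execution could behave. It is not RexConnect code: `run_commands`, `is_failed_condition`, and the `SKIPPED` sentinel are hypothetical names, and treating a skipped command as a failed condition for anything that depends on it is an assumption.

```python
# Sentinel for commands that were skipped because their condition failed.
SKIPPED = object()


def is_failed_condition(response):
    """True when a response is false, zero, or null (None), or was itself skipped."""
    if response is SKIPPED or response is None:
        return True
    # bool is a subclass of int, so this check also covers False.
    return isinstance(response, (bool, int, float)) and response == 0


def run_commands(commands):
    """Run (cmd_id, cond_id, fn) tuples in order.

    cond_id is None for unconditional commands; otherwise it names an earlier
    command whose response decides whether this command runs.
    """
    responses = {}
    for cmd_id, cond_id, fn in commands:
        if cond_id is not None and is_failed_condition(responses.get(cond_id, SKIPPED)):
            responses[cmd_id] = SKIPPED
            continue
        responses[cmd_id] = fn()
    return responses


# Hypothetical run: the first lookup returns null, so its dependents are skipped.
commands = [
    ("F0.0", None, lambda: None),         # lookup finds nothing
    ("F0.1", "F0.0", lambda: "related"),  # skipped: F0.0 was null
    ("F0.2", "F0.1", lambda: 42),         # skipped: F0.1 was skipped
]
result = run_commands(commands)
```

With this behavior, the caller can tell from the first null or skipped response which step caused the sequence to stop.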
New implementation idea (skipping optional edges for now):
Cmd ID | Cond? | Name | Pseudo-Query |
---|---|---|---|
Mem0 | | Get Mem Once | m=g.V('MId',_P0).next(); |
F0.0 | | Get Prim Art | pa=g.V('AId',_P0); if(pa){ pa=pa.next(); }; pa; |
F0.1 | F0.0 | Get Rel Art | ra=g.V('AId',_P0); if(ra){ ra=ra.next(); }; ra; |
F0.2 | F0.1 | Add Factor | f=g.addVertex([...]); f.id; |
F0.3 | F0.1 | Add Mem Edge | ...VCI...; g.addEdge(m,f,'Creates',[...VCI...]); |
F0.4 | F0.1 | Add Prim Edge | ...VCI...; g.addEdge(f,pa,'UsesPrimary',[...VCI...]); |
F0.5 | F0.1 | Add Rel Edge | ...VCI...; g.addEdge(f,ra,'UsesRelated',[...VCI...]); |
Notes:
- The `Fx.y` pattern increases `x` with each Factor, and `y` with each command.
- [...] `F0.2` above).

This could be done without a new RexConnect feature:
pass=true;
pass=false;
if(pass){ return null; };
This solution is not very elegant, and it adds an extra command for each Factor. If an Artifact is missing, then this solution still executes several extra commands that will all return null (a minor point).
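The cost described here can be illustrated with a small simulation. This is only a sketch under assumptions: the exact guard conditions of the workaround are not fully preserved in this issue, so the version below assumes a shared `pass` flag that a failed Artifact lookup sets to false and that every later command checks. `make_verify`, `make_guarded`, and `run_all` are hypothetical helpers.

```python
def make_verify(artifact_id, known_artifacts):
    """Command that looks up an Artifact and clears the flag when it is missing."""
    def cmd(session):
        if not session["pass"]:
            return None
        if artifact_id not in known_artifacts:
            session["pass"] = False
            return None
        return artifact_id
    return cmd


def make_guarded(action):
    """Command that only does its work while the flag is still true."""
    def cmd(session):
        if not session["pass"]:
            return None
        return action()
    return cmd


def run_all(commands):
    # The extra per-Factor command: reset the flag before the sequence starts.
    session = {"pass": True}
    # Every command is still sent and executed, even after a failure.
    return [cmd(session) for cmd in commands]


known = {1}
responses = run_all([
    make_verify(1, known),         # primary Artifact exists
    make_verify(2, known),         # related Artifact is missing
    make_guarded(lambda: "factor"),
    make_guarded(lambda: "edge"),
])
```

In this run all four commands execute, and everything from the failed lookup onward returns null, which matches the minor inefficiency noted above.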
The query for adding a Factor can change lengths due to many optional parameters. This means re-compilation of many similar queries.
To avoid this, create a property map over several commands:
- Create `props` with all mandatory Factor properties
- Add each optional property that is present to `props` (one command per property)
- `g.addVertex(props);`
This approach minimizes the number of possible (unique, parameterized) query scripts, and makes those queries much shorter. Thus, those queries are faster to compile, and take less memory to cache. Note that only the optional properties are added individually. The mandatory properties are added in bunches (since the order/quantity doesn't change).
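A rough sketch of how the application side might build these commands is shown below. The property names (`M1`, `O1`, and so on), the `_P0`-style parameter placeholders in the generated script text, and the `build_commands` helper are illustrative assumptions, not the actual Factor schema or RexConnect API.

```python
def build_commands(mandatory, optional):
    """Return (script, params) pairs that build a property map, then the vertex.

    mandatory: dict of property name -> value, always present in a fixed order.
    optional:  dict of property name -> value or None (None means "not set").
    """
    commands = []
    names = list(mandatory)
    # One fixed script sets every mandatory property in a single command.
    pairs = ",".join(f"{name}:_P{i}" for i, name in enumerate(names))
    commands.append((f"props=[{pairs}];", [mandatory[name] for name in names]))
    # One short script per optional property that is actually present.
    for name, value in optional.items():
        if value is not None:
            commands.append((f"props.put('{name}',_P0);", [value]))
    commands.append(("g.addVertex(props);", []))
    return commands


cmds = build_commands({"M1": "a", "M2": "b"}, {"O1": 1, "O2": None, "O3": 3})
scripts = [script for script, _ in cmds]
```

With three optional properties, a single all-in-one query could take up to 8 distinct shapes (one per combination), while this approach only ever uses 5 distinct scripts: one for the mandatory properties, one per optional property, and one `g.addVertex(props);`.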
This is also a great test case for RexConnect vs. RexProClient performance.
These BatchCreateFactor improvements would be simpler to implement with RexProClient, since each query is executed individually. The logic can occur entirely in the application code (instead of needing conditional RexConnect commands, etc.), so if an Artifact isn't found, the application code can simply set the error response and move on.
The assumed downside to that approach is the quantity of round-trips to the database, and the de/serialization that occurs each time. The performance tests would determine its actual impact.
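The trade-off can be sketched with a stand-in client that counts round-trips. `FakeGraphClient` and `create_factor` are hypothetical and do not reflect the real RexProClient API; the sketch only shows where the early exit and the extra round-trips would occur.

```python
class FakeGraphClient:
    """Stand-in for a per-query client; each call counts as one round-trip."""

    def __init__(self, artifact_ids):
        self.artifact_ids = set(artifact_ids)
        self.round_trips = 0
        self.next_vertex_id = 1

    def get_artifact(self, artifact_id):
        self.round_trips += 1
        return artifact_id if artifact_id in self.artifact_ids else None

    def add_factor(self, primary, related):
        self.round_trips += 1
        vertex_id = self.next_vertex_id
        self.next_vertex_id += 1
        return vertex_id


def create_factor(client, primary_id, related_id):
    """Application-side logic: stop as soon as an Artifact is missing."""
    if client.get_artifact(primary_id) is None:
        return {"error": "primary Artifact not found"}
    if client.get_artifact(related_id) is None:
        return {"error": "related Artifact not found"}
    return {"factor_id": client.add_factor(primary_id, related_id)}


client = FakeGraphClient({1, 2})
created = create_factor(client, 1, 2)   # three round-trips
missing = create_factor(client, 9, 2)   # one round-trip, then early exit
```

A successful Factor costs one round-trip per query here, while a missing Artifact ends the sequence immediately with no wasted commands.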
The integration test has ~300 factors, with 20 factors in each request, and 3-degree parallelism. It executes in 2.5 to 3.0 seconds. I'm curious to see the performance impact on the production environment.
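For reference, the shape of that test can be approximated as follows. This is a sketch only: `send_batch` is a hypothetical stand-in for one BatchCreateFactor request, and the exact factor count is assumed to be 300.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def send_batch(batch):
    # Stand-in for one request; returns how many factors it carried.
    return len(batch)


def run_batches(factors, batch_size=20, parallelism=3):
    """Send batches with a bounded number of concurrent requests."""
    batches = chunk(factors, batch_size)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # map() preserves batch order in the returned results.
        return list(pool.map(send_batch, batches))


results = run_batches(list(range(300)))
```

With 300 factors and 20 per request this produces 15 requests, of which at most 3 are in flight at any time.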
The `BatchCreateFactor` function needs to be improved. Accomplishing this will not be trivial, and may require RexConnect and/or Weaver changes, so I'm creating this issue to keep things organized.