Closed mrft closed 4 years ago
Is this a benchmark? Have you tried with 1 or 100 records?
How many records were into the index ? Are you able to provide data of benchmark ?
Did you make cache hot before benchmark ?
It is just the GratefulDeadConcerts database.
Doing the same query a few times from the console seems faster: Query executed in 0.005 sec(s).
But still 5 milliseconds for 1 single record is extremely slow in my opinion, don't you think?
Here is some tests using a node script that calls the restful API and uses Oriento:
1000 X query [select from V where name = 'HEY BO DIDDLEY'] through REST took 8354 milliseconds. That's 119.70313622216902 per second, and 8.354 ms per event. Start 1000 queries with BINARY protocol and query = select from V where name = 'HEY BO DIDDLEY' 1000 X query [select from V where name = 'HEY BO DIDDLEY'] through BINARY took 3222 milliseconds. That's 310.36623215394167 per second, and 3.222 ms per event.
Can you try to get 1 and 50 records? I bet the time is pretty close. If you get 1 record you've a fixed cost in time.
WONTFIX + INVALID ?
Fixed cost in time? Then can you please explain these results to me? Or tell me what I am doing wrong?
Some more results from SNAPSHOT-2.0 from last tuesday I think...
1000 X query [select from V where name <> 'HEY BO DIDDLEY' limit 50] through REST took 205901 milliseconds. That's 4.856702978615937 per second, and 205.901 ms per event. 1000 X query [select from V where name <> 'HEY BO DIDDLEY' limit 50] through BINARY took 216148 milliseconds. That's 4.62645964801895 per second, and 216.148 ms per event.
1000 X query [select from V where name in [ 'HEY BO DIDDLEY','IM A MAN','NOT FADE AWAY','BERTHA','GOING DOWN THE ROAD FEELING BAD','MONA','Bo_Diddley','Garcia','Spencer_Davis' ]] through REST took 51524 milliseconds. That's 19.408431022436147 per second, and 51.524 ms per event.
And can it get any more basic than this?
1000 X query [select from V limit 50] through REST took 191517 milliseconds. That's 5.221468590255696 per second, and 191.517 ms per event.
@mrft I closed the issue because you didn't respond at my first question about executing the query with 1 and 50 returning items.
@mrft In your last comment you don't write how you tested REST and Binary protocols. Can you execute just 2 queries through console or CURL that retrieve 1 and then 50 items?
Using console
connect remote:localhost/GratefulDeadConcerts admin admin
select from V limit 50
(I did it > 5 times to let the memory caching kick in):
50 item(s) found. Query executed in 0.124 sec(s).
Ok, so it's 0,00248 per record.
About the index, have you used a UNIQUE_HASH_INDEX?
NOTUNIQUE as you can see in my first post (since not all V in GratefulDeadConcerts have a name I think so you have multiple NULL values...)
2.48 ms per record, sounds ok if you put it like that, but it should be 2.48 ms for the 50 small records (or maybe 5ms or even 10ms but not 124ms). Try a similar query on MySQL on Postgres, and you'll know what I mean.
Do you knwo what is causing this? ToJSON couldn't take that long...
Guys it is not even 2 ms per record we can get about 5 000 records per second which is 0.2 ms per record. I will try to provide Java based benchmark at first, I suppose no one use console as interface to interact with application. And then we will discuss it's results.
Ok so for one test I got 100 rec/ms . 0.01 ms for record. I will check it on this database.
Some more tests, this time I wrote a very simple java application, and started fetching all the records
public static void main( String[ ] args) { for ( int i = 0; i < 50; i++ ) { testDocument_GratefulDeadConcerts(); } } public static void testDocument_GratefulDeadConcerts() { ODatabaseDocumentTx db = new ODatabaseDocumentTx("remote:localhost/GratefulDeadConcerts").open( "admin", "admin" ); String className = "V"; int clid = db.getClusterIdByName( "V" ); System.out.println( "clusterid = " + clid ); int nrOfRecords = 0; long t = System.nanoTime(); long totalFetchTime = 0; long totalToJSONTime = 0; try { for ( ODocument record : db.browseCluster( className ) ) { nrOfRecords++; long currentNanoTime = System.nanoTime(); totalFetchTime += currentNanoTime - t; t = System.nanoTime(); String json = record.toJSON(); currentNanoTime = System.nanoTime(); totalToJSONTime += currentNanoTime - t; t = System.nanoTime(); } double totalFetchTimeInMilliseconds = ((double)totalFetchTime / 1000000); double totalToJSONTimeInMilliseconds = ((double)totalToJSONTime / 1000000); System.out.println( "Fetching and toJSON() on " + nrOfRecords + " records took " + totalFetchTimeInMilliseconds + " milliseconds and " + totalToJSONTimeInMilliseconds + " milliseconds respectively." ); System.out.println( "That is " + (totalFetchTimeInMilliseconds / nrOfRecords) + " milliseconds and " + ( totalToJSONTimeInMilliseconds / nrOfRecords ) + " milliseconds per record" + "\n" + " and " + ( 1000.0 / (totalFetchTimeInMilliseconds / nrOfRecords) ) + " records per second" + " and " + (1000.0 / (totalToJSONTimeInMilliseconds / nrOfRecords)) + " records per second respectively." ); db.rollback(); } finally { System.out.println( "Number of records = " + nrOfRecords ); db.close(); } } public static long printDurationToSystemOut(long startSystemNanoTime, long currentSystemNanoTime, String description) { long durationInNanoSeconds = currentSystemNanoTime - startSystemNanoTime; double durationInMilliseconds = (double)durationInNanoSeconds / 1000000; System.out.println(description + " took " + durationInMilliseconds + " ms."); return durationInNanoSeconds; }
gives me the following disappointing final results on the first run
Fetching and toJSON() on 809 records took 544.10086 milliseconds and 1318.839576 milliseconds respectively. That is 0.6725597775030903 milliseconds and 1.630209611866502 milliseconds per record and 1486.8566831524581 records per second and 613.4180492624222 records per second respectively.
which is slow, an I don't understand because the server keeps running so I would presume the 'cache' should still be hot (or is it completely thrown away on each 'session'?) and could that be the reason why everything on the Restful API stays so slow, because it is like a whole new 'cold' session each time?
On the last run, I get closer to the 5 000 reords per second that laa is talking about:
Fetching and toJSON() on 809 records took 143.26815 milliseconds and 175.4015 milliseconds respectively. That is 0.17709289245982693 milliseconds and 0.21681273176761434 milliseconds per record and 5646.754006385928 records per second and 4612.275265604912 records per second respectively.
but mind you that I am not even doing a query, just selecting ALL records in the database. Also toJSON() seems very slow, if you see that it can only stringify < 5000 records per second, and considering stringifying a larger object in nodejs can do 175438 javascript objects per second. I can imagine it will be slower, but 35 times slower seems like way too much, so I guess some performance improvements could and should be made there.
And the following are just some simple JSON parsing and stringifying tests on nodejs, just to get an idea about what these figures mean...
Parsing 10000 JSON objects took 244 milliseconds. That's 40983.60655737705 per second, and 0.0244 ms per event. Stringifying 10000 JSON objects took 57 milliseconds. That's 175438.59649122806 per second, and 0.0057 ms per event.
@mrft curious about this because Oriento has some benchmarks as part of its test suite and I've never seen such low numbers, I wonder if this is because we're using RECORD_LOAD and thus avoid all the SQL overhead. Would you mind running the Oriento tests and pasting the results of this one in particular: https://github.com/codemix/oriento/blob/master/test/bugs/27-slow.js ?
@mrft you load all records by network on client side to do query and claim that this is slow ? Simple does not mean efficient.
All queries are done on server side not on client side. Also System.nanoTime has:
https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks
What I propose. I will write benchmark and you will review it. Then you will run benchmark. And then we will decide whether this issue should be fixed or not.
BTW Which H/W did you use to run this your test ?
@phpnode: I guess these are the results you are looking for:
Bug #27: Slow compared to Restful API Binary Protocol Took 556ms, 8993 documents per second ✓ should load a lot of records quickly, using the binary raw command interface (557ms) Rest Protocol Took 593ms, 8432 documents per second ✓ should load a lot of records quickly, using the rest raw command interface (593ms) Binary DB Api Took 292ms, 17123 documents per second ✓ should load a lot of records quickly (292ms) Binary Record Load Took 985ms, 5076 documents per second ✓ should load a lot of records, one at a time, using binary (986ms) Rest Record Load Took 7575ms, 660 documents per second ✓ should load a lot of records, one at a time, using rest (7575ms)
@laa:
Query 1 of 1, Rows read: 1,000, Elapsed time (seconds) - Total: 0.088, SQL query: 0.005, Reading results: 0.083 Query 1 of 1, Rows read: 1,000, Elapsed time (seconds) - Total: 0.067, SQL query: 0.009, Reading results: 0.058 Query 1 of 1, Rows read: 1,000, Elapsed time (seconds) - Total: 0.041, SQL query: 0.004, Reading results: 0.037 Query 1 of 1, Rows read: 1,000, Elapsed time (seconds) - Total: 0.032, SQL query: 0.005, Reading results: 0.027 Query 1 of 1, Rows read: 1,000, Elapsed time (seconds) - Total: 0.028, SQL query: 0.003, Reading results: 0.025
and if I compare this to 0.544 or later on 0.143 for fetching about 800 records from OrientDB, I can only conclude that it is way slower. About 5 times worse, that is very signifcant. All of this was done on the same laptop, with no phyical network between client and server.
Rows read: 1,000, Elapsed time (seconds) - Total: 0.03, SQL query: 0.004, Reading results: 0.026
Since I was trying to measure how long it took to execute something, maybe you should take a look at what I quoted below from https://docs.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
For example, to measure how long some code takes to execute: long startTime = System.nanoTime(); // ... the code being measured ... long estimatedTime = System.nanoTime() - startTime;
What more can I say? It feels like you don't want to believe these numbers, because you don't like what I am saying.
You can try to write a benchmark that will prove that my numbers are wrong, or you could use your time wisely and make the restful API, ànd the console, ànd the Java API all work faster, and check if you are on par with other modern day database systems.
@mrft I have doubts about benchmarks because you are not only one user who compares DBs performance. Could you wait a bit I finish important issue today and we try to discover what is going on.
@mrft thanks, so the Oriento results don't reflect what you're seeing in java. What kind of perf do you see if you do the following in java (pseudo-code)
// prep
var num = 10000;
for (i = 0; i < num; i++) {
db.someCluster.insert({foo: i});
}
// bench
var startTime = System.nanoTime();
for (i = 0; i < num; i++) {
doc = db.record.load('#10:' + i);
}
var stopTime = System.nanoTime();
console.log(stopTime - startTime);
@phpnode
This is my code, so please correct me if you meant something else (or if it can be done more efficiently)
ODatabaseDocumentTx db = new ODatabaseDocumentTx( "remote:localhost/GratefulDeadConcerts" ).open( "admin", "admin" ); try { // prep int num = 10000; long startTime = System.nanoTime(); ODocument doc = null; for ( int i = 0; i < num; i++ ) { doc = new ODocument( "Test" ); doc.field( "foo", i ); // SAVE THE DOCUMENT // StringBuilder sb = new StringBuilder(); doc.save(); // .getIdentity().toString( sb ); // System.out.println( sb.toString() ); } long stopTime = System.nanoTime(); Utils.printDurationToSystemOut( startTime, stopTime, num, "Creating and saving document " ); // bench String clusterId = Integer.toString( doc.getIdentity().getClusterId() ); startTime = System.nanoTime(); for ( int i = 0; i < num; i++ ) { Object loadedDoc = db.command( new OSQLSynchQuery( "select from " + "#" + clusterId + ":" + i ) ).execute(); } stopTime = System.nanoTime(); long nrOfNanoSecondsPassed = stopTime - startTime; double nrOfMilliSecondsPassed = (double)nrOfNanoSecondsPassed / 1000000; Utils.printDurationToSystemOut( startTime, stopTime, num, "Getting document " ); System.out.println( nrOfNanoSecondsPassed ); } finally { db.rollback(); db.close(); }
Will return:
Creating and saving document 10000 times took 5536.996626 ms. That's 1806.0332478880582 per second, and 0.5536996626 ms per event. Getting document 10000 times took 3319.131759 ms. That's 3012.8361047688077 per second, and 0.33191317589999997 ms per event. 3319131759
One other question: you say Oriento results don't reflect what I am seeing in Java, but the only test returning a reasonable number of records per second seems to be 'Binary DB Api Took 292ms, 17123 documents per second'. I don't really know what the difference between all the tests is, and what exactly you are doing (I could check your code, but I haven't yet :) ).
@mrft I recreated your bench using .NET client connected to remote localhost OrientDB.
using (var db = CreateTestDb(MethodInfo.GetCurrentMethod()))
{
db.RegisterAllTypes();
var num = 10000;
var sw = new Stopwatch();
sw.Start();
Actor actor = null;
for (int i = 0; i < num; i++)
{
actor = new Actor() { Age = i };
actor.Save();
}
sw.Stop();
Debug.WriteLine("Creating and saving document {0} times took {1} ms. That's {2} per second, and {3} ms per event.", num, sw.ElapsedMilliseconds, (num * 1000) / (sw.ElapsedMilliseconds), (double)sw.ElapsedMilliseconds / num);
sw.Restart();
for (int i = 0; i < num; i++)
{
var act = db.LoadRecord<Actor>(new ORID(actor.RID.ClusterId, i));
Assert.IsTrue(act != null);
}
sw.Stop();
Debug.WriteLine("Getting document {0} times took {1} ms. That's {2} per second, and {3} ms per event.", num, sw.ElapsedMilliseconds, (num * 1000) / (sw.ElapsedMilliseconds), (double)sw.ElapsedMilliseconds / num);
}
Results on my 6-core home pc are following
Creating and saving document 10000 times took 1762 ms. That's 5675 per second, and 0.1762 ms per event.
Getting document 10000 times took 1961 ms. That's 5099 per second, and 0.1961 ms per event.
My two cents to the discussion:
@mrft in the fastest example, it is simply loading 5000 records in one go via a single select query
Ok. Here my test.
@Test
public class IndexTest {
public volatile long c = 43;
public void testIndexQueryEmbedded() throws Exception {
ODatabaseDocumentTx databaseDocumentTx = new ODatabaseDocumentTx("plocal:/home/andrey/Development/releases/orientdb-community-2.0-SNAPSHOT/databases/GratefulDeadConcerts");
databaseDocumentTx.open("admin", "admin");
int counter = 0;
// make JIT do it's job
for (int i = 0; i < 1000000; i++) {
List<ODocument> result = databaseDocumentTx
.query(new OSQLSynchQuery<ODocument>("select from V where name = 'HEY BO DIDDLEY'"));
counter += result.size();
databaseDocumentTx.getLocalCache().clear();
}
long st = System.currentTimeMillis();
// make pause, consume CPU, we need to have volatile here to do not circuit cycle
for (int k = 0; k < 500; k++) {
long c1 = c;
for (int i = 0; i < 100000000; i++) {
c1 += c1 * 31 + i * 51;
}
c = c1;
}
System.out.println("Pause : " + (System.currentTimeMillis() - st + " ms."));
// benchmark
long start = System.nanoTime();
for (int i = 0; i < 100; i++) {
List<ODocument> result = databaseDocumentTx
.query(new OSQLSynchQuery<ODocument>("select from V where name = 'HEY BO DIDDLEY'"));
counter += result.size();
databaseDocumentTx.getLocalCache().clear();
}
long end = System.nanoTime();
System.out.println("Time : " + ((end - start) / (100)) + " ns. counter : " + counter);
}
}
for embedded storage I have:
Pause : 68921 ms. Time : 29142 ns. ( 0.03 ms for query)
For remote:
Time : 97171 ns (0.1 ms for query)
Do you agree with this test ?
Ok the same for record loading.
Remote: 103910 ns. Embeded: Time : 9721 ns
As for me we should improve remote protocol performance but are those number super slow ?
Hi,
did some more tests using your code. Seems better but I have some questions:
What exactly are these things doing? // make JIT do it's job // make pause, consume CPU, we need to have volatile here to do not circuit cycle
and why are they needed to get a decent speed?
It would be nice to have a good speed from the start (or after just a few queries) instead of having to do funny things like running the same query a gazillion times first, and then saturating the CPU for a minute. My results using oriento seem better now too.
So I am now down to
Running select from V where name = 'HEY BO DIDDLEY' 1000 times took 792.975994 ms. That's 1261.0722235810836 per second, and 0.792975994 ms per event.
Also tried some other queries, and one thing that struck me (and I compare with similar Postgres queries just to have a reference point) is the following: when the number of results goes up, the query time seems to go up linearly in orientdb, whereas it looks logarithmic for Postgres.
Running select * from V where type = 'song' limit 100 1000 times took 7156.845744 ms. That's 139.7263593166526 per second, and 7.156845744 ms per event. Running select * from V where type = 'song' limit 200 1000 times took 16034.44563 ms. That's 62.365735808728424 per second, and 16.03444563 ms per event.
Compare this to a query that is limted to 200 and then to 2000 results on postgres. As you can see the total time less than triples, as the resultset grows by a factor of 10.
Query 1 of 1, Rows read: 200, Elapsed time (seconds) - Total: 0.004, SQL query: 0.003, Reading results: 0.001 Query 1 of 1, Rows read: 2,000, Elapsed time (seconds) - Total: 0.011, SQL query: 0.006, Reading results: 0.005
I don't know what this means but it suggests that fetching the records is an expensive operation?
So the remaining questions are Can something be done about this 'it only works fast if you prepare it in exactly the right way' ? and Can you examine whether there is a performance problem with fetching large resultsets? (which is what seems to be the case over here).
@mrft Are you just executing the query or do you also browse the result set one by one and access to the fields?
@lvca
I am not accessing the fields, I pretty much exactly copied laa's code. Since the result is a List
@laa I have another question, if I run your 'preparation'-code only once, can I run only the benchmark (latest loop in your code) for any query afterwards, and will I get a realistic figure for the performance ? (or maybe leave out the first 3-5 runs because they tend to be a bit slower, but they wont influence the average that much if you do 1000 runs). [If I have to run every single query 100000 times before it starts performing well, that would be unacceptable of course]
I am asking because then we'd have an understanding, and I could start trying out a lot of different queries that I think are relevant, and we can check if the speed makes sense (for example: select distinct(type) from V - after adding the relevant index - also seems slow).
Because until now, what we have been doing feels like 'alter the test so the function passes', whereas I feel that one should 'write the test, then make the function pass'. I also feel that it's not up to the user of Orientdb to prep the server to be performant, but the server itself should make sure that it can perform at full speed shortly after it has started (a few seconds up to a few minutes but not longer than that).
Another strange thing: After a while I get this performance (4ms) in the console for select distinct( type ) from V (after creating an index on type)
----+------+-------- # |@CLASS|distinct ----+------+-------- 0 |null |song 1 |null |artist ----+------+-------- 2 item(s) found. Query executed in 0.004 sec(s).
But when running this same query using laa's benchmark code, I keep getting an average of about 30ms. Strange, why would laa's code be so much slower than the console for this query ( with and without databaseDocumentTx.getLocalCache().clear(); )?
Start the benchmark with 1000 queries (select distinct( type ) from V ). Time : 298285221 ns. counter : 2000 Running this query (returning 2 results) 1000 times took 29828.522138 ms. That's 33.52495961327067 per second, and 29.828522138 ms per event.
Same results (30ms) for oriento in nodejs...
(I get an average of about 2ms on Postgres, which is similar to the results in the console...)
No reason to omptimize remote storage interaction till new authentication model will be provided.
Purely informational, some new benchmark results using v2.0.
First without indexes, and then with indexes (notunique) on name, type and song_type columns
"select distinct( type ) from V" doesn't seem to use the index and needs a full table scan to find all possible values, which is slow of course.
explain select distinct( type ) from V
Profiled command '{optimizationElapsed:0,current:#9:808,evaluated:809,fetchingFromTargetElapsed:21,documentReads:809,user:#5:0,documentAnalyzedCompatibleClass:809,recordReads:809,elapsed:21.1337,resultType:collection,resultSize:2}' in 0.023000 sec(s): {"@type":"d","@version":0,"optimizationElapsed":0,"current":"#9:808","evaluated":809,"fetchingFromTargetElapsed":21,"documentReads":809,"user":"#5:0","documentAnalyzedCompatibleClass":809,"recordReads":809,"elapsed":21.1337,"resultType":"collection","resultSize":2,"@fieldTypes":"optimizationElapsed=l,current=x,evaluated=l,fetchingFromTargetElapsed=l,documentReads=l,user=x,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"}
Start the benchmark with 1000 queries (select from V where name = 'STANDER ON THE MOUNTAIN' ). Time : 113533730 ns. counter : 1000 Running this query (returning 1 results) 1000 times took 11353.373027 ms. That's 88.07955112739202 per second, and 11.353373027 ms per event.
Start the benchmark with 1000 queries (select from V where name = 'HEY BO DIDDLEY' ). Time : 113843996 ns. counter : 1000 Running this query (returning 1 results) 1000 times took 11384.399627 ms. That's 87.83950254419507 per second, and 11.384399627 ms per event.
Start the benchmark with 1000 queries (select * from V where song_type = 'original' ). Time : 126442780 ns. counter : 184000 Running this query (returning 184 results) 1000 times took 12644.278037 ms. That's 79.08715682095689 per second, and 12.644278037 ms per event.
Start the benchmark with 1000 queries (select * from V where song_type = 'original' limit 100 ). Time : 53345030 ns. counter : 100000 Running this query (returning 100 results) 1000 times took 5334.50304 ms. That's 187.4588865170091 per second, and 5.3345030399999995 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'song' limit 100 ). Time : 34360785 ns. counter : 100000 Running this query (returning 100 results) 1000 times took 3436.078525 ms. That's 291.0294374020454 per second, and 3.436078525 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'song' limit 500 ). Time : 122897201 ns. counter : 500000 Running this query (returning 500 results) 1000 times took 12289.720153 ms. That's 81.36881780468318 per second, and 12.289720153000001 ms per event.
Start the benchmark with 1000 queries (select name from V where type = 'song' limit 100 ). Time : 48835848 ns. counter : 100000 Running this query (returning 100 results) 1000 times took 4883.584849 ms. That's 204.76761045828204 per second, and 4.883584849 ms per event.
Start the benchmark with 1000 queries (select name from V where type = 'song' limit 500 ). Time : 176030408 ns. counter : 500000 Running this query (returning 500 results) 1000 times took 17603.040805 ms. That's 56.808366865567805 per second, and 17.603040805 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'song' ). Time : 141386616 ns. counter : 584000 Running this query (returning 584 results) 1000 times took 14138.661688 ms. That's 70.72805206512132 per second, and 14.138661688000001 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'artist' limit 100 ). Time : 86256039 ns. counter : 100000 Running this query (returning 100 results) 1000 times took 8625.60399 ms. That's 115.93391038579317 per second, and 8.62560399 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'artist' limit 200 ). Time : 130524879 ns. counter : 200000 Running this query (returning 200 results) 1000 times took 13052.487977 ms. That's 76.61374611201451 per second, and 13.052487977 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'artist' ). Time : 139536159 ns. counter : 224000 Running this query (returning 224 results) 1000 times took 13953.615931 ms. That's 71.66601151593643 per second, and 13.953615931 ms per event.
Start the benchmark with 1000 queries (select distinct( type ) from V ). Time : 138131889 ns. counter : 2000 Running this query (returning 2 results) 1000 times took 13813.18893 ms. That's 72.39457920018474 per second, and 13.81318893 ms per event.
Start the benchmark with 1000 queries (select from V where name = 'STANDER ON THE MOUNTAIN' ). Time : 6926333 ns. counter : 1000 Running this query (returning 1 results) 1000 times took 692.633384 ms. That's 1443.7652343941886 per second, and 0.692633384 ms per event.
Start the benchmark with 1000 queries (select from V where name = 'HEY BO DIDDLEY' ). Time : 9111726 ns. counter : 1000 Running this query (returning 1 results) 1000 times took 911.172603 ms. That's 1097.486905013978 per second, and 0.911172603 ms per event.
Start the benchmark with 1000 queries (select * from V where song_type = 'original' ). Time : 55548184 ns. counter : 184000 Running this query (returning 184 results) 1000 times took 5554.818462 ms. That's 180.02388500018634 per second, and 5.554818462 ms per event.
Start the benchmark with 1000 queries (select * from V where song_type = 'original' limit 100 ). Time : 38716364 ns. counter : 100000 Running this query (returning 100 results) 1000 times took 3871.636488 ms. That's 258.28871153050255 per second, and 3.871636488 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'song' limit 100 ). Time : 38406157 ns. counter : 100000 Running this query (returning 100 results) 1000 times took 3840.615723 ms. That's 260.37491697265546 per second, and 3.840615723 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'song' limit 500 ). Time : 105795004 ns. counter : 500000 Running this query (returning 500 results) 1000 times took 10579.500426 ms. That's 94.5224216393448 per second, and 10.579500426000001 ms per event.
Start the benchmark with 1000 queries (select name from V where type = 'song' limit 100 ). Time : 47336915 ns. counter : 100000 Running this query (returning 100 results) 1000 times took 4733.691577 ms. That's 211.25161699566303 per second, and 4.733691576999999 ms per event.
Start the benchmark with 1000 queries (select name from V where type = 'song' limit 500 ). Time : 153366976 ns. counter : 500000 Running this query (returning 500 results) 1000 times took 15336.697631 ms. That's 65.20308504868117 per second, and 15.336697631 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'song' ). Time : 113562996 ns. counter : 584000 Running this query (returning 584 results) 1000 times took 11356.299655 ms. That's 88.0568521771716 per second, and 11.356299655 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'artist' limit 100 ). Time : 28642008 ns. counter : 100000 Running this query (returning 100 results) 1000 times took 2864.200856 ms. That's 349.13752570989385 per second, and 2.864200856 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'artist' limit 200 ). Time : 43286484 ns. counter : 200000 Running this query (returning 200 results) 1000 times took 4328.648443 ms. That's 231.01899199440254 per second, and 4.328648443 ms per event.
Start the benchmark with 1000 queries (select * from V where type = 'artist' ). Time : 47580440 ns. counter : 224000 Running this query (returning 224 results) 1000 times took 4758.044031 ms. That's 210.17039638236167 per second, and 4.758044031000001 ms per event.
Start the benchmark with 1000 queries (select distinct( type ) from V ). Time : 150766144 ns. counter : 2000 Running this query (returning 2 results) 1000 times took 15076.61445 ms. That's 66.32788835427174 per second, and 15.07661445 ms per event.
Some more test results on OrientDB version 2.0.5.
For the last query select name, $sungbyrecord as x_sungby from V let $sungby = outE()[ @class = 'sung_by' ], $sungbyrecord = list( (select in.name from $sungby ) ) where name='HEY BO DIDDLEY', the difference between with and without indexes is incredible, which I find peculiar, since traversing edges is being advertised as have a very low cost, so you would expect this time to always (indexes or not) be very close to a simple select name from V where name='HEY BO DIDDLEY' but it takes 10 times as long without indexes, and about the same time with indexes.
query | # rows /query | # queries /sec | # rows /sec | ms /query | ms /row | # queries | totalTime (ms) |
---|---|---|---|---|---|---|---|
select * from V where song_type = 'original' | 184 | 71.441 | 13145.172 | 13.998 | 0.076 | 1000 | 13997.535 |
select * from V where song_type = 'original' limit 100 | 100 | 165.990 | 16598.958 | 6.024 | 0.060 | 1000 | 6024.475 |
select * from V where type = 'song' limit 100 | 100 | 257.518 | 25751.799 | 3.883 | 0.039 | 1000 | 3883.224 |
select * from V where type = 'song' limit 500 | 500 | 77.580 | 38790.167 | 12.890 | 0.026 | 1000 | 12889.865 |
select name from V where type = 'song' limit 100 | 100 | 215.258 | 21525.838 | 4.646 | 0.046 | 1000 | 4645.580 |
select name from V where type = 'song' limit 500 | 500 | 57.107 | 28553.438 | 17.511 | 0.035 | 1000 | 17511.026 |
select * from V where type = 'song' | 584 | 68.734 | 40140.737 | 14.549 | 0.025 | 1000 | 14548.811 |
select * from V where type = 'artist' limit 100 | 100 | 109.713 | 10971.268 | 9.115 | 0.091 | 1000 | 9114.717 |
select * from V where type = 'artist' limit 200 | 200 | 75.247 | 15049.494 | 13.289 | 0.066 | 1000 | 13289.484 |
select * from V where type = 'artist' | 224 | 70.668 | 15829.675 | 14.151 | 0.063 | 1000 | 14150.638 |
select distinct( type ) from V | 2 | 69.532 | 139.063 | 14.382 | 7.191 | 1000 | 14381.942 |
select name, $sungbyrecord as x_sungby from V let $sungby = outE()[ @class = 'sung_by' ], $sungbyrecord = list( (select in.name from $sungby ) ) where name='HEY BO DIDDLEY' | 1 | 7.289 | 7.289 | 137.198 | 137.198 | 1000 | 137198.168 |
query | # rows /query | # queries /sec | # rows /sec | ms /query | ms /row | # queries | totalTime (ms) |
---|---|---|---|---|---|---|---|
select from V where name = 'STANDER ON THE MOUNTAIN' | 1 | 998.787 | 998.787 | 1.001 | 1.001 | 1000 | 1001.215 |
select from V where name = 'HEY BO DIDDLEY' | 1 | 1533.218 | 1533.218 | 0.652 | 0.652 | 1000 | 652.223 |
select * from V where song_type = 'original' | 184 | 192.444 | 35409.669 | 5.196 | 0.028 | 1000 | 5196.321 |
select * from V where song_type = 'original' limit 100 | 100 | 267.173 | 26717.287 | 3.743 | 0.037 | 1000 | 3742.895 |
select * from V where type = 'song' limit 100 | 100 | 267.646 | 26764.556 | 3.736 | 0.037 | 1000 | 3736.285 |
select * from V where type = 'song' limit 500 | 500 | 99.432 | 49716.195 | 10.057 | 0.020 | 1000 | 10057.085 |
select name from V where type = 'song' limit 100 | 100 | 228.666 | 22866.639 | 4.373 | 0.044 | 1000 | 4373.183 |
select name from V where type = 'song' limit 500 | 500 | 66.468 | 33234.237 | 15.045 | 0.030 | 1000 | 15044.726 |
select * from V where type = 'song' | 584 | 85.948 | 50193.442 | 11.635 | 0.020 | 1000 | 11634.986 |
select * from V where type = 'artist' limit 100 | 100 | 353.443 | 35344.267 | 2.829 | 0.028 | 1000 | 2829.313 |
select * from V where type = 'artist' limit 200 | 200 | 226.790 | 45358.027 | 4.409 | 0.022 | 1000 | 4409.363 |
select * from V where type = 'artist' | 224 | 200.577 | 44929.252 | 4.986 | 0.022 | 1000 | 4985.616 |
select distinct( type ) from V | 2 | 64.084 | 128.168 | 15.605 | 7.802 | 1000 | 15604.579 |
select name, $sungbyrecord as x_sungby from V let $sungby = outE()[ @class = 'sung_by' ], $sungbyrecord = list( (select in.name from $sungby ) ) where name='HEY BO DIDDLEY' | 1 | 1006.478 | 1006.478 | 0.994 | 0.994 | 1000 | 993.564 |
try without the subselect. Ie..
select name, $sungbyrecord, $sungby.in.name from V let $sungby = outE()[ @class = 'sung_by' ] where name='HEY BO DIDDLEY'
Sent from my Verizon Wireless 4G LTE smartphone
That's a different query.
What you propose would be similar to
select name, $followedby.in.name, $followedby.in.performances from V let $followedby = outE()[ @class = 'followed_by' ] where name='HEY BO DIDDLEY'
which returns 2 lists, 1 for name, and one for performances
which is a completely different thing than
select name, $followedbyrecord as followedbylist from V let $followedby = outE()[ @class = 'followed_by' ], $followedbyrecord = list( (select in.name as name, in.performances as performances from $followedby) ) where name='HEY BO DIDDLEY'
which returns 1 list with objects containing 2 properties.
I am not looking for alternative queries, I am trying to let the developers know where improvements could be made.
Another performance related question from someone else: https://groups.google.com/forum/?pli=1#!topic/orient-database/LpG4kQSyqc8
This issue is not specific , it is more like there are list of queries which are slow, there is no concrete query it is like aggregation of slow queries so I remove milestone from this issue.
Also problem lies not in embedded version but n remote version so I assign @tglman to this issue.
Hi, if this issue still actual please let us know otherwise we will close it.
=> Query executed in 0.069 sec. Returned 1 record(s)
returns
which indicates the index is being used.
69 milliseconds, for fetching 1 single record, you could beat that performnce by a serial search. This is a severe contrast with OrientDB boasting about its speed on the website...
Oh, and I am running OrientDB locally (so no network latency), and ran this query through the studio (but that doesn't really matter). Version 1.7.9 or 2.0-M2 doesn't matter also