SideChannelMarvels / Tracer

Set of Dynamic Binary Instrumentation and visualization tools for execution traces.
GNU General Public License v3.0

Just wondering: Is the use of sqlite a performance issue? #10

Closed: gitttt closed this issue 8 years ago

gitttt commented 8 years ago

Given the tracing context, do you think writing to sqlite is slower than writing to a file? Is the db access the bottleneck or the instrumentation code? Did you experiment with postgresql?

doegox commented 8 years ago

We first experimented with MongoDB and this was a disaster :)

Testing a full trace on Karroumi binary:

It would be interesting to see if it's worth adding a postgresql backend, but I like keeping sqlite as alternative anyway, much more handy to deal just with a file.

gitttt commented 8 years ago

Did you try to optimize sqlite, e.g. like this http://www.sqlite.org/faq.html#q19 or with any other tricks?

doegox commented 8 years ago

No, thanks for pointing it out! But about the numbers I gave above, I'm not sure it would improve them: I'm using an SSD, so there's no rotating disk platter to wait for ;)

gitttt commented 8 years ago

https://www.sqlite.org/speed.html They say the doc is outdated, but nevertheless not syncing and using transactions may improve speed a lot. May. ;-)

Test 1: 1000 INSERTs

CREATE TABLE t1(a INTEGER, b INTEGER, c VARCHAR(100));
INSERT INTO t1 VALUES(1,13153,'thirteen thousand one hundred fifty three');
INSERT INTO t1 VALUES(2,75560,'seventy five thousand five hundred sixty');
... 995 lines omitted
INSERT INTO t1 VALUES(998,66289,'sixty six thousand two hundred eighty nine');
INSERT INTO t1 VALUES(999,24322,'twenty four thousand three hundred twenty two');
INSERT INTO t1 VALUES(1000,94142,'ninety four thousand one hundred forty two');

PostgreSQL: 4.373
MySQL: 0.114
SQLite 2.7.6: 13.061
SQLite 2.7.6 (nosync): 0.223

Because it does not have a central server to coordinate access, SQLite must close and reopen the database file, and thus invalidate its cache, for each transaction. In this test, each SQL statement is a separate transaction so the database file must be opened and closed and the cache must be flushed 1000 times. In spite of this, the asynchronous version of SQLite is still nearly as fast as MySQL. Notice how much slower the synchronous version is, however. SQLite calls fsync() after each synchronous transaction to make sure that all data is safely on the disk surface before continuing. For most of the 13 seconds in the synchronous test, SQLite was sitting idle waiting on disk I/O to complete.

Test 2: 25000 INSERTs in a transaction

BEGIN;
CREATE TABLE t2(a INTEGER, b INTEGER, c VARCHAR(100));
INSERT INTO t2 VALUES(1,59672,'fifty nine thousand six hundred seventy two');
... 24997 lines omitted
INSERT INTO t2 VALUES(24999,89569,'eighty nine thousand five hundred sixty nine');
INSERT INTO t2 VALUES(25000,94666,'ninety four thousand six hundred sixty six');
COMMIT;

PostgreSQL: 4.900
MySQL: 2.184
SQLite 2.7.6: 0.914
SQLite 2.7.6 (nosync): 0.757

When all the INSERTs are put in a transaction, SQLite no longer has to close and reopen the database or invalidate its cache between each statement. It also does not have to do any fsync()s until the very end. When unshackled in this way, SQLite is much faster than either PostgreSQL and MySQL.

doegox commented 8 years ago

When unshackled in this way, SQLite is much faster than either PostgreSQL and MySQL.

OK, so it's better to focus on improving SQLite transactions than moving to something else, good to know!

gitttt commented 8 years ago

That is what they say at least. But I am still not sure what the best solution is in my case: Using a simple file or (an optimized) SQLite database.

doegox commented 8 years ago

Hmm, I just checked: sqlitetrace already wraps all the inserts in one big BEGIN/COMMIT.

doegox commented 8 years ago

What do you mean by using simple file write versus optimized SQLite? There is already the TracerGrind binary file format, texttrace and sqlitetrace. Binary file format is clearly the fastest, but when it comes to using tracegraph, we need to convert it to a DB otherwise we would have to load all the info in memory at once to navigate through the data.

One improvement on the todo list is to avoid going through the human-readable text format when recording multiple traces with deadpool_dca, so if TracerPIN and a TracerGrind helper could directly output the DCA traces, that would be much better.

gitttt commented 8 years ago

I made my last post more clear (see above).

doegox commented 8 years ago

So, what is your case? :)

gitttt commented 8 years ago

I want traces similar to your traces. But I also need the register writes.

I care about performance during runtime, i.e. the slowdown should be as little as possible. After the tracing is done, the performance does not matter any more. In other words, I don't care if the final write of the (complete) trace from memory to disk takes long.

Again, similar to your use case, the recorded trace should be further processed (with python). Therefore the trace should end up in a database eventually.

I am currently in favor of the sqlite solution because

  1. It may(?) be fast enough during runtime - given some optimization tricks
  2. I could avoid creating structs and parsing them again.

What do you think?

doegox commented 8 years ago

And you want to use PIN instead of Valgrind? I would suggest implementing a binary file format, as it's done in the Valgrind plugin, to dump the trace into a file as fast as possible. Bonus points if it's roughly compatible with texttrace and sqlitetrace so we can reuse the same workflow.

doegox commented 8 years ago

Just look at the numbers above: recording the binary trace took 4.5s, then converting it to SQLite took 37s, while recording directly into SQLite during tracing took 38s.

gitttt commented 8 years ago

I started with PIN. I am at the point where I have a human-readable trace with the info I need.

You are right but I am not sure how much of the slowdown happened during runtime, i.e. while the tracing was still going on. My idea is to keep the entire sqlite db in memory until the very end and thus avoid the slowdown. Or do you think this won't help as writing to an in-memory sqlite db will still be significantly slower than writing to an in-memory file?

doegox commented 8 years ago

Honestly I've no idea, but beware that your limit becomes your RAM :) And again, looking at the Valgrind numbers, the binary trace is 3x smaller than SQLite, so maybe it's wiser to store a binary format in RAM and create the SQLite dump at the end.

gitttt commented 8 years ago

Good point. I will try to use your binary format. I just sent you an E-Mail.

gitttt commented 8 years ago

How much bigger is the PIN text trace compared to the binary valgrind trace?

doegox commented 8 years ago

See above:

TracerPIN producing a human-readable file: 23s, 1.3Gb
TracerGrind producing its own binary format: 4.5s, 320Mb

gitttt commented 8 years ago

Why did you use a human-readable file for PIN but not for valgrind?

doegox commented 8 years ago

Two different developers, two different timeframes :) Retrospectively the Valgrind approach is better IMHO /cc @haxelion

gitttt commented 8 years ago

I would use your format to ease integration of my work but the valgrind code is really complex, e.g.:

void sendInfoMsg(UInt fd, InfoMsg *info_msg)
{
    uint8_t type = MSG_INFO;
    uint64_t length = 9; // msg header
    VG_(memcpy)((void*)msg_buffer, &type, 1);
    VG_(strcpy)((HChar*)&(msg_buffer[9]), info_msg->key);
    length += VG_(strlen)(info_msg->key)+1;
    ...

This is what I currently have as readable output:

------------------------------
0x7f344f52aa8a: mov eax, eax
------------------------------
0x7f344f52aa8a: W                eax size: 0x4        0x3646d8c5a

------------------------------
0x7f344f52aa8c: or rdx, rax
------------------------------
0x7f344f52aa8c: W                rdx size: 0x8    0x18b3a646d8c5a
0x7f344f52aa8c: W             rflags size: 0x4              0x206

------------------------------
0x7f344f52aa8f: mov rax, qword ptr [rip+0x21e3da]
------------------------------
0x7f344f52aa8f: R     0x7f344f748e70 size: 0x8                0xe
0x7f344f52aa8f: W                rax size: 0x8                0xe

------------------------------
0x7f344f52aa96: mov qword ptr [rip+0x21e1fb], rdx
------------------------------
0x7f344f52aa96: W     0x7f344f748c98 size: 0x8    0x18b3a646d8c5a

I don't know how to write these structs from trace_protocol.h into a file. A really really short example code snippet how to fill a file with a struct (from the trace_protocol.h file or whatever .h file) could speed things up a lot. In other words: How to "raw-write" a filled struct into a file?

It doesn't really matter for the snippet but, here is a relevant code part of mine:

static VOID LogRegWriteToFile(VOID * ip, CHAR r, REG reg, BOOL isPrefetch, const CONTEXT *ctxt)
{
    // For the integer registers, it is safe to use ADDRINT. But make sure to pass a pointer to it.
    ADDRINT regVal;

    TraceLogFile << ip << ": " << r << " " << setw(2+2*sizeof(ADDRINT)) << REG_StringShort(reg)
                 << " size: " << hex << setw(2) << REG_Size(reg) << " ";
    if (!isPrefetch)
    {
        // Braces are needed here: without them only PIN_GetContextRegval() was
        // conditional, and regVal was printed uninitialized on prefetches.
        PIN_GetContextRegval(ctxt, (REG)reg, reinterpret_cast<UINT8*>(&regVal));
        TraceLogFile << setw(2+2*sizeof(ADDRINT)) << hex << regVal << endl;
    }
    else
        TraceLogFile << endl;   // still terminate the line for prefetches
}

doegox commented 8 years ago

Hmm it's @haxelion code so it's a bit hard for me to tell, but did you look at the code of texttrace? It might be easier for you to understand the format from there than in the Valgrind plugin.

gitttt commented 8 years ago

Maybe there is a simpler way than using functions like void sendInfoMsg(UInt fd, InfoMsg *info_msg). But maybe you had your reasons for doing it that way?

haxelion commented 8 years ago

First of all, Valgrind does not allow you to use the libc, so you can't do sqlite inside Valgrind (you can inside Pin, because a Pin process has 3 copies of the libc ...). More generally, when you're tracing a binary, I/O operations will always be the bottleneck because they are several orders of magnitude slower than everything else. That's why I tried to move as much as possible out of the tracing plugin.

The trace protocol is very simple; you should look into texttrace to understand it, but it's simply a succession of structures defined in trace_protocol.h wrapped inside a common Type-Length-Value structure.

The complexity of the valgrind plugin code is due to three main factors:

Now we use a database because the binary trace is too big to fit into memory and we can't do queries on a binary trace if it means scanning the whole file. Binary traces are useless on their own; they are just the fastest way to dump the info out of the process. Concerning the database choice:

As you can see, I did optimize writes to the database in sqlitetrace. Maybe there's more to do, and if you have a patch with improved performance I would gladly merge it.