lks9 / src-tracer

Cflow database #19

Closed ducorduck closed 1 year ago

ducorduck commented 1 year ago

Replace JSON with SQLite, using two tables, as suggested in #9.

lks9 commented 1 year ago

Great, this makes the instrumentation process enormously faster! Apparently, loading and storing the JSON was what really slowed things down when compiling musl or busybox.

lks9 commented 1 year ago

So, when compiling a larger C project such as busybox or musl:

make

This runs fast. However, with

make -j8

after some time I get:

Traceback (most recent call last):
  File "/home/lukas/Software/angr-dev/src-tracer//instrumenter.py", line 30, in <module>
    instrumenter.parse(filename)
  File "/home/lukas/Software/angr-dev/src-tracer/src_tracer/instrumenter.py", line 336, in parse
    self.traverse(root)
  File "/home/lukas/Software/angr-dev/src-tracer/src_tracer/instrumenter.py", line 397, in traverse
    self.traverse(child, file_scope=file_scope)
  File "/home/lukas/Software/angr-dev/src-tracer/src_tracer/instrumenter.py", line 381, in traverse
    self.visit_function(node)
  File "/home/lukas/Software/angr-dev/src-tracer/src_tracer/instrumenter.py", line 111, in visit_function
    func_num = self.func_num(node)
  File "/home/lukas/Software/angr-dev/src-tracer/src_tracer/instrumenter.py", line 84, in func_num
    self.cursor.execute("INSERT INTO automatic_lookup VALUES(?,?,?,?)", data)
sqlite3.IntegrityError: UNIQUE constraint failed: automatic_lookup.num

(By the way, don't try the busybox example yet; if you want to, I will make some effort to adapt the README first.)

make -j8 starts 8 parallel processes for instrumentation and compilation. These parallel processes then try to assign the same num to different functions, which violates the UNIQUE constraint on automatic_lookup.num.
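The collision can be reproduced deterministically in one process with two connections. This is only a sketch: the traceback shows the table name, the four-value insert and the UNIQUE constraint on num, but the column names, the "largest num plus one" numbering scheme and the helper names next_num and simulate_race are assumptions made for illustration.

```python
import os
import sqlite3
import tempfile

# Hypothetical schema: only the table name, the four-value insert and the
# UNIQUE constraint on num are known from the traceback.
SCHEMA = """
CREATE TABLE IF NOT EXISTS automatic_lookup (
    file TEXT,
    line INT,
    name TEXT,
    num  INT UNIQUE
)
"""
INSERT = "INSERT INTO automatic_lookup VALUES(?,?,?,?)"


def next_num(conn):
    # Assumed numbering scheme: one more than the largest num stored so far.
    cur = conn.execute("SELECT COALESCE(MAX(num) + 1, 0) FROM automatic_lookup")
    (num,) = cur.fetchone()
    cur.close()
    return num


def simulate_race(db_path):
    # Two connections stand in for two instrumenter processes under make -j8.
    a = sqlite3.connect(db_path)
    b = sqlite3.connect(db_path)
    a.execute(SCHEMA)
    a.commit()

    # Both "processes" pick their number before either one has written.
    num_a = next_num(a)
    num_b = next_num(b)

    error = None
    try:
        a.execute(INSERT, ("a.c", 10, "foo", num_a))
        a.commit()
        b.execute(INSERT, ("b.c", 20, "bar", num_b))  # same num, other function
        b.commit()
    except sqlite3.IntegrityError as exc:
        error = str(exc)
    finally:
        a.close()
        b.close()
    return num_a, num_b, error


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(simulate_race(os.path.join(tmp, "cflow.db")))
```

With make the read and the insert of one process always finish before the next process starts, so the stale read never happens; with make -j8 nothing orders them.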

Solution: I looked it up. Apparently every SQLite table gets an automatic rowid column (unless it is declared WITHOUT ROWID), which can also serve as the key. So we can simply use the rowid as the function number.

CREATE TABLE IF NOT EXISTS function_list
(
    file    TEXT,
    line    INT,
    name    TEXT
)

After inserting new data, we can get the function number with:

row = self.cursor.execute('''
    SELECT rowid
    FROM function_list
    WHERE line=? AND file=? AND name=?
''', (line, file, name)).fetchone()
func_num = row[0]  # fetchone() returns a one-element row, not a bare integer
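Put together, the rowid idea could look roughly like the sketch below. The helper names open_db and func_num are hypothetical and not the instrumenter's actual methods. The UNIQUE (file, line, name) constraint and INSERT OR IGNORE are assumptions added here so that a function seen by several processes (for example from a shared header) keeps a single row.

```python
import sqlite3


def open_db(path):
    # timeout is how long a connection waits on a lock held by another writer.
    conn = sqlite3.connect(path, timeout=30.0)
    # UNIQUE (file, line, name) is an assumption of this sketch, not part of
    # the schema proposed above.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS function_list (
            file TEXT,
            line INT,
            name TEXT,
            UNIQUE (file, line, name)
        )
    """)
    return conn


def func_num(conn, file, line, name):
    # Hypothetical helper: insert the function if it is new, then read back
    # the rowid that SQLite assigned to it.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO function_list VALUES (?, ?, ?)",
            (file, line, name),
        )
    row = conn.execute(
        "SELECT rowid FROM function_list WHERE file=? AND line=? AND name=?",
        (file, line, name),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = open_db(":memory:")
    print(func_num(conn, "a.c", 10, "foo"))
    print(func_num(conn, "b.c", 20, "bar"))
    print(func_num(conn, "a.c", 10, "foo"))  # same function, same number
    conn.close()
```

Two caveats: rowid values start at 1 in an empty table, so the numbers are not zero-based, and VACUUM may renumber rowids in a table without an explicit INTEGER PRIMARY KEY. Declaring a column as INTEGER PRIMARY KEY, which then aliases the rowid, would keep the numbers stable.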

lks9 commented 1 year ago

I think some renaming is appropriate:

I think these steps could make the code and usage a bit cleaner.