Closed ducorduck closed 1 year ago
Great, this gives enormous performance improvements to the instrumentation process! Apparently, loading and storing the json was what really slowed things down when compiling musl or busybox.
When compiling a larger C project such as busybox or musl with
make
this works fast. However, with
make -j8
I get the following after some time:
Traceback (most recent call last):
  File "/home/lukas/Software/angr-dev/src-tracer//instrumenter.py", line 30, in <module>
    instrumenter.parse(filename)
  File "/home/lukas/Software/angr-dev/src-tracer/src_tracer/instrumenter.py", line 336, in parse
    self.traverse(root)
  File "/home/lukas/Software/angr-dev/src-tracer/src_tracer/instrumenter.py", line 397, in traverse
    self.traverse(child, file_scope=file_scope)
  File "/home/lukas/Software/angr-dev/src-tracer/src_tracer/instrumenter.py", line 381, in traverse
    self.visit_function(node)
  File "/home/lukas/Software/angr-dev/src-tracer/src_tracer/instrumenter.py", line 111, in visit_function
    func_num = self.func_num(node)
  File "/home/lukas/Software/angr-dev/src-tracer/src_tracer/instrumenter.py", line 84, in func_num
    self.cursor.execute("INSERT INTO automatic_lookup VALUES(?,?,?,?)", data)
sqlite3.IntegrityError: UNIQUE constraint failed: automatic_lookup.num
(btw. don't try the busybox example yet; if you want to, I would make some effort to adapt the README first)
make -j8 starts 8 parallel processes for instrumentation and compilation. These parallel processes then try to set the same num for different functions...
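A minimal sketch of how such a collision can arise. The table layout (four columns, with a UNIQUE num) is only guessed from the INSERT statement and the error message above, and the "read the highest num, then insert" strategy in the hypothetical next_num helper is an assumption about how the number is chosen, not the actual code in instrumenter.py. The two reads stand in for two parallel processes that both pick a number before either has written its row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Assumed schema: column names and order are hypothetical.
cur.execute(
    "CREATE TABLE automatic_lookup (file TEXT, line INT, name TEXT, num INT UNIQUE)"
)


def next_num(cursor):
    """Hypothetical helper: choose the next number by reading the current maximum."""
    (current_max,) = cursor.execute(
        "SELECT COALESCE(MAX(num), -1) FROM automatic_lookup"
    ).fetchone()
    return current_max + 1


# Both "processes" read before either one inserts, so they pick the same number.
num_a = next_num(cur)
num_b = next_num(cur)

cur.execute("INSERT INTO automatic_lookup VALUES(?,?,?,?)", ("a.c", 1, "foo", num_a))
try:
    cur.execute("INSERT INTO automatic_lookup VALUES(?,?,?,?)", ("b.c", 7, "bar", num_b))
    collided = False
    error_message = ""
except sqlite3.IntegrityError as exc:
    # The second insert violates the UNIQUE constraint on num.
    collided = True
    error_message = str(exc)

print(collided, error_message)
```

With plain make there is only one process at a time, so the read and the insert never interleave like this.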
Solution: I looked it up. Apparently every sqlite table gets an automatic column rowid (unless it is declared WITHOUT ROWID), which can also serve as the key. So we can simply use the rowid as the function number.
CREATE TABLE IF NOT EXISTS function_list
(
    file TEXT,
    line INT,
    name TEXT
)
After inserting new data, we can get the function number with:
self.cursor.execute('''
    SELECT rowid
    FROM function_list
    WHERE line=? AND file=? AND name=?
''', (line, file, name)).fetchone()
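To illustrate why the rowid avoids the collision, here is a rough sketch with two separate connections to the same database file, standing in for two instrumenter processes. The register helper, the file name in a temporary directory, and the timeout value are assumptions made for this example. sqlite assigns the rowid during the INSERT itself, so no process has to choose a number beforehand.

```python
import os
import sqlite3
import tempfile

SCHEMA = """
CREATE TABLE IF NOT EXISTS function_list
(
    file TEXT,
    line INT,
    name TEXT
)
"""


def register(conn, file, line, name):
    """Hypothetical helper: insert one function and return the rowid sqlite gave it."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "INSERT INTO function_list VALUES(?,?,?)", (file, line, name)
        )
    return cur.lastrowid


db_path = os.path.join(tempfile.mkdtemp(), "function_database.db")

# Two connections to the same file; the timeout lets a writer wait for the lock.
conn_a = sqlite3.connect(db_path, timeout=30)
conn_b = sqlite3.connect(db_path, timeout=30)
conn_a.execute(SCHEMA)
conn_b.execute(SCHEMA)

num_a = register(conn_a, "a.c", 10, "foo")
num_b = register(conn_b, "b.c", 20, "bar")

# The SELECT from above, run on the other connection, finds the same number.
looked_up = conn_b.execute(
    "SELECT rowid FROM function_list WHERE line=? AND file=? AND name=?",
    (10, "a.c", "foo"),
).fetchone()[0]

conn_a.close()
conn_b.close()
print(num_a, num_b, looked_up)
```

One caveat: without a UNIQUE constraint on (file, line, name), the same function could still be inserted twice by two processes, so the lookup would then match more than one row.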
I think some renaming is appropriate:
- automatic_lookup into function_list (as in the comment above).
- manual_lookup can stay the same.
- cflow_functions.db into function_database.db. My name cflow_functions.json was already ugly.
- function_database. function_database.db instead of the directory.
- src_tracer/util.py into src_tracer/functions.py. Also, the sqlite connection, the creation of the tables, the INSERT and SELECT etc. should happen via method calls of the Functions class (formerly Util).
I think these steps could make the code and usage a bit cleaner.
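One possible shape for such a Functions class, as a sketch only. The method names (lookup, func_num, close) and the constructor argument are hypothetical, and the manual_lookup table is left out because its columns are not given in this thread.

```python
import sqlite3


class Functions:
    """Hypothetical wrapper that owns the sqlite connection and all queries."""

    def __init__(self, database=":memory:"):
        # In real use this would be the path to function_database.db.
        self.connection = sqlite3.connect(database, timeout=30)
        with self.connection:
            self.connection.execute(
                "CREATE TABLE IF NOT EXISTS function_list"
                " (file TEXT, line INT, name TEXT)"
            )

    def lookup(self, file, line, name):
        """Return the rowid of a known function, or None if it is not stored yet."""
        row = self.connection.execute(
            "SELECT rowid FROM function_list WHERE line=? AND file=? AND name=?",
            (line, file, name),
        ).fetchone()
        return None if row is None else row[0]

    def func_num(self, file, line, name):
        """Return the function number, inserting the function on first sight."""
        num = self.lookup(file, line, name)
        if num is None:
            with self.connection:  # commit the insert right away
                cur = self.connection.execute(
                    "INSERT INTO function_list VALUES(?,?,?)", (file, line, name)
                )
            num = cur.lastrowid
        return num

    def close(self):
        self.connection.close()


functions = Functions()
first = functions.func_num("main.c", 12, "main")
again = functions.func_num("main.c", 12, "main")
other = functions.func_num("util.c", 3, "helper")
functions.close()
print(first, again, other)
```

The instrumenter would then only call func_num and never touch the cursor or the SQL directly.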
replace json with sqlite with 2 tables as suggested in #9