Open peterstory opened 5 years ago
Although I haven't rigorously benchmarked the effect of this change, there appears to be a significant performance improvement from having a more specific regex (probably because there are fewer lines to sort). My unit test, which includes the subprocesses described above, went from about 3.4 seconds to 2.5 seconds.
I am already replacing the regex with Python code. See my PR: https://github.com/Exodus-Privacy/exodus-core/pull/35
I'm not sure if it will make a practical difference to which trackers are detected, but I noticed that the regex here seems unnecessarily broad, and will match on things other than class names: https://github.com/Exodus-Privacy/exodus-core/blob/685bab04fd44d3c0634e8a2448aad2e464a9b5aa/exodus_core/analysis/static_analysis.py#L147-L148
Here is a short example of what I mean. Here are the first 20 lines from running `dexdump` on WhatsApp:
And here is the result of running the regex which the code is currently using. Note the matching on the superclass and on instance field types.
I've written a slightly different regex which is more specific, so it will only match on the `Class descriptor`:
I would be curious to see whether this change has any downstream effects (i.e., on what trackers are detected). If this change looks good, I can include it in a PR, potentially addressing #7 at the same time.
EDIT: Fixed my regex. It didn't account for some class descriptors not including a $ or starting with a variable number of whitespace characters.
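To make the difference concrete, here is a minimal Python sketch of the idea. Everything in it is an assumption for illustration: the sample text only imitates the general shape of `dexdump` output with made-up class names (it is not the WhatsApp dump), `BROAD` is a hypothetical stand-in for an overly permissive pattern rather than the one in `static_analysis.py`, and `SPECIFIC` is not necessarily the regex proposed in this issue. It only shows how anchoring on the `Class descriptor` label, while tolerating variable leading whitespace and an optional `$`, drops the superclass and field-type matches.

```python
import re

# Illustrative dexdump-style text (made-up class names, not real app data).
SAMPLE = """\
Class #0            -
  Class descriptor  : 'Lcom/example/tracker/Analytics$Event;'
  Access flags      : 0x0011 (PUBLIC FINAL)
  Superclass        : 'Ljava/lang/Object;'
  Interfaces        -
  Static fields     -
  Instance fields   -
    #0              : (in Lcom/example/tracker/Analytics$Event;)
      name          : 'payload'
      type          : 'Ljava/lang/String;'
      access        : 0x0012 (PRIVATE FINAL)
Class #1            -
  Class descriptor  : 'Lcom/example/Main;'
  Superclass        : 'Landroid/app/Activity;'
"""

# Hypothetical broad pattern: any "L...;" type descriptor, wherever it appears.
BROAD = re.compile(r"L([\w/$]+);")

# Hypothetical specific pattern: only lines labelled "Class descriptor".
# Leading whitespace may vary, and the name may or may not contain "$".
SPECIFIC = re.compile(r"^\s*Class descriptor\s*:\s*'L([^;']+);'", re.MULTILINE)


def class_names(text, pattern):
    """Return the sorted, de-duplicated names captured by `pattern`."""
    return sorted(set(pattern.findall(text)))


if __name__ == "__main__":
    print("broad:   ", class_names(SAMPLE, BROAD))
    print("specific:", class_names(SAMPLE, SPECIFIC))
```

On this sample the broad pattern also returns `java/lang/Object`, `java/lang/String` and `android/app/Activity`, which come from superclass and field-type lines, while the specific pattern returns only the two declared classes. Whether that changes which trackers are detected would still need checking against real APKs.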