objective-see / BlockBlock

BlockBlock provides continual protection by monitoring persistence locations.
GNU General Public License v3.0

Optimize event processing performance #69

Closed art-divin closed 7 months ago

art-divin commented 7 months ago

Resolves #68; Resolves #65

Description

This PR makes the for-loop iterations used in event processing run concurrently. It also resolves two other issues with the project: a bug reported by the Xcode static code analyzer, and a linker problem in later versions of Xcode/clang that required updating Sentry.

Context

While trying to resolve the issues that occur when Xcode is running the Swift compiler, I did a small investigation and uncovered the following:

  1. When the Daemon is running, a Monitor instance is created in its main function.
  2. -[Monitor start] is then called from the same main function of the Daemon.
  3. -[Monitor start] contains the following code:
    FileCallbackBlock block = ^(File* file)
    {
    ...
            //process file event
            [self processEvent:file];
        }
    };
    ...
    //start monitoring
    started = [self.fileMon start:events count:sizeof(events)/sizeof(events[0]) csOption:csNone callback:block];
  4. The callback is executed for every file registered by the Monitor (I do not know exactly whether all files in the system are registered, or only some). This means that during compilation in Xcode, every file that is created, written, etc. produces one of these events. Xcode compiles in parallel on multiple cores and therefore produces hundreds, if not thousands, of events.
  5. Events arrive at -[Monitor processEvent:] via the callback block.
  6. While the event is processed, a matching "plugin" is looked up among all registered "plugins":
    // ...that cares about the path/file that was just created
    plugin = [self findPlugin:file];
  7. The "plugin" found is an instance of PluginBase, the same one I mentioned at the beginning.
  8. The message -[PluginBase isMatch:] is then sent, which uses a regex to match against the given "event", where the "event" is a file system event.

As can be seen in the sample I provided in the previous message, the ICU library, which does the regex matching in step 8, is responsible for the enormous CPU consumption.
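
To make the cost in steps 5 to 8 concrete, here is a minimal sketch of a serial plugin lookup. It is not BlockBlock's actual code: SketchPlugin and SketchFindPluginSerial are hypothetical stand-ins for PluginBase and -[Monitor findPlugin:], and the assumption that each plugin holds a single NSRegularExpression is mine. The point is only that every file event pays for up to one regex evaluation per registered plugin.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for PluginBase: one regex per watched location.
@interface SketchPlugin : NSObject
@property (nonatomic, strong) NSRegularExpression *regex;
- (instancetype)initWithPattern:(NSString *)pattern;
- (BOOL)isMatch:(NSString *)path;
@end

@implementation SketchPlugin

- (instancetype)initWithPattern:(NSString *)pattern
{
    if ((self = [super init])) {
        // Compile once; the compiled regex is reused for every event.
        _regex = [NSRegularExpression regularExpressionWithPattern:pattern
                                                           options:0
                                                             error:nil];
    }
    return self;
}

- (BOOL)isMatch:(NSString *)path
{
    NSRange whole = NSMakeRange(0, path.length);
    return [self.regex firstMatchInString:path options:0 range:whole] != nil;
}

@end

// Serial lookup: walks the plugins in order and runs a regex for each one
// until a plugin matches. Non-matching events evaluate every regex.
static SketchPlugin *SketchFindPluginSerial(NSArray<SketchPlugin *> *plugins,
                                            NSString *path)
{
    for (SketchPlugin *plugin in plugins) {
        if ([plugin isMatch:path]) {
            return plugin;
        }
    }
    return nil;
}
```

During a build, most events are for paths no plugin cares about, so they take the worst-case branch where every regex is evaluated before nil is returned.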

Conclusion

Due to the nature of file processing, it is not easy to optimize the process when such a large bulk of file reads/writes occurs in the system.

That being said, there are a few techniques which might improve the situation not only for Xcode, but for all other applications/file operations as well:

  1. use parallel enumeration when evaluating "plugin" against the given event
  2. use parallel enumeration when searching for a "plugin" in all registered plugins
  3. exit early in case the event should be "ignored"
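
The techniques listed above could look roughly like the sketch below. It is an illustration under stated assumptions, not necessarily what this PR implements: SketchShouldIgnore, SketchFindMatchConcurrent and the ignored-prefix list are hypothetical names, and plugins are reduced to bare NSRegularExpression objects. It relies on NSArray's NSEnumerationConcurrent option and on NSRegularExpression being documented as safe to use from multiple threads.

```objc
#import <Foundation/Foundation.h>

// Hypothetical cheap pre-filter: a plain prefix check that lets uninteresting
// events return before any regex is evaluated (technique 3).
static BOOL SketchShouldIgnore(NSString *path, NSArray<NSString *> *ignoredPrefixes)
{
    for (NSString *prefix in ignoredPrefixes) {
        if ([path hasPrefix:prefix]) {
            return YES;
        }
    }
    return NO;
}

// Returns the index of a pattern that matches the path, or NSNotFound.
// With NSEnumerationConcurrent the tests may run on several threads at once,
// so if more than one pattern matches, which index is returned is not
// guaranteed to be the lowest.
static NSUInteger SketchFindMatchConcurrent(NSArray<NSRegularExpression *> *patterns,
                                            NSString *path,
                                            NSArray<NSString *> *ignoredPrefixes)
{
    // Exit early: ignored events never reach the regex engine.
    if (SketchShouldIgnore(path, ignoredPrefixes)) {
        return NSNotFound;
    }

    NSRange whole = NSMakeRange(0, path.length);

    // Parallel enumeration over the registered patterns (techniques 1 and 2).
    return [patterns indexOfObjectWithOptions:NSEnumerationConcurrent
                                  passingTest:^BOOL(NSRegularExpression *regex,
                                                    NSUInteger idx,
                                                    BOOL *stop) {
        return [regex firstMatchInString:path options:0 range:whole] != nil;
    }];
}
```

Concurrent enumeration spreads the regex work across cores but does not reduce it, so the early exit is likely the larger win for build-heavy workloads. The block must not touch shared mutable state without synchronization.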

art-divin commented 7 months ago

Should also resolve #10

art-divin commented 7 months ago

OK, seems I have resolved the conflicts.