inseven / opolua

A compiled-OPL interpreter for iOS written in Lua
https://opolua.org
MIT License
12 stars 0 forks source link

Robustness issues with `RecursiveDirectoryMonitor` #178

Closed jbmorley closed 2 years ago

jbmorley commented 2 years ago
  1. There are races lead to changes being missed. Specifically, if directories are created while during directory indexing but before we've started up up a DirectoryMonitor for the directory in which the new directory is created 😮‍💨). This can lead to programs failing to detect during install as there are a number of nested directories created in quick succession. The specific install-time issue is being tracked as #136, and it's likely that we'll fix that by using a dedicated notification at install time to guarantee that new files are never missed.
  2. The directory monitor is incredibly chatty when files are changing as it schedules a new full directory scan for each and every change with no rate limiting. Since the scans can take a long time to complete, this means that the directory monitor can still be scanning many minutes after an update has occurred. I was originally tracking this as #153.
  3. The current approach to directory monitoring requires us to open a new file handle for each-and-every directory in the tree structure; this can rapidly balloon and it's very easy to hit the OS limits if someone adds a large external location via open-in-place (e.g., my fairly empty 'Documents' directory has roughly 500 nested directories). The new implementation of RecursiveDirectoryMonitor has a shared instance which is able to share the monitoring across clients which does go some way to mitigating this, but right now it's still possible to crash the app if you add too big a directory, with no recovery mechanism. This was originally being tracked in #130 (where there's some discussion and ideas), but I thought it would be useful to pull the issues together to get a real sense of the monitoring situation.
  4. A specific and clear manifestation of the broad issues described in 1. occurs during startup; since the monitoring is set up asynchronously, there's currently no way for clients to known when monitoring has started so that they can scan the directory and guarantee that they won't miss any changes (spoiler: they can't anyway right now). It'd be great to tackle this as part of any solution to 1. This was being tracked independently as #149.
jbmorley commented 2 years ago

I've made a couple of changes to address some of these issues:

Note that the second change is also guaranteed to always notify newly added observers once observation has started, as observer notification has been moved into queue_updateSubDirectoryMonitors (called during initial setup). Once 1. is resolved by setting up recursive observers more systematically, we should be able to guarantee that no change is missed following a call to add an observer.

jbmorley commented 2 years ago

I've also added a 0.1s delay when scheduling queue_updateSubDirectoryMonitors (see b1255be0ad34addd53d0dd8bb2a65bcb76308901) to mitigate the worst effects of rapid file writes and directory changes.

jbmorley commented 2 years ago

Current status:

  1. 👎🏻 -- We still need a more systematic walk to set up and start the sub-directory DirectoryMonitor instances to ensure we don't miss changes.
  2. 👍🏻 -- The rate limiting in RecursiveDirectoryMonitor should be good enough to address these issues; there are still secondary observers that do work as a result of a change (e.g., Directory, and ProgramDetector) which might cause us problems and necessitate their own rate limits. In that case, it probably makes sense to pull in something similar to BRURateLimiter (see https://github.com/jbmorley/BromiumCoreUtils/blob/master/BromiumCoreUtils/BRURateLimiter.m)
  3. 👎🏻 -- Right now this can still result in a crash. At the very least we should fail gracefully if the number of open handles ever gets large. As to longer-term solutions, ... 🤷🏻‍♂️.
  4. 👍🏻 -- Pretty sure we're now good once 1. is fixed.
jbmorley commented 2 years ago

Is it necessary to add a rate limiter to ProgramDetector and Directory?

jbmorley commented 2 years ago

Graceful failures have been added in b781db2aceb655c6ef55b72d23f8732d8018c595, leaving just 1. to be completed to bring this directory monitoring approach up to about the best we can manage.

jbmorley commented 2 years ago

As of 5e29a464abb74c5e847fa2b93bb8676438ef001a, the remaining start-up race conditions should have been addressed (🤞🏻). This just leaves the question of whether we want to individually rate limit updates in ProgramDetector, Directory, and UbiquitousDownloader.

jbmorley commented 2 years ago

Closing as done. We've put in significant performance improvements and I think the observer now seems to be pretty robust.