Fix for inconsistent detection of apps when Spotlight index is not ready

mreid-tt commented 1 year ago

This PR addresses the bug report https://github.com/will-stone/browserosaurus/issues/572. It includes the following code changes from v19.3.3:

Add logic to compare app paths from 'find' with 'mdfind' to verify results
Add Node.js File System Module to verify if the user has an Applications folder
Replace recursion logic for getInstalledAppIds() with loop with known end state

will-stone commented 1 year ago

Hey. Thanks for your perseverance but unfortunately using find doesn’t work as there can be folders in the Applications folder, and increasing the maxdepth just gives false positives as it detects apps within apps.

mreid-tt commented 1 year ago

I now understand the issue you raise. You are correct in that I didn't consider apps being present in sub-folders. There is an easy fix... rather than restricting the search using -maxdepth we can instead -prune the search by not exploring folders which have the same naming pattern as apps. The PR has been updated accordingly.

will-stone commented 1 year ago

Oooo nice one. I’ll take a look when I get a chance (lots going on at the mo).

mreid-tt commented 1 year ago

I also did a minor refactor to the mdfind query. Based on previous tests (https://github.com/will-stone/browserosaurus/issues/572#issuecomment-1323345666), using "kMDItemKind == '*'" is more complete than using "kMDItemKind == 'Application'". I suspect because this specific attribute may not be in a first-order index. However looking at the man page it would seem that the spotlight keyword kind:app is a common query and thus should be in a first-order index.

Preliminary testing confirms that within one minute of boot, the results of "kMDItemKind == '*'" and kind:app are identical with the exception of non-apps (like the 'Utility' folder). As such, I've refactored the mdfind query and removed the exclusion for 'null' in the string parsing lower down.

will-stone commented 1 year ago

On my work machine, find finds /Applications/Zscaler/.Updater/autoupdate-osx.app whereas mdfind does not, meaning the count is out by one. Here's the commands used:

mdfind -onlyin /Applications -onlyin ~/Applications kind:app

find /Applications -iname "*.app" -prune -o -iname "*.app"

will-stone commented 1 year ago

If we could get find to yield bundle IDs then we wouldn't need mdfind/mdls?

mreid-tt commented 1 year ago

On my work machine, find finds /Applications/Zscaler/.Updater/autoupdate-osx.app whereas mdfind does not, meaning the count is out by one.

Thanks for the feedback. I've done some research and it seems that mdfind will not include results in hidden folders. As such, we can probably do the same with find. Please test the following:

find /Applications -iname "*.app" -prune -not -path "*/.*"

It seems to work for me but let me know so I can update the PR. As for bundle IDs, I doubt that find would be able to print this type of metadata. The bundle ID seems to be stored in an XML file within the app at *.app/Contents/Info.plist. I suppose we could write a longer command to take the apps detected and look for that key but that's a bit over my skill level at the moment.

will-stone commented 1 year ago

That yielded the same result 👍

mreid-tt commented 1 year ago

I'm thinking that there may be some esoteric results from find which we may not be able to think of. As such I am considering a way to make the scan function fail-safe. That is, it should use the apps it has by the last iteration of the loop and perhaps print something to the console. I'm going to push something to the PR and you can take a look to see if you like it.

will-stone commented 1 year ago

I'm thinking that find is probably the safest way to get installed apps. Therefore, it's the opening logic that can be changed. Instead of opening using the bundle ID, we can open using app name. e.g.

open -a Firefox.app "http://wstone.uk"

I'm going to explore that and converting the store.json to use app names. If you foresee there being an issue there, please let me know 🙂

mreid-tt commented 1 year ago

I've refined the fail-safe to the scan function to include a timer backoff based on the system uptime. This is because I expect the Spotlight index to be in a more consistent state the longer it has been since boot. At the same time I want to minimise the loop limit so that we can quickly treat with those esoteric situations from find I mentioned earlier (without the user waiting a long time for a re-scan). As for your consideration:

I'm thinking that find is probably the safest way to get installed apps. Therefore, it's the opening logic that can be changed. Instead of opening using the bundle ID, we can open using app name. I'm going to explore that and converting the store.json to use app names. If you foresee there being an issue there, please let me know 🙂

My understanding is that bundle IDs are far more stable than app names. For example I could rename my Firefox.app to Browser.app just fine but the bundle ID would remain the same. As such, I'm not sure on what basis your matching list would work.

will-stone commented 1 year ago

It’s all a trade off. A quicker, more stable app vs. satisfying others’ edge cases where they’ve renamed apps. I don’t rename apps, so I don’t need to support it 👍 Same as the million requests to support Chrome profiles: I don’t use Chrome so don’t need to support it.

If my endeavour pans out, I may look at adding a configuration option that allows people to tell B which app names it looks for: thus allowing B to support an infinite amount of apps without needing a release.

mreid-tt commented 1 year ago

Fair enough... In the meantime, would you consider releasing this PR as v9.3.4? I believe it would reliably address the bug that it was created to resolve. Also, I'd really like to see it closed so I stop continually obsessing about code revisions (serves me right for setting up my iMac to do dev work).

will-stone commented 1 year ago

If you can refactor it so that you don’t include any eslint disable then I’d be happy to take a look but it depends if I do my refactor first ☺️

mreid-tt commented 1 year ago

If you can refactor it so that you don’t include any eslint disable then I’d be happy to take a look but it depends if I do my refactor first ☺️

Refactor complete... awaiting your review.

EDIT: I don't fully agree with strictly removing all lines of eslint-disable-next-line no-await-in-loop. According to the documentation, there are legitimate reasons:

In many cases the iterations of a loop are not actually independent of each-other. For example, [...] loops may be used to retry asynchronous operations that were unsuccessful. [...] In such cases it makes sense to use await within a loop and it is recommended to disable the rule via a standard ESLint disable comment.

If you would consider this scenario, I would much rather merge the previous commit.

will-stone commented 1 year ago

I haven’t fully gone through the code but you probably want something like Promise.all. I can take a look when I have time but it won’t be until next week now. Thanks for all the contributions so far.

mreid-tt commented 1 year ago

No need to respond now but I've done some additional research. Based on what I've found, there does not seem to be a way to refactor the loop to avoid the no-await-in-loop rule. Here's why:

The documentation for the no-await-in-loop advises running all the iterations in parallel and waiting for their collective Promise.all() in an array. This will not work for the loop in question because the sleep() function must wait to be completed before we trigger another loop. Otherwise all the parallel functions will find the Spotlight index in a similar state
Another solution presented was to execute all iterations in parallel with different delays (via setTimeout() timers). While it is likely that over the range of timer values the Spotlight index will be in a consistent state, the problem is that multiple threads will likely resolve successfully. This would result in multiple calls to dispatch(retrievedInstalledApps(installedApps)) and from what I've seen, the scan never completes

Now there are many good reasons why you should not have asynchronous calls in a loop (as noted in this article), but for our use case, I believe it is necessary if we want to have a predictable end state (controlling the number of iterations).

The alternate approach which I pushed in the last commit basically removes all the asynchronous functions and make them all synchronous. Because the same rules require each async function have an await statement you either have to do async all the way down or use some dummy await calls to avoid triggering the require-await rule. The issue with this approach is that with all the calls synchronous, some tests just after boot results in an unhandled exception elsewhere in the code. I get the following:

[903:1126/165722.744731:ERROR:network_service_instance_impl.cc(499)] Network service crashed, restarting service.
(node:903) UnhandledPromiseRejectionWarning: Error: ERR_FAILED (-2) loading 'http://localhost:3000/prefs_window'
    at rejectAndCleanup (node:electron/js2c/browser_init:165:7510)
    at EventEmitter.stopLoadingListener (node:electron/js2c/browser_init:165:7885)
    at EventEmitter.emit (node:events:539:35)

As such, I am pushing a reversion of the refactor (but including some of the loop optimisations). When you get a chance you can have a look and give your opinion on the above.

will-stone commented 1 year ago

Yeah await in for loop pretty much doesn’t do anything: the loop will keep going to next iteration whilst the async task happens. It’s like running Promise.all.

will-stone commented 1 year ago

As my refactor is going to touch a lot of code, I suggest just running package for your architecture and then dragging the output app to your Applications folder. If I feel that my method won’t cut it, I’ll revisit your PR. Please don’t take any offence to this, your ideas have helped me find better methods and speed up Browserosaurus. Thank you.

mreid-tt commented 1 year ago

Understood, it was a pleasure going on this learning journey with you. Thanks for all your guidance and after a bit of digging I was able to build the app using these instructions. I did have to do some fixes with npm audit fix as the initial build did not want to show the main screen once the app was running. I've configured the app in my launch items and will be using it for my day to day.

Happy hunting with your refactor and do feel free to reach out if you need any testing done.

will-stone commented 1 year ago

Thanks @mreid-tt 😊 I’ll PR it when it’s ready so you can give it a go

will-stone / browserosaurus

Fix for inconsistent detection of apps when Spotlight index is not ready #590