Open lwouis opened 4 years ago
I just tested in a macOS 12.4 VM, and HS still detects windows on other Spaces after launch. There is also no visible flicker or visual glitch on launch.
@koekeishiya had previously found that HS uses the invisible windows trick to show windows of other Spaces, until the user navigates to these Spaces, then they have the AXref.
Does anyone know of an example of a window where the CG and AX APIs report a different title, for instance? I would like to confirm that the windows shown by HS on boot really come from the CG API, or whether they somehow use another trick we don't know about yet.
The fact that HS works on 12.4 proves that they don't use the add/remove private APIs, but it doesn't tell us with certainty how they show the windows.
I've been experimenting to try and replace CGSAddWindowsToSpaces and CGSRemoveWindowsFromSpaces. Here are some candidates I tried, with my notes for each:
API | Result | Notes |
---|---|---|
func CGSMoveWindowsToManagedSpace(_ cid: CGSConnectionID, _ windows: NSArray, _ space: CGSSpaceID) -> Void | It moves the window to another Space. However, it doesn't work on fullscreen windows. | |
func CGSShowSpaces(_ cid: CGSConnectionID, _ sids: NSArray) -> Void | Shows windows of the Spaces, but their AXrefs are not obtainable. The windows are only visually there. | |
func CGSManagedDisplaySetCurrentSpace(_ cid: CGSConnectionID, _ displayUuid: CFString, _ sid: CGSSpaceID) -> Void | Brings all windows of the Space onto the current display. It also brings fullscreen windows. However, it breaks the Spaces on that display as they get assigned to the current Space. They "stack" on Space 1, and there is no way to send them back to their original Space. | See more details |
func CGSProcessAssignToSpace(_ cid: CGSConnectionID, _ pid: pid_t, _ sid: CGSSpaceID) -> CGError | Bringing a process to a Space brings its windows there, so that's good. However, it doesn't bring fullscreen windows. | Also tried CGSProcessAssignToAllSpaces but same limitations. |
func CGSMoveWorkspaceWindowList(_ cid: CGSConnectionID, _ windowList: CFArray, _ windowCount: UInt, _ sid: CGSSpaceID) -> OSStatus | The call compiles and runs, but only returns OSStatus 1001, which is kCGErrorIllegalArgument. I called it with something like CGSMoveWorkspaceWindowList(cgsMainConnectionId, [60930], 1, Spaces.currentSpaceId), which seems to match other calls on GitHub, but it always returns 1001. | It seems it was used in Chromium, but I can't find usages in the latest Chromium |
func CGSSetWorkspace(_ cid: CGSConnectionID, _ sid: CGSSpaceID) -> OSStatus | Link error: Undefined symbol: _CGSSetWorkspace | I tried with CGSSetWorkspace, SLSSetWorkspace, or _CGSSetWorkspace, but none of them seem defined. Looking at the SDK symbols for macOS 10.10, I don't see it. I see only _CGSSetWorkspaceForWindow. |
func CGSSetWindowWorkspace(_ cid: CGSConnectionID, _ wid: CGWindowID, _ sid: CGSSpaceID) -> CGError | Link error: Undefined symbol: _CGSSetWindowWorkspace | It seems to also no longer exist in the SDK |
func CGSSetWorkspaceForWindow(_ cid: CGSConnectionID, _ wid: CGWindowID, _ sid: CGSSpaceID) -> CGError | Link error: Undefined symbol: _CGSSetWorkspaceForWindow | I also tried _CGSSetWorkspaceForWindow. It seems to also no longer exist in the SDK |
func CGSSpaceAddWindowsAndRemoveFromSpaces(_ cid: CGSConnectionID, _ sid: CGSSpaceID, _ wid: NSArray, _ notSure: Int) -> Void | Correctly moves windows to the given Space. Works with fullscreen windows. However, it messes with macOS internals, and after moving back a fullscreen window to its original Space, that Space will be fully black for instance. Furthermore, it doesn't work on macOS 12.2 | |
Potential things to look into:
- CGSAddWindowToWindowMovementGroup / CGSCopyWindowGroup / _CGSGetWorkspaceWindowGroup. Maybe after attaching to a window on the current Space, it becomes accessible for AXref? Then we would detach the window and have it back on its Space somehow?

I've looked again at decompiled HS, and it's clearer now that they use the CG API to list windows from other Spaces, until they can get the AXref later on. It's kind of proved by looking at how they handle closing and minimizing windows. Everything is in the OCWindow class:
Minimizing actually doesn't work if you have never visited the window's Space. The code reflects that:
/* @class OCWindow */
-(char)minimize {
    r14 = self;
    rax = [self axWindow];
    if (rax != 0x0) {
        rax = AXUIElementCopyAttributeValue(rax, @"AXMinimizeButton", &var_18);
        if (rax != 0x0) {
            if (*(int32_t *)dword_10017eecc > 0x0) {
                rbx = 0x0;
                NSLog(@"Couldn't find minimize button for: %@", r14);
            }
            else {
                rbx = 0x0;
            }
        }
        else {
            AXUIElementPerformAction(var_18, @"AXPress");
            CFRelease(var_18);
            *(int32_t *)(r14 + 0x14) = 0x0;
            rbx = 0x1;
        }
    }
    else {
        rbx = 0x0;
    }
    rax = rbx & 0xff;
    return rax;
}
Closing works, even without ever visiting the window's Space, because when no AX reference is available yet, they first call moveToCurrentSpace and then retry:
/* @class OCWindow */
-(char)close {
    r14 = self;
    rax = [self axWindow];
    rbx = rax;
    if (rax == 0x0) {
        [r14 moveToCurrentSpace];
        rax = [r14 axWindow];
        rbx = rax;
        if (rax != 0x0) {
            var_20 = 0x0;
            rax = AXUIElementCopyAttributeValue(rbx, @"AXCloseButton", &var_20);
            if (rax != 0x0) {
                if (*(int32_t *)dword_10017eecc >= 0x2) {
                    NSLog(@"Couldn't get close button for: %@!", r14);
                }
                if (AXUIElementPerformAction(rbx, @"AXRaise") != 0x0) {
                    rax = 0x0;
                }
                else {
                    var_28 = [r14 ownerPSN];
                    CGEventSetFlags(CGEventCreateKeyboardEvent(0x0, 0xd, 0x1), 0x100000);
                    CGEventPostToPSN(&var_28, rax);
                    CFRelease(rax);
                    CGEventSetFlags(CGEventCreateKeyboardEvent(0x0, 0xd, 0x0), 0x100000);
                    CGEventPostToPSN(&var_28, rax);
                    CFRelease(rax);
                    usleep(0x30d40);
                    rax = 0x1;
                }
            }
            else {
                AXUIElementSetMessagingTimeout(var_20, intrinsic_movss(xmm0, *(int32_t *)float_value_1));
                rbx = AXUIElementPerformAction(var_20, @"AXPress");
                CFRelease(var_20);
                rax = rbx == 0x0 ? 0x1 : 0x0;
            }
        }
        else {
            rax = 0x0;
        }
    }
    else {
        var_20 = 0x0;
        rax = AXUIElementCopyAttributeValue(rbx, @"AXCloseButton", &var_20);
        if (rax != 0x0) {
            if (*(int32_t *)dword_10017eecc >= 0x2) {
                NSLog(@"Couldn't get close button for: %@!", r14);
            }
            if (AXUIElementPerformAction(rbx, @"AXRaise") != 0x0) {
                rax = 0x0;
            }
            else {
                var_28 = [r14 ownerPSN];
                CGEventSetFlags(CGEventCreateKeyboardEvent(0x0, 0xd, 0x1), 0x100000);
                CGEventPostToPSN(&var_28, rax);
                CFRelease(rax);
                CGEventSetFlags(CGEventCreateKeyboardEvent(0x0, 0xd, 0x0), 0x100000);
                CGEventPostToPSN(&var_28, rax);
                CFRelease(rax);
                usleep(0x30d40);
                rax = 0x1;
            }
        }
        else {
            AXUIElementSetMessagingTimeout(var_20, intrinsic_movss(xmm0, *(int32_t *)float_value_1));
            rbx = AXUIElementPerformAction(var_20, @"AXPress");
            CFRelease(var_20);
            rax = rbx == 0x0 ? 0x1 : 0x0;
        }
    }
    rax = rax & 0xff;
    return rax;
}
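The two branches of the decompiled close method are identical apart from the initial moveToCurrentSpace call, so the decision logic can be restated compactly. The following is only a reading aid, not HyperSwitch's code: plan_close, close_action and the constant names are hypothetical, and the constants are simply the literals from the listing, decoded (key code 0x0d is the W key on ANSI layouts, 0x100000 is the Command modifier flag, so the fallback appears to post Cmd+W to the owning process).

```c
#include <assert.h>
#include <stdbool.h>

/* Literals taken from the decompiled listing, with decoded meaning. */
enum {
    KEYCODE_W       = 0x0d,     /* virtual key code 13: the W key */
    FLAG_COMMAND    = 0x100000, /* Command modifier bit given to CGEventSetFlags */
    POST_DELAY_USEC = 0x30d40   /* usleep() argument: 200000 us, i.e. 200 ms */
};

/* Hypothetical names for the three outcomes visible in the listing. */
typedef enum {
    CLOSE_FAIL,         /* method returns 0 */
    CLOSE_PRESS_BUTTON, /* AXPress on the AXCloseButton element */
    CLOSE_RAISE_CMD_W   /* AXRaise the window, then post Cmd+W key down/up */
} close_action;

/* has_ax_ref: an AX window reference exists, possibly only after the
 *             window was moved to the current Space.
 * has_close_button: copying the AXCloseButton attribute succeeded.
 * raise_ok: the AXRaise action succeeded. */
close_action plan_close(bool has_ax_ref, bool has_close_button, bool raise_ok) {
    if (!has_ax_ref) {
        return CLOSE_FAIL;
    }
    if (has_close_button) {
        /* In the listing, success additionally requires AXPress to return 0. */
        return CLOSE_PRESS_BUTTON;
    }
    return raise_ok ? CLOSE_RAISE_CMD_W : CLOSE_FAIL;
}
```

Read this way, the method never acts on a window without an AX reference; the CG-only record is just enough to know that the window exists and should be moved first.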
We also see functions meant to match AX windows with CGWindow references:
We can confirm that they compare AXref title with CGref title for instance:
/* @class OCWindow */
-(char)matchToAxWinByTitle:(struct __AXUIElement *)arg2 {
    r12 = arg2;
    r13 = self;
    rbx = [self cgTitle];
    rdx = [NSCharacterSet controlCharacterSet];
    rax = [rbx stringByTrimmingCharactersInSet:rdx];
    r14 = rax;
    if (rax == 0x0) {
        r14 = 0x0;
        if ([[r13 ownerName] isEqualToString:@"App Store"] != 0x0) {
            r14 = @"App Store";
        }
    }
    var_30 = 0x0;
    rdx = &var_30;
    AXUIElementCopyAttributeValue(r12, @"AXTitle", rdx);
    rbx = var_30;
    if (rbx != 0x0) {
        rbx = [rbx stringByTrimmingCharactersInSet:[NSCharacterSet controlCharacterSet]];
        [var_30 release];
        rax = [r14 isEqualToString:rbx];
        rcx = 0x1;
        if (rax == 0x0) {
            rcx = 0x0;
        }
    }
    else {
        rcx = 0x0;
    }
    rax = rcx & 0xff;
    return rax;
}
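To make the matching rule in the listing easier to reason about, here is a minimal, portable restatement in plain C. It is a sketch under assumptions: trim_control and titles_match are hypothetical names, titles are treated as byte strings capped at 255 bytes, and iscntrl only approximates NSCharacterSet's controlCharacterSet (which also covers non-ASCII control and format characters).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Copies src into dst (capacity cap > 0) without leading or trailing
 * control characters; the result is truncated if it does not fit. */
void trim_control(const char *src, char *dst, size_t cap) {
    size_t start = 0;
    size_t end = strlen(src);
    while (start < end && iscntrl((unsigned char)src[start])) start++;
    while (end > start && iscntrl((unsigned char)src[end - 1])) end--;
    size_t len = end - start;
    if (len >= cap) len = cap - 1;
    memcpy(dst, src + start, len);
    dst[len] = '\0';
}

/* Mirrors the listing: a missing CG title only matches when the owner is
 * "App Store"; a missing AX title never matches. */
bool titles_match(const char *cg_title, const char *owner_name,
                  const char *ax_title) {
    char cg[256];
    char ax[256];
    if (ax_title == NULL) return false;
    if (cg_title == NULL) {
        if (owner_name == NULL || strcmp(owner_name, "App Store") != 0) {
            return false;
        }
        cg_title = "App Store";
    }
    trim_control(cg_title, cg, sizeof cg);
    trim_control(ax_title, ax, sizeof ax);
    return strcmp(cg, ax) == 0;
}
```

A title-only match is ambiguous when one app has several windows with the same title, which is presumably why the binary contains more than one matching function.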
Also there is a function called createDummySpaceWindow which seems like how they would create the invisible windows used to switch Space.
Also interesting, HS has a function called isUsualUserWindow which filters out windows, similar to AltTab's isActualWindow. They check only 2 things:
They also have some hardcoded checks like owner is Dock, or Safari or Microsoft Office, etc. Lots of hardcoded cases.
I'm considering ditching the private API tricks, and doing like HyperSwitch: using CG API, and living with a dual-accounting.
I'm trying to compare the pros and cons, so listing here so I can refer to it later:
CGSSpaceGetType or the window bounds.

Also, I'm wondering if we could re-explore the AppleScript APIs. Maybe we could use it to do the actions or extract the info. I remember that the original POC for AltTab was using AppleScript to focus window. It's a shame I didn't use git for this POC.
Anyway, I'll probably explore that. There are many issues with AppleScript, such as: Permissions, no window ID (at least not CGWindowID), performance, maybe other limitations, etc.
Need to explore the various ways to interact with AS: osascript, NSUserScriptTask, NSUserAppleScriptTask, NSAppleScript, OSAKit, etc.
Update: never mind, I confirmed that AppleScript can't interact with windows on other Spaces. It can see them and get their info, but it can't send them commands.
# this shows data of the window on another Space
tell application "Finder"
    get properties of first window
end tell
# this fails to show data of the window on another Space
# but we need to go through "System Events" to call "AXRaise"
tell application "System Events" to tell process "Finder"
    get properties of first window
    perform action "AXRaise" of first window
end tell
> I'm considering ditching the private API tricks, and doing like HyperSwitch: using CG API, and living with a dual-accounting.

If we go that way, maybe PR #1484 has some useful code.
@koekeishiya I just found https://github.com/tonyarnold/virtuedesktops. It uses a Dock extension. It's pretty old code / APIs, so I'm thinking it may use old approaches that we wouldn't think of these days.
I reviewed other apps that deal with windows to see if they handle other Spaces. They all fail: Witch, Contexts, WindowSwitcher, OptimalLayout, uBar.
Only HyperSwitch handles it. And AltTab, until macOS 12.2.
> I reviewed other apps that deal with windows to see if they handle other Spaces. They all fail: Witch, Contexts, WindowSwitcher, OptimalLayout, uBar.

I have used Contexts. It can switch to a window on another Space, or quit an app on another Space with cmd+q. But it can't close a window or perform other window actions. I think the most frequently used action is switching windows, so I think that's acceptable.
> I used Contexts, it can switch to other window on other space

Keep in mind the ticket we are in. Try to first open a window on another Space. Then come back to the main Space, and open Contexts. Now notice how it is not aware of that window until you visit that Space. Of course, if you open the window after Contexts, it will know about it. The issue is when the window existed before.
Yes, I know. I specifically opened a window on another Space first, and Contexts really does show that window in the window list, and I can switch to it. I'm on macOS 12.4.
I just tried again, and it doesn't work for me. Latest Contexts v3.8.1, on macOS 10.15.
Could you please record a video on your machine? I'm still questioning that it would work, since it clearly doesn't work for me.
@lwouis
https://user-images.githubusercontent.com/3339872/171548100-b8d7d825-7585-4632-a7c5-2079abf227a4.mp4
I decompiled Contexts a few weeks ago, and it also uses invisible windows (called helperWindow) to implement the switch-window feature.
I wonder if you had quit Contexts properly before you recorded the video. Because Contexts has no visible UI to show that it's running. You have to quit it in its preferences before running the experiment.
Look at what happens on my machine:
https://user-images.githubusercontent.com/106195/171554507-8cd71909-38cd-4775-8ba2-5f14b301740e.mp4
Notice how:
This confirms that until the user visits the Space, they don't have windows data. And when you focus "Finder", they activate the Finder app, which shows one of its window.
I did close Contexts.
https://user-images.githubusercontent.com/3339872/171563624-f336dfd4-32bc-4bb5-9a1a-da927c606e81.mp4
Very strange, it does work here. My current configuration looks like this:
But I found that it sometimes takes a long time to switch to a fullscreen Space, or fails (only sometimes). For non-fullscreen windows, it works fine.
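In your example, you use 2 windows which don't have their own name. So I still think Contexts is just showing you that your apps were open.
Could you try exactly the same use-case as in my video?
- Open a Finder window with Desktop
- Send it to Space 2
- Open a Finder window with Applications
- Send it to Space 3
- From Space 1, open Contexts
- Press command+tab

Does it list "Finder", or does it list 2 windows "Desktop" and "Applications"?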
> In your example, you use 2 windows which don't have their own name. So I still think Contexts is just showing you that your apps were open.
> Could you try exactly the same use-case as in my video?
> - Open a Finder window with Desktop
> - Send it to Space 2
> - Open a Finder window with Applications
> - Send it to Space 3
> - From Space 1, open Contexts
> - Press command+tab
>
> Does it list "Finder", or does it list 2 windows "Desktop" and "Applications"?
With your use-case, my test result is the same as yours.
Recap of the saga so far:
CGSAddWindowsToSpaces API which AltTab needs to access other-Space windows

I'm running out of ideas or areas to explore. Any help would be very welcome.
There's another app in the ecosystem - TotalSpaces - that seems to get window data. Or, at least, it's able to do window drags and window screenshots from a fresh start.
I'll see if I can figure out how it's working...
@jkelleyrtp it seems TotalSpaces requires the user to disable SIP: https://totalspaces.binaryage.com/installing-mojave
So maybe they inject the Dock like yabai. Not a solution that will work for AltTab casual userbase, unfortunately.
Not with TotalSpaces3 - I am actually working on it and there is no SIP disable required. My screenshot is from the TS3 Beta released on the binaryAge forums.
I agree, for the purposes of this software it is not an ideal solution.
My naive idea would instead be to focus every space once during startup, calling the AX API to retrieve refs -- once for each space, and then re-focus the original space again. I think this should work fine, but there will be a short span of visual flicker during first launch. Not sure how acceptable that is, but it should be able to detect all windows.
You'd need a combination of the following APIs to do this:
extern void CGSManagedDisplaySetCurrentSpace(int cid, CFStringRef display_ref, uint64_t spid);
extern uint64_t CGSManagedDisplayGetCurrentSpace(int cid, CFStringRef display_ref);
extern CFArrayRef CGSCopyManagedDisplaySpaces(const int cid);
extern CFStringRef CGSCopyManagedDisplayForSpace(const int cid, uint64_t spid);
extern void CGSShowSpaces(int cid, CFArrayRef spaces);
extern void CGSHideSpaces(int cid, CFArrayRef spaces);
This process would likely have to happen for each connected monitor:
// https://developer.apple.com/documentation/coregraphics/1454603-cggetactivedisplaylist
CGGetActiveDisplayList(display_count, result, count);

// convert a CGDisplayID to a CFStringRef (UUID) used by the above spaces API
CFStringRef display_uuid(uint32_t did)
{
    CFUUIDRef uuid_ref = CGDisplayCreateUUIDFromDisplayID(did);
    if (!uuid_ref) return NULL;
    CFStringRef uuid_str = CFUUIDCreateString(NULL, uuid_ref);
    CFRelease(uuid_ref);
    return uuid_str;
}
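The per-display loop described above (focus each Space once, collect AX references, then re-focus the original Space) can be sketched independently of the private APIs. This is a minimal sketch, not tested against the WindowServer: visit_spaces_and_restore, switch_space_fn, visit_log and record_visit are hypothetical names, and the callback stands in for a call such as CGSManagedDisplaySetCurrentSpace followed by AX collection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Callback that makes space_id the current Space of the display and
 * collects whatever is needed while it is active (hypothetical). */
typedef void (*switch_space_fn)(uint64_t space_id, void *ctx);

/* Visits every Space of one display except the current one, then switches
 * back to the Space that was active before. Returns the number of switches. */
size_t visit_spaces_and_restore(const uint64_t *spaces, size_t count,
                                uint64_t current,
                                switch_space_fn switch_to, void *ctx) {
    size_t switches = 0;
    for (size_t i = 0; i < count; i++) {
        if (spaces[i] == current) continue; /* already visible, nothing to do */
        switch_to(spaces[i], ctx);
        switches++;
    }
    if (switches > 0) {
        switch_to(current, ctx); /* restore what the user was looking at */
        switches++;
    }
    return switches;
}

/* Recording stand-in for the real switch, useful for dry runs. */
typedef struct {
    uint64_t visited[16];
    size_t n;
} visit_log;

void record_visit(uint64_t space_id, void *ctx) {
    visit_log *rec = ctx;
    if (rec->n < 16) rec->visited[rec->n++] = space_id;
}
```

The return value gives a rough idea of the cost: every switch is a potentially visible transition, so a display with N Spaces needs N switches before the user is back where they started.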
@lwouis I think the solution @koekeishiya mentioned is feasible. You also mentioned some of the problems you encountered with this method. But I think that if those CGS Space APIs cause problems, we can avoid them by using the invisible window trick. Here is my solution:
I think using that animation like a loading window can help us to avoid the animation when switching spaces. For users, it's a better UX.
@metacodes it's hard for me to understand what you describe. I can't run TotalSpaces3 because it's Apple Silicon only, as far as I can tell, and I don't have an AS machine. Maybe you could record some of the flows you mention, to share with us what this app does, and how we could maybe copy some of their techniques?
@jkelleyrtp @metacodes FYI, I sent an email to Stephen (his email was on the top left on this blog post). I asked him if he would be willing to share the technique he's using with TS3. I hope he's willing to share his knowledge~
@lwouis You can use Keynote.app to understand the fullscreen animation I mentioned. When you play a keynote with fullscreen, you can not switch space by your trackpad or others. The Keynote.app's fullscreen mode is different from other apps. Maybe this screen remains unchanged if we show an animation with this fullscreen mode when we switch the space in the background.
@metacodes yes Keynote creates a window that takes all the screen space, but that's not a native fullscreen. Other apps do that like firefox (see #558), some video players, games, etc.
Ok so what you're suggesting is:
On launch, AltTab essentially obscures the user's screen, like putting curtains in front of the screen. While the screen is hidden, AltTab manually switches to all Spaces one by one, to capture AXrefs. Then, AltTab stops obscuring the screen.
I think it's overall a bad UX to make the computer unusable for a little while. I see a lot of issues with this:
> @jkelleyrtp @metacodes FYI, I sent an email to Stephen (his email was on the top left on this blog post). I asked him if he would be willing to share the technique he's using with TS3. I hope he's willing to share his knowledge~

He replied and they are using CG APIs + SLSMoveWindowsToManagedSpace, so they simply don't support the fullscreen windows scenarios I guess. I wish I could play around with TS3 but I don't have an AS mac.
Today I did more testing on how fullscreen windows actually work. We support closing, minimizing, de-fullscreening them, from another Space. Stuff that macOS won't let you do otherwise. I realized that it creates weird artifacts:
Scenario | Behavior |
---|---|
Close a fullscreen window from another Space | The window quickly flashes on the current Space, then is closed |
Minimize a fullscreen window from another Space | The window quickly flashes on the current Space, then it actually works, surprisingly, even though macOS disables the yellow "minimize" button if you go on that Space to minimize with the mouse |
De-fullscreen a fullscreen window from another Space | The window quickly flashes on the current Space, then is nowhere to be seen. Its Space is destroyed. You can still get the window back by right-clicking on its app's Dock icon, then selecting that window. It's still open, just not accessible on any Space directly. That behavior is pretty bad UX |
Hide (an app with) a fullscreen window from another Space | Nothing happens for that window. Non-fullscreen windows of that app are hidden. This one is weird even with native macOS UI interactions. Fullscreen windows don't get hidden when you hide an app. |
A note on close and minimize: we first set kAXFullscreenAttribute to false, and only then we send the close/minimize event. For the minimize event, we wait 1s (hardcoded duration :/), because otherwise macOS ignores the command to minimize, as it's still doing the de-fullscreen animation.
Conclusion: I think that dealing with fullscreen windows in general is a broken experience on macOS. Same with AltTab. Maybe we could just give up on fullscreen windows, and always bring the user to their Space before doing any action on them. That way we always get the AXref before acting. Then for non-fullscreen windows, we can use SLSMoveWindowsToManagedSpace to bring them to the current Space before sending a command, to get the AXref. Alternatively, we could bring them to the current Space in advance, when they are spawned, so that later we don't need to.
The advantage of bringing the windows early is that then we have the AXref to show title, remove non-windows, and do commands from the current Space. The downside is that it flashes those windows for the user (maybe there is a way to hide them temporarily?). And vice-versa for the other approach.
I found this function in SkyLight: _SLSPackagesAssignDraggedWindowToDestinationSpace(int arg0, int arg1, int arg2, int arg3, int arg4, int arg5). It seems to be still available on Catalina.
@koekeishiya @jkelleyrtp @metacodes Do you know the complete signature of that API?
A slight deviation: I see Apple released a new kit this year in WWDC, which might be helpful to detect windows. The kit is ScreenCaptureKit. This is kind of a misuse, but one of its APIs, SCShareableContent, seems perfect for grabbing all information of all windows of all displays.
(I guess) All we need is:
I'm not sure if this works because I didn't try it and I'm not a Swift developer, so this is just a suggestion. I will try this API and write a minimal demo when I'm not busy. If anyone wants to try out, go ahead.
@ifsheldon I'm afraid ScreenCaptureKit is barely wrapping the existing CG/CGS APIs. The data it returns for windows is quite limited: https://developer.apple.com/documentation/screencapturekit/scwindow
I think it could perhaps be used for #122, where I also suggested it. But it would not solve the issues discussed in this ticket here, as this new API doesn't provide us with the Accessibility window reference we need to focus/miniaturize/close windows. I also expect it would return the same windows as CGWindowListCopyWindowInfo. Notice how similar the parameters are to SCShareableContent.getExcludingDesktopWindows.
Perused this thread and have some thoughts, some of which might be worthwhile. I have no experience with the accessibility APIs discussed here though, so keep that in mind.
1: If I follow the thread correctly, it's assumed that AXUIElement can be passed from process to process (e.g., from a daemon to the main app). I'm not sure if this is true, so if it is true, assume for the below points we're passing AXUIElement (or some intermediate type that AXUIElement can be reconstructed from), and if not true, then assume we're passing a dictionary of window related values.
2: I think a login item / helper app could work well here at least as a partial solution. While it doesn't launch before login as you wanted with a daemon, it can be launched on login, run in the background windowless with no menu bar or dock item and can survive the termination and relaunch of the main app. Therefore even without launching before login, it's better positioned to have more of the data you need than the main app would be. It also inherits the permissions granted to the main app which can be helpful. Also, in another thread, it was mentioned that the AX calls are blocking and that this can be a problem. If the AX calls are made in a separate process, then this likely isn't an issue.
Good tutorial on login items: https://martiancraft.com/blog/2015/01/login-items/ Btw, https://developer.apple.com/documentation/servicemanagement/smappservice looks interesting.
3: The approach I'd suggest here that's a bit simpler than using XPC is to setup a user defaults suite that is shared b/w the helper and the main app. The helper app would store which spaces and windows it has encountered in the shared user defaults and the main app would use that as a backup for any spaces it has not yet encountered. Again, I'm not sure if this would be storing a representation that allows for reconstruction of an AXUIElement or just a dictionary of basic window information. Additionally you could persist screenshots to disk and store URLs to those screenshots in the user defaults.
The main app would check to get a list of the current spaces, and check its own memory to see if it has the window/AXUIElement information needed for these spaces and if not check the shared user defaults for information for any spaces it hasn't yet navigated to. In many/most cases, either the main app or helper app would have the information for all spaces on screen, in which case all is good.
In cases where the shared user defaults doesn't already have all information needed from all spaces, then trigger existing solution for getting that data for only the spaces with that missing data. Whatever difficulties exist with that solution, at least this approach should minimize the occurrence of needing to use it.
@brettstover AltTab already launches at login, so it can monitor things in the same capacity as the alternative solution you describe. It's already multi-threaded to avoid blocking. The only downtime is during upgrades, where it restarts and loses context. But having a background service wouldn't solve that, since that service would need to restart on upgrade as well. So it's more a question of serializing the state on disk either way. And we don't do that today because there could be differences before/after, and we don't control when AltTab is back. It could be minutes, and windows could be shuffled in between.
Hello, all. I'm new to this conversation, so if I'm saying something which has already been suggested, please feel free to let me know. If you're not opposed to continuing use of the private CoreGraphics framework, I've found a trick that works extremely well:
macOS continuously registers a new keyboard accelerator each time a new desktop/space is created. The accelerator isn't activated unless previously enabled by the user in System Preferences. However, using the function CGSSetSymbolicHotkeyEnabled(int, BOOL), you can activate the accelerator programmatically. What's more, since you're calling a method and not writing to a PLIST store, the change is instantly registered by the CoreGraphics keyboard events listener.
Determining what space you want for which window can be done by first querying CGSCopySpacesForWindows(CGSConnectionID id, int spacesMask, CFArrayRef windows). This will give you the CGSSpaceID for a CGWindowID.
To resolve the CGSSpaceID into the space's ordered index or human-readable desktop number, you can use CGSCopyManagedDisplaySpaces(CGSConnectionID id) to get an ordered list of all spaces. Flat-map the spaces from each CGDisplay entity, and then find the index of your CGSSpaceID from the previous step. You now have the zero-based desktop number of the target space; add 1 to turn this into the human-readable desktop number.
Finding the correct hotkey to focus the target space can be done by reading in the com.apple.symbolichotkeys PLIST. Switching for numbered desktops begins with desktop 1 at index 118. Add the zero-based desktop number to 118 to determine which symbolic hotkey you'll need to enable, and then call CGSSetSymbolicHotkeyEnabled(int, BOOL) to engage the listener, using the PLIST-resolved index as the first arg and YES as the second.
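The index arithmetic from the last two paragraphs is small enough to sketch in portable C. The function names desktop_index and hotkey_for_space are hypothetical, and the flattened array stands in for the Space IDs extracted from the result of CGSCopyManagedDisplaySpaces. The sketch assumes the list contains only numbered desktops; if other kinds of Spaces (such as fullscreen ones) appear in it, they would presumably need to be filtered out first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Symbolic hotkey index of "switch to desktop 1", as stated in the thread. */
enum { FIRST_DESKTOP_HOTKEY = 118 };

/* Returns the zero-based position of sid in the flattened, ordered list of
 * Space IDs, or -1 if it is not present. */
int desktop_index(const uint64_t *flat_spaces, size_t count, uint64_t sid) {
    for (size_t i = 0; i < count; i++) {
        if (flat_spaces[i] == sid) return (int)i;
    }
    return -1;
}

/* Returns the symbolic hotkey index to enable for the given Space, or -1
 * if the Space is unknown. */
int hotkey_for_space(const uint64_t *flat_spaces, size_t count, uint64_t sid) {
    int index = desktop_index(flat_spaces, count, sid);
    return index < 0 ? -1 : FIRST_DESKTOP_HOTKEY + index;
}
```

With three desktops whose Space IDs are 4, 11 and 25, the window on Space 25 is on human-readable desktop 3 and maps to hotkey index 120.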
The parameters value of each entity in com.apple.symbolichotkeys.plist is structured as follows:
(
    {{ ascii_value_of_keyboard_glyph_if_applicable }},
    {{ osascript_key_code_of_keyboard_key }},
    {{ bitwise_nxkeymask_of_modifier_keys }}
)
Resolve the NXKeyMask value into the modifier keys, and then use either System Events or CGEventPost to dispatch the keyboard event. The space will snap into focus, and you can now use AXUIElementCreateApplication to make your desired window frontmost, after which you can activate the target process using [[NSRunningApplication runningApplicationWithProcessIdentifier: {{ PID }}] activateWithOptions: 2].
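Resolving the modifier mask is a matter of testing individual bits. The sketch below uses the standard device-independent modifier bits (the same values as the NX_*MASK and CGEventFlags constants); the type and function names are hypothetical, and any other bits that may be present in the stored value are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Device-independent modifier bits. */
enum {
    MASK_SHIFT   = 0x020000,
    MASK_CONTROL = 0x040000,
    MASK_OPTION  = 0x080000,
    MASK_COMMAND = 0x100000
};

typedef struct {
    bool shift;
    bool control;
    bool option;
    bool command;
} modifiers;

/* Splits the third "parameters" entry into individual modifier keys. */
modifiers decode_modifiers(uint32_t mask) {
    modifiers m;
    m.shift   = (mask & MASK_SHIFT) != 0;
    m.control = (mask & MASK_CONTROL) != 0;
    m.option  = (mask & MASK_OPTION) != 0;
    m.command = (mask & MASK_COMMAND) != 0;
    return m;
}
```

For instance, a stored mask of 262144 (0x40000) decodes to Control alone, so the event to post would be Control plus the key code from the second parameter.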
This solution is working extremely well for me. I'm not an Objective-C programmer, and I pieced this together as best I could through a lot of trial and error, so there may be places where efficiency can improve. As stated before, if I'm missing something, please feel free to point it out to me. I can post a working copy if you'd like to see some code.
@stephancasas this is interesting information. Thank you for sharing.
As I understand, this would allow us to switch to specific Spaces. We can do this already, in a simpler way actually. We have a strategy where we spawn invisible windows in every Space. We can then focus them to force macOS to focus that Space. More info in the last bullet point of this recap.
The issue with switching Space is that it's visible for the user. It disturbs their work when we start going Space by Space to visit.
How fast is your method at visiting Space? Could we call it like 10 times in a row for 10 Spaces, really quickly, so the user sees only a "flash" on-screen?
@lwouis I may have misunderstood the initial issue. Is the aim to find a different way of navigating to a space once a window is selected, or to find a different way of getting thumbnails for windows which are on other spaces?
What I've described would only be useful in the former, not the latter.
@stephancasas this ticket is about the following topic.
When windows are on other Spaces than the active Space, we can't get their AXref, which is the technical structure that lets us do many things with them (e.g. focus them, minimize them, get their title, get their screenshot, etc).
What we were doing before Monterey was to use a private API to instantly teleport all windows on the active Space. Then we would grab their AXref, then we would teleport them back in their original Spaces. From the user perspective, they would open AltTab, and see quick flash on screen, sometimes barely noticeable, then AltTab would list all windows nicely.
The API which teleports windows is broken from Monterey onwards. This ticket investigates alternatives.
We could ask the user to visit all Spaces manually, or we could visit them automatically on launch, but all these solutions make for a bad UX. We are looking for something more invisible to the user, that would let us grab the AXref somehow.
@lwouis @koekeishiya @metacodes Maybe instead of just focusing on already known private APIs we could analyse the code stack of the macOS dock and see if there are any private APIs that are not discovered yet.
Found these two articles about how to reverse engineer macOS APIs:
The discussion above did not focus on known APIs; it included looking at basically every symbol exported by the SkyLight.framework, which is the interface to the WindowServer.
I don't remember exactly every detail that was attempted in this discussion, but the core of the issue is:
To focus a window, you need a reference through the AX API. This is the only way to focus a specific window on macOS (unless you inject code into the Dock, which requires SIP to be disabled).
To get an AX reference for a window, that window must be on a currently visible space.
The workaround in alttab that worked for older versions of macOS was to detect windows using private APIs and move them to the currently active space, so that they would be eligible for usage through the AX API.
> The discussion above did not focus on known APIs; it included looking at basically every symbol exported by the SkyLight.framework

I was not talking about known APIs!
> To focus a window, you need a reference through the AX API. This is the only way to focus a specific window on macOS (unless you inject code into the Dock, which requires SIP to be disabled).

Yes, I know. You inject code into the Dock app (e.g. in window_manager_focus_window_without_raise, right?) if you don't have the axuiref; in Swift it's an object of AXUIElement.
I think the Dock App uses an API that we can expose where you use the injection.
@koekeishiya Did you take a look at the assembly code of the Dock App to find out which memory addresses to use? If yes, did you not find any APIs used that we can expose?
> Did you take a look at the assembly code of the Dock App to find out which memory addresses to use? If yes, did you not find any APIs used that we can expose?
Yes I did, and no there is no API that does what alttab needs, that work on the newest version of macOS.
--
window_manager_focus_window_without_raise is not code injection; it simply sends bytes to a specific application based on an event protocol that I figured out by instrumenting code using Frida.re. This alone is not enough to fully focus a window, and must be used in combination with the AX API. It is used to work around a bug that makes the AX API not focus the correct window in a multi-monitor setup.
I am not going to go into details here, but basically every GUI application on macOS registers itself with the Dock (this is part of Carbon/Cocoa), setting up an event handler and a mach port for communication. The Dock runs the server part, and applications connect and give the Dock communication rights. The Dock then uses this mach port to signal an application (using the process serial number and window id) to make a specific window the key-window (focused window). You can hook into this part, but as I said it requires injecting code into the Dock's process space, which requires SIP to be disabled. I hooked this function for use in yabai many years ago.
> Yes I did, and no there is no API that does what alttab needs
> I am not going to go into details here ...

Well, that's a lot of detail already, thanks.
And I suppose we cannot "expose" the Dock app's source code functions, because this is only possible for shared libraries, right?
Anyway, I want to analyze the Dock app myself, therefore I have to disable SIP as well, I suppose.
Is your feature suggestion related to a problem? Please describe.
When AltTab starts, there is a flash-of-content as windows from other Spaces are temporarily brought in the current space through a private API. This is needed to be able to focus them later. However, it is janky as it confuses the user with the flashing, and is also limited in power as it has a 1s budget to try and grab the windows, after which windows which were not grabbed will not be known to AltTab.
Describe the solution you'd like
HyperSwitch is able to focus windows from other Spaces after starting. It does not flash content doing so, so they must have a better way.