lwouis / alt-tab-macos

Windows alt-tab on macOS
https://alt-tab-macos.netlify.app
GNU General Public License v3.0
11.16k stars 333 forks

Display video of the windows instead of static screenshots #122

Open lwouis opened 4 years ago

lwouis commented 4 years ago

Since v2, we have been using the CGSHWCaptureWindowList API to capture window images. I tried refreshing them on a timer at the screen refresh rate to simulate video. My POC worked, but performance was an issue: beyond 30 fps it becomes laggy, and 60 fps is not achieved.
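To put the 30 vs 60 fps numbers in perspective, here is a minimal sketch of the frame-budget arithmetic. The function names, the window count, and the per-capture cost are hypothetical and for illustration only; they are not measurements from the POC.

```c
#include <stdbool.h>

/* Time available to produce one frame at a given refresh rate, in milliseconds. */
double frame_budget_ms(double fps)
{
    return 1000.0 / fps;
}

/* True if refreshing `window_count` thumbnails, each costing `capture_ms`
 * (a hypothetical number you would have to measure yourself), fits inside
 * a single frame at `fps`. Assumes the captures run one after another. */
bool captures_fit_in_frame(int window_count, double capture_ms, double fps)
{
    return window_count * capture_ms <= frame_budget_ms(fps);
}
```

For example, with 10 windows at a hypothetical 3 ms per capture, the 30 ms total fits the roughly 33.3 ms budget at 30 fps but not the roughly 16.7 ms budget at 60 fps.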

I noticed the introduction of a new private API in 10.15: SLSHWCaptureStreamCreateWithWindow. The name seems to imply that it may give us a video stream. The HW in the name (for HardWare) also suggests that it would be a high-performance stream straight from the GPU, before compositing.

@koekeishiya helped me guess the signature in an email exchange:

From what I can tell the function you are interested in has the following signature:

CGDisplayStreamRef SLSHWCaptureStreamCreateWithWindow(uint32_t cid, uint32_t wid, CFDictionaryRef display_stream_options, dispatch_queue_t queue)

(CGDisplayStreamRef may not be the correct return type, but I assume it might be, as the return value is the result of something called SLDisplayStreamCreate, which I found to very likely be related to https://developer.apple.com/documentation/coregraphics/1455170-cgdisplaystreamcreate?language=objc )

The following is a screenshot of how this looks in IDA:

image

I tried with Hopper and noticed that there is a 5th argument:

image

Looking at CGDisplayStream docs, it seems clear that there is a need for a code block argument that is the callback that gets called on each frame of the display stream. I thus tried running the following code:

@_silgen_name("SLSHWCaptureStreamCreateWithWindow")
func SLSHWCaptureStreamCreateWithWindow(_ cid: CGSConnectionID, _ wid: CGWindowID, _ displayStreamOptions: CFDictionary, _ queue: DispatchQueue, _ block: CGDisplayStreamFrameAvailableHandler?) -> CGDisplayStream

let displayStream = SLSHWCaptureStreamCreateWithWindow(cgsMainConnectionId, CGWindow.windows(.optionOnScreenOnly)[0].id()!, [:] as CFDictionary, DispatchQueue.global(), nil)
debugPrint(displayStream)

This code works! We get a CGDisplayStream object:

<CGDisplayStream 0x7fb1c5d10ad0>

The issue now is that, while passing nil for the block doesn't crash, without a block there is no way to receive frames. For instance, this is how the public API works:

let displayStream = CGDisplayStream(
    dispatchQueueDisplay: CGMainDisplayID(),
    outputWidth: 800,
    outputHeight: 600,
    pixelFormat: Int32(k32BGRAPixelFormat),
    properties: nil,
    queue: DispatchQueue.global()) { (status, displayTime, frameSurface, updateRef) in
        debugPrint(status, displayTime, frameSurface, updateRef)
    }
debugPrint(displayStream)
displayStream?.start()

This works, and the debugPrint(status, displayTime, frameSurface, updateRef) call prints multiple lines.

Here in our private API case, when I try:

let displayStream = SLSHWCaptureStreamCreateWithWindow(cgsMainConnectionId, CGWindow.windows(.optionOnScreenOnly)[0].id()!, [:] as CFDictionary, DispatchQueue.global(),
    { (status, displayTime, frameSurface, updateRef) in
        debugPrint(status, displayTime, frameSurface, updateRef)
    })
debugPrint(displayStream)
displayStream.start()

I get Process finished with exit code 138 (interrupted by signal 10: SIGBUS)
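
As an aside on reading that exit code: shells conventionally report a process killed by signal N as 128 + N, so 138 corresponds to signal 10, which is SIGBUS on macOS. A minimal sketch of that arithmetic (the helper name is hypothetical):

```c
/* Shell convention: a process terminated by signal N is reported with
 * exit code 128 + N. Returns the signal number, or 0 if the code does
 * not indicate termination by a signal. */
int signal_from_exit_code(int exit_code)
{
    return exit_code > 128 ? exit_code - 128 : 0;
}
```

Signal numbers differ between platforms, so the mapping from 10 to SIGBUS holds for macOS but not, for instance, for Linux.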

I'm posting these notes in case someone can help me find out why the last block argument triggers a SIGBUS when provided. Maybe it's not of type CGDisplayStreamFrameAvailableHandler. How can I find the type then? I played with Hopper for a while, but I couldn't reverse-engineer the correct signature.

koekeishiya commented 4 years ago

I've played around a bit with this and verified that the return value is indeed a reference to a CGDisplayStream. I would therefore assume that you are correct that the closure it accepts as the 5th argument is of type CGDisplayStreamFrameAvailableHandler. For what it's worth, I am able to call the function and provide a closure, without crashing, but the block does not appear to run either, for whatever reason.

Here is a full sample on the off chance that someone wants to mess around with this:

#include <CoreGraphics/CGDisplayStream.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern int SLSMainConnectionID(void);
extern CGDisplayStreamRef SLSHWCaptureStreamCreateWithWindow(int cid, uint32_t wid, CFDictionaryRef options, dispatch_queue_t queue, CGDisplayStreamFrameAvailableHandler handler);
extern CGError SLDisplayStreamStart(CGDisplayStreamRef stream);

extern CFArrayRef SLSCopyWindowsWithOptionsAndTags(int cid, uint32_t owner, CFArrayRef spaces, uint32_t options, uint64_t *set_tags, uint64_t *clear_tags);
extern CGError SLPSGetFrontProcess(ProcessSerialNumber *psn);
extern CGError SLSGetConnectionIDForPSN(int cid, ProcessSerialNumber *psn, int *process_cid);
extern uint64_t SLSGetActiveSpace(int cid);

uint32_t *front_process_window_list_for_active_space(int *count)
{
    ProcessSerialNumber front_psn;
    int front_cid;

    int cid = SLSMainConnectionID();
    uint64_t sid = SLSGetActiveSpace(cid);

    SLPSGetFrontProcess(&front_psn);
    SLSGetConnectionIDForPSN(cid, &front_psn, &front_cid);

    uint32_t *window_list = NULL;
    uint64_t set_tags = 0;
    uint64_t clear_tags = 0;

    CFNumberRef space_id_ref = CFNumberCreate(NULL, kCFNumberSInt32Type, &sid);
    CFArrayRef space_list_ref = CFArrayCreate(NULL, (void *)&space_id_ref, 1, NULL);
    CFArrayRef window_list_ref = SLSCopyWindowsWithOptionsAndTags(cid, front_cid, space_list_ref, 0x2, &set_tags, &clear_tags);
    if (!window_list_ref) goto err;

    *count = CFArrayGetCount(window_list_ref);
    if (!*count) goto out;

    window_list = malloc(*count * sizeof(uint32_t));

    for (int i = 0; i < *count; ++i) {
        CFNumberRef id_ref = CFArrayGetValueAtIndex(window_list_ref, i);
        CFNumberGetValue(id_ref, CFNumberGetType(id_ref), window_list + i);
    }

out:
    CFRelease(window_list_ref);
err:
    CFRelease(space_list_ref);
    CFRelease(space_id_ref);
    return window_list;
}

void run_sl_stream(void)
{
    int window_count;
    uint32_t *window_list = front_process_window_list_for_active_space(&window_count);

    if (!window_list) {
        printf("%s: could not get windows of front process! abort..\n", __FUNCTION__);
        return;
    }

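    CGDisplayStreamRef sl_stream = SLSHWCaptureStreamCreateWithWindow(SLSMainConnectionID(), window_list[0], NULL, dispatch_get_main_queue(),
    ^(CGDisplayStreamFrameStatus status, uint64_t time, IOSurfaceRef frame, CGDisplayStreamUpdateRef ref) {
        printf("%s: Got frame: %llu\n", __FUNCTION__, time);
    });

    // The window ids are no longer needed once the stream has been requested
    free(window_list);

    // Guard against the private API failing to create a stream
    if (!sl_stream) {
        printf("error: SLSHWCaptureStreamCreateWithWindow returned NULL\n");
        exit(EXIT_FAILURE);
    }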

    // Uncomment to view type (and the pointer value) of sl_stream
    // CFShow(sl_stream);

    if (SLDisplayStreamStart(sl_stream) != kCGErrorSuccess) {
        printf("error: failed to start SLSHW..stream\n");
        exit(EXIT_FAILURE);
    }
}

void run_display_stream(void)
{
    CGDisplayStreamRef display_stream = CGDisplayStreamCreateWithDispatchQueue(CGMainDisplayID(), 1280, 720, 'BGRA', NULL, dispatch_get_main_queue(),
    ^(CGDisplayStreamFrameStatus status, uint64_t time, IOSurfaceRef frame, CGDisplayStreamUpdateRef ref) {
        printf("%s: Got frame: %llu\n", __FUNCTION__, time);
    });

    // Uncomment to view type (and the pointer value) of display_stream
    // CFShow(display_stream);

    if (CGDisplayStreamStart(display_stream) != kCGErrorSuccess) {
        printf("error: failed to start streaming main display\n");
        exit(EXIT_FAILURE);
    }
}

int main(int argc, char **argv)
{
    // Uncomment to initiate a capture stream of the main display using the public CG API
    // run_display_stream();
    run_sl_stream();
    CFRunLoopRun();
    return 0;
}

Compile using:

clang main.c -o main -F/System/Library/PrivateFrameworks -framework SkyLight -framework Carbon

galli-leo commented 4 years ago

I think this is what would allow this to work very well: https://avaidyam.github.io/2018/02/17/CAPluginLayer_CABackdropLayer.html

lwouis commented 4 years ago

Oh, this is interesting! I actually experimented with this at the very beginning of the project! I didn't know anything about private APIs at that time though, so I had something working, but quickly replaced it with the Quartz APIs.

I'll revisit this and see if it can unlock live thumbnails! Thanks for sharing @galli-leo

lwouis commented 4 years ago

Unfortunately, I tested the above CAPluginLayer private API on macOS 10.15.6 and it was not working. Maybe it broke with Catalina? I tried to manually give the apps Accessibility and Screen Recording permissions, but it didn't help. We can see that the window bounds are known, and the menubar is properly shown, but the window contents are not shown.

Diorama | localscreenshare | my own test
image | image | image

lwouis commented 4 years ago

I just tested in a VM, and the method seems to work on macOS 10.14. Could someone else perhaps test on their 10.15 system to confirm that it doesn't work anymore?

It's as easy as running this app.

koekeishiya commented 4 years ago

The creator of the blog post made a demo and posted it to his GitHub: https://github.com/avaidyam/Diorama

This method has been patched in Catalina and will only allow you to stream the contents of windows that your application owns. You can verify this by running the application and opening the menubar while your application has focus.

Edit: Just noticed your screenshots above showcasing this.

lwouis commented 4 years ago

Yes, this API stopped working in 10.15. It is a grim reminder that any private API we rely on to enable our useful apps can disappear with Big Sur or the next release. Not a great place to be in for us, but there is nothing we can do about it. My hope is that some Apple employees use our apps and advocate for opening the APIs they require, for example by adding a public API to deal with Spaces. It boggles the mind that you can't write an app, using stable public APIs, that knows about the user's Spaces, even though the feature has been a core part of macOS for 10 years.

lwouis commented 4 years ago

Flagging with "big sur" to remember to check if they have new private APIs or (one can dream) public APIs we could use once Big Sur is released.

lwouis commented 2 years ago

New API in macOS 12 to capture video of windows: https://developer.apple.com/documentation/screencapturekit

Here's a recap of my experiments with it.