mediar-ai / screenpipe

Library to build personalized AI powered by what you've seen, said, or heard. Works with Ollama. Alternative to Rewind.ai. Open. Secure. You own your data. Rust.
https://screenpi.pe
MIT License

make screenpipe not requiring admin right on windows ($200) #320

Open louis030195 opened 4 days ago

louis030195 commented 4 days ago

/bounty 100

definition of done:

references:

linear[bot] commented 4 days ago

MED-98 make screenpipe not requiring admin right on windows ($100)

algora-pbc[bot] commented 4 days ago

💎 $100 bounty • Screenpi.pe

Steps to solve:

  1. Start working: Comment /attempt #320 with your implementation plan
  2. Submit work: Create a pull request including /claim #320 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to mediar-ai/screenpipe!


louis030195 commented 3 days ago

/bounty 200

kerosina commented 3 days ago

Hello, running xcap's example screen-capture code on Windows didn't require admin. Which specific xcap feature does screenpipe use that requires admin?

kerosina commented 3 days ago

I tried out the code at https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-vision/src/capture_screenshot_by_window.rs with a few alterations:

use image::DynamicImage;
use std::error::Error;
use std::fmt;
use std::println as error;
use std::time::Duration;
use tokio::time;
use xcap::{Monitor, Window, XCapError};

#[derive(Debug)]
enum CaptureError {
    NoWindows,
    XCapError(XCapError),
}

impl fmt::Display for CaptureError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            CaptureError::NoWindows => write!(f, "No windows found"),
            CaptureError::XCapError(e) => write!(f, "XCap error: {}", e),
        }
    }
}

impl Error for CaptureError {}

impl From<XCapError> for CaptureError {
    fn from(error: XCapError) -> Self {
        error!("XCap error occurred: {}", error);
        CaptureError::XCapError(error)
    }
}

pub async fn capture_all_visible_windows(
    monitor: &Monitor,
    ignore_list: &[String],
    include_list: &[String],
) -> Result<Vec<(DynamicImage, String, String, bool)>, Box<dyn Error>> {
    let mut all_captured_images = Vec::new();

    let windows = retry_with_backoff(
        || {
            let windows = Window::all()?;
            if windows.is_empty() {
                Err(CaptureError::NoWindows)
            } else {
                Ok(windows)
            }
        },
        3,
        Duration::from_millis(500),
    )
    .await?;

    let focused_window = windows
        .iter()
        .find(|&w| is_valid_window(w, monitor, ignore_list, include_list));

    for window in &windows {
        if is_valid_window(window, monitor, ignore_list, include_list) {
            println!("Got valid window {}", window.app_name());
            let app_name = window.app_name();
            let window_name = window.title();
            let is_focused = focused_window
                .as_ref()
                .map_or(false, |fw| fw.id() == window.id());

            match window.capture_image() {
                Ok(buffer) => {
                    let image = DynamicImage::ImageRgba8(
                        image::ImageBuffer::from_raw(
                            buffer.width() as u32,
                            buffer.height() as u32,
                            buffer.into_raw(),
                        )
                        .unwrap(),
                    );

                    all_captured_images.push((
                        image,
                        app_name.to_string(),
                        window_name.to_string(),
                        is_focused,
                    ));
                }
                Err(e) => error!(
                    "Failed to capture image for window {} on monitor {}: {}",
                    window_name,
                    monitor.name(),
                    e
                ),
            }
        }
    }

    Ok(all_captured_images)
}

fn is_valid_window(
    window: &Window,
    monitor: &Monitor,
    ignore_list: &[String],
    include_list: &[String],
) -> bool {
    let monitor_match = window.current_monitor().id() == monitor.id();
    let not_minimized = !window.is_minimized();
    let not_window_server = window.app_name() != "Window Server";
    let not_contexts = window.app_name() != "Contexts";
    let has_title = !window.title().is_empty();
    let included = include_list.is_empty()
        || include_list.iter().any(|include| {
            window
                .app_name()
                .to_lowercase()
                .contains(&include.to_lowercase())
                || window
                    .title()
                    .to_lowercase()
                    .contains(&include.to_lowercase())
        });
    let not_ignored = !ignore_list.iter().any(|ignore| {
        window
            .app_name()
            .to_lowercase()
            .contains(&ignore.to_lowercase())
            || window
                .title()
                .to_lowercase()
                .contains(&ignore.to_lowercase())
    });

    monitor_match
        && not_minimized
        && not_window_server
        && not_contexts
        && has_title
        && not_ignored
        && included
}

async fn retry_with_backoff<F, T, E>(
    mut f: F,
    max_retries: u32,
    initial_delay: Duration,
) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
    E: Error + 'static,
{
    let mut delay = initial_delay;
    for attempt in 1..=max_retries {
        println!("Attempt {} to execute function", attempt);
        match f() {
            Ok(result) => {
                println!("Function executed successfully on attempt {}", attempt);
                return Ok(result);
            }
            Err(e) => {
                if attempt == max_retries {
                    error!("All {} attempts failed. Last error: {}", max_retries, e);
                    return Err(e);
                }
                println!("Attempt {} failed: {}. Retrying in {:?}", attempt, e, delay);
                time::sleep(delay).await;
                delay *= 2;
            }
        }
    }
    unreachable!()
}

#[tokio::main]
async fn main() {
    let monitors = Monitor::all().unwrap();
    for monitor in monitors {
        let res = capture_all_visible_windows(&monitor, &[], &[])
            .await
            .unwrap();
        println!("Took picture of {} window(s)", res.len())
    }
    println!("Finished");
}

and it worked without admin:

Attempt 1 to execute function
Function executed successfully on attempt 1
Got valid window Task Manager
Got valid window Visual Studio Code
Got valid window Windows Explorer
Got valid window Discord
Got valid window Vivaldi
Took picture of 5 window(s)
Finished

Are you sure it's required?

louis030195 commented 3 days ago

Try to build the app

kerosina commented 3 days ago

> Try to build the app

When running it, all seems to be well, except that after downloading the models I get this error:

thread 'tokio-runtime-worker' panicked at C:\Users\makedon\.cargo\registry\src\index.crates.io-6f17d22bba15001f\tokio-1.40.0\src\runtime\blocking\shutdown.rs:51:21:
Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context.

If I restart it, I get the same panic once I try accessing the HTTP API.

I'll try looking into it tomorrow.

kerosina commented 3 days ago

I have found a fix for this runtime error and have opened a PR for it ( #330 )

kerosina commented 3 days ago

Are you sure this is what makes screenpipe not work on Windows? Looking at the xcap code:

        let file_version_info_size_w = GetFileVersionInfoSizeW(pcw_filename, None);
        if file_version_info_size_w == 0 {
            log_last_error("GetFileVersionInfoSizeW");

            return get_module_basename(box_process_handle);
        }

If GetFileVersionInfoSizeW fails, it logs the error and then falls back to get_module_basename:

fn get_module_basename(box_process_handle: BoxProcessHandle) -> XCapResult<String> {
    unsafe {
        // Use module_basename by default
        let mut module_base_name_w = [0; MAX_PATH as usize];
        let result = GetModuleBaseNameW(*box_process_handle, None, &mut module_base_name_w);

        if result == 0 {
            log_last_error("GetModuleBaseNameW");

            GetModuleFileNameExW(*box_process_handle, None, &mut module_base_name_w);
        }

        wide_string_to_string(&module_base_name_w)
    }
}

Since it doesn't log any error for GetModuleBaseNameW, I assume that call worked, so this shouldn't affect screenpipe.
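
To make that reasoning concrete, here is a small standalone model of the fallback chain. It is only a sketch: `resolve_app_name` and `ErrorLog` are hypothetical names, and the `Option` arguments stand in for the results of the real Win32 calls rather than invoking them.

```rust
/// Records the name of every lookup that failed, standing in for
/// xcap's `log_last_error` calls.
#[derive(Default)]
struct ErrorLog {
    failed_calls: Vec<&'static str>,
}

/// Hypothetical model of the lookup chain: each argument is the result the
/// corresponding Win32 call would have produced (`None` means it failed).
fn resolve_app_name(
    version_info: Option<&str>,
    module_base_name: Option<&str>,
    module_file_name: Option<&str>,
    log: &mut ErrorLog,
) -> String {
    // First choice: the name taken from the file version info.
    if let Some(name) = version_info {
        return name.to_string();
    }
    log.failed_calls.push("GetFileVersionInfoSizeW");

    // Fallback: the module base name.
    if let Some(name) = module_base_name {
        return name.to_string();
    }
    log.failed_calls.push("GetModuleBaseNameW");

    // Last resort: the module file name, or an empty string if that fails too.
    module_file_name.unwrap_or_default().to_string()
}
```

With `version_info` set to `None` and `module_base_name` set to `Some("Code.exe")`, the function returns `"Code.exe"` and the log contains only `GetFileVersionInfoSizeW`. That matches the situation described above: one logged error, yet a usable app name.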

louis030195 commented 1 day ago

Yes, I have a Windows computer, and screenpipe fails if not run as admin.

kerosina commented 1 day ago

> Yes, I have a Windows computer, and screenpipe fails if not run as admin.

Weird.

At what step does it fail? What are the last log lines when it fails?

louis030195 commented 22 hours ago

https://github.com/nashaofu/xcap/issues/152

louis030195 commented 22 hours ago

when i start

louis030195 commented 22 hours ago

And starting the app as admin does not solve the issue (it only works when starting the CLI as admin). Basically, the app only works via the CLI at the moment, since we switched to window capture instead of just capturing the screen.

louis030195 commented 22 hours ago

oh wait

so i have 2 windows:

So this issue is only on my Windows machine, and I don't recall whether anybody else has faced it.