mediar-ai / screenpipe

rewind.ai x cursor.com = your AI assistant that has all the context. 24/7 screen & voice recording for the age of super intelligence. get your data ready or be left behind
https://screenpi.pe
MIT License

implement windows audio output #127

Closed louis030195 closed 2 months ago

louis030195 commented 3 months ago

https://github.com/thewh1teagle/vibe/blob/main/desktop/src-tauri/src/cmd/audio.rs

louis030195 commented 3 months ago

if anyone on windows would like to help

currently using a cloud VM - idk if i can plug some virtual audio device somehow (lot of friction)

louis030195 commented 2 months ago

some context:

code is here: https://github.com/louis030195/screen-pipe/blob/main/screenpipe-audio/src/core.rs

this code is said to work with windows audio output: https://github.com/thewh1teagle/vibe/blob/main/desktop/src-tauri/src/cmd/audio.rs but i don't see much difference from ours

atm issue faced when trying to use windows audio output:

[2024-08-19T15:04:46Z ERROR screenpipe_server::core] Error in record_and_transcribe for device Headphones (JBL ENDURANCE PEAK 3) (output) (iteration 1): The requested stream type is not supported by the device., stopping thread
louis030195 commented 2 months ago

/bounty 100

algora-pbc[bot] commented 2 months ago

💎 $100 bounty • Louis Beaumont

Steps to solve:

  1. Start working: Comment /attempt #127 with your implementation plan
  2. Submit work: Create a pull request including /claim #127 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to louis030195/screen-pipe!


Attempt Started (GMT+0) Solution
🟢 @chandeldivyam #196
chandeldivyam commented 2 months ago

@louis030195 I am able to process Windows audio output. It doesn't work on Mac, though!

https://github.com/chandeldivyam/samwise/blob/master/src-tauri/src/audio_processor/audio_recorder.rs

I would love to help!

Can you let me know how to reproduce:

[2024-08-19T15:04:46Z ERROR screenpipe_server::core] Error in record_and_transcribe for device Headphones (JBL ENDURANCE PEAK 3) (output) (iteration 1): The requested stream type is not supported by the device., stopping thread
louis030195 commented 2 months ago

@chandeldivyam

macos audio output works on mac < 15.0

to reproduce:

  1. build the project
  2. screenpipe --list-audio-devices
  3. screenpipe --audio-device "my audio device" (you should get the error at this step)

feel free to implement a unit test for this (it would be ignored in CI because we don't have a way to simulate an audio device in CI atm)

chandeldivyam commented 2 months ago
(base) PS D:\projects\screen-pipe> .\target\release\screenpipe.exe --audio-device "Headphones (4- High Definition Audio Device)"
[2024-08-22T09:26:08Z WARN  screenpipe] Screenpipe hasn't been extensively tested on this OS. We'd love your feedback!
Would love your feedback on the UX, let's a 15 min call soon:
https://cal.com/louis030195/screenpipe
thread 'main' panicked at screenpipe-server\src/bin/screenpipe-server.rs:181:52:
Failed to parse audio device: Device type (input/output) not specified in the name

Stack backtrace:
   0: std::backtrace_rs::backtrace::dbghelp64::trace
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\std\src\..\..\backtrace\src\backtrace\dbghelp64.rs:91
   1: std::backtrace_rs::backtrace::trace_unsynchronized
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\std\src\..\..\backtrace\src\backtrace\mod.rs:66
   2: std::backtrace::Backtrace::create
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\std\src\backtrace.rs:331
   3: std::backtrace::Backtrace::capture
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\std\src\backtrace.rs:296
   4: anyhow::error::<impl anyhow::Error>::msg
   5: screenpipe_audio::core::AudioDevice::from_name
   6: screenpipe_audio::core::parse_audio_device
   7: screenpipe::main::{{closure}}
   8: tokio::runtime::park::CachedParkThread::block_on
   9: tokio::runtime::context::runtime::enter_runtime
  10: tokio::runtime::runtime::Runtime::block_on
  11: screenpipe::get_base_dir
  12: std::sys_common::backtrace::__rust_begin_short_backtrace
  13: std::rt::lang_start::{{closure}}
  14: std::rt::lang_start_internal
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\std\src\rt.rs:141
  15: main
  16: invoke_main
             at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
  17: __scrt_common_main_seh
             at D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
  18: BaseThreadInitThunk
  19: RtlUserThreadStart
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\std\src\panicking.rs:652
   1: core::panicking::panic_fmt
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\core\src\panicking.rs:72
   2: core::result::unwrap_failed
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\core\src\result.rs:1654
   3: screenpipe::main::{{closure}}
   4: tokio::runtime::park::CachedParkThread::block_on
   5: tokio::runtime::context::runtime::enter_runtime
   6: tokio::runtime::runtime::Runtime::block_on
   7: screenpipe::get_base_dir
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Thanks, I am able to reproduce this, will update soon!

louis030195 commented 2 months ago

you need to specify (output) in the device name (e.g. just copy-paste the exact audio device name given by the CLI)
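
For readers following along, here is a minimal sketch of how a device name with a trailing "(input)" or "(output)" marker could be parsed. It is not the actual screenpipe code: the `AudioDevice` and `DeviceType` types and the body of `parse_audio_device` are assumptions for illustration, and only the error message is taken from the panic output above.

```rust
/// Hypothetical device kind, inferred from the suffix of the CLI name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DeviceType {
    Input,
    Output,
}

/// Hypothetical parsed device: the name without the suffix, plus its kind.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AudioDevice {
    pub name: String,
    pub device_type: DeviceType,
}

/// Parse a name such as `"Headphones (JBL ENDURANCE PEAK 3) (output)"`.
/// Returns an error when the trailing "(input)" / "(output)" marker is missing.
pub fn parse_audio_device(raw: &str) -> Result<AudioDevice, String> {
    let raw = raw.trim();
    let (name, device_type) = if let Some(name) = raw.strip_suffix("(input)") {
        (name, DeviceType::Input)
    } else if let Some(name) = raw.strip_suffix("(output)") {
        (name, DeviceType::Output)
    } else {
        // Same wording as the error shown in the thread.
        return Err("Device type (input/output) not specified in the name".to_string());
    };

    // Drop the space that separated the name from the marker.
    let name = name.trim_end();
    if name.is_empty() {
        return Err("Device name is empty".to_string());
    }

    Ok(AudioDevice {
        name: name.to_string(),
        device_type,
    })
}
```

With this sketch, "Headphones (4- High Definition Audio Device) (output)" parses as an output device, while the same name without the suffix returns the error seen in the backtrace.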

chandeldivyam commented 2 months ago

Thanks! able to reproduce it

chandeldivyam commented 2 months ago

@louis030195 Is there a way I can run it with a debugger for a deeper understanding? Kind of a development environment setup.

{
    "version": "0.2.0",
    "configurations": [
        {
            "type": "lldb",
            "request": "launch",
            "name": "Debug screenpipe",
            "cargo": {
                "args": [
                    "build",
                    "--bin=screenpipe",
                    "--package=screenpipe-server"
                ],
                "filter": {
                    "name": "screenpipe",
                    "kind": "bin"
                }
            },
            "args": ["--debug"],
            "cwd": "${workspaceFolder}"
        }
    ]
}
chandeldivyam commented 2 months ago

Got the issue:

Will share the resolution.

In screenpipe-audio\src\core.rs, line 175: let mut config = audio_device.default_input_config()?;

We always take the config from default_input_config first, even for an output device, and then mutate it. On Windows this fails for output devices and the thread stops.

let config = if is_output_device && cfg!(target_os = "windows") {
    audio_device.default_output_config()?
} else {
    audio_device.default_input_config()?
};

This is how we should handle it. I tried it locally and it is working. I need to run the relevant test cases and see what the checks are for raising a PR. Will share it soon.
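
The selection logic above can be exercised without real audio hardware. The following is only a sketch under stated assumptions: `StreamConfig`, `ConfigSource`, `pick_config`, and `FakeWindowsOutput` are hypothetical stand-ins and not cpal or screenpipe types; only the method names `default_input_config` and `default_output_config` and the error text come from the thread. The platform check is passed in as a plain `bool` so the branch can be tested on any OS; a real call site would pass `cfg!(target_os = "windows")`.

```rust
/// Hypothetical stand-in for the stream configuration a backend returns.
#[derive(Debug, Clone, PartialEq)]
pub struct StreamConfig {
    pub channels: u16,
    pub sample_rate: u32,
}

/// Hypothetical trait covering the two calls used in the thread.
pub trait ConfigSource {
    fn default_input_config(&self) -> Result<StreamConfig, String>;
    fn default_output_config(&self) -> Result<StreamConfig, String>;
}

/// Ask for the output config only for output devices on Windows,
/// and fall back to the input config everywhere else.
pub fn pick_config<D: ConfigSource>(
    device: &D,
    is_output_device: bool,
    is_windows: bool,
) -> Result<StreamConfig, String> {
    if is_output_device && is_windows {
        device.default_output_config()
    } else {
        device.default_input_config()
    }
}

/// Fake device that behaves like the Windows headphones in the error log:
/// it rejects input configs and accepts output configs.
pub struct FakeWindowsOutput;

impl ConfigSource for FakeWindowsOutput {
    fn default_input_config(&self) -> Result<StreamConfig, String> {
        Err("The requested stream type is not supported by the device.".to_string())
    }

    fn default_output_config(&self) -> Result<StreamConfig, String> {
        Ok(StreamConfig {
            channels: 2,
            sample_rate: 48_000,
        })
    }
}
```

Calling `pick_config(&FakeWindowsOutput, true, true)` succeeds, while passing `false` for `is_windows` takes the old input-config path and returns the error from the log.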

@louis030195

louis030195 commented 2 months ago

usually i debug through a unit test by clicking the debug button (and adding breakpoints)

(although i'm running into issues on mac atm, usually it works. if by any chance you know how to give Cursor/VSCode an env var that would be used by the "Debug" button, that would be very helpful, it's required for Apple Native OCR)

louis030195 commented 2 months ago

love it!

please send a PR :)

louis030195 commented 2 months ago

PS: if by any chance you can share how you set up your windows dev env on the #dev-windows channel on discord, that would be very helpful for other devs that run into issues

chandeldivyam commented 2 months ago

Yes, I will share it. There were a bunch of issues with ffmpeg and with setting so many different environment variables.

The major issues were with ffmpeg and vcpkg.

Actually, I set a few things, some useful, some not, and didn't close the terminal. Let me start fresh and I will create a guide on exactly how to set it up.

algora-pbc[bot] commented 2 months ago

💡 @chandeldivyam submitted a pull request that claims the bounty. You can visit your bounty board to reward.

algora-pbc[bot] commented 2 months ago

🎉🎈 @chandeldivyam has been awarded $100! 🎈🎊