rdp / virtual-audio-capture-grabber-device

free audio capture device to capture all the "wave out sound" that is playing on your speakers (i.e. record what you hear) for Windows Vista+. Releases downloadable in this package:
https://github.com/rdp/screen-capture-recorder-to-video-windows-free

Update source_code/acam/loopback-capture.cpp #1

Closed taqattack closed 11 years ago

taqattack commented 11 years ago

By doing this, you can use the helper function to select which device to record from.

rdp commented 11 years ago

works for me, do you need a release with this?

taqattack commented 11 years ago

No, that's fine. I'm gonna try to implement some stuff :)

taqattack commented 11 years ago

Hey Roger, correct me if I'm wrong, but are you obtaining an audio buffer and then converting it to a 16-bit PCM wave buffer, which is stored in pBufLocal (a byte[])? I was wondering if it was possible to store/duplicate that buffer in a floating point array instead. I'm not too sure how to do that, though; it's just a thought.

rdp commented 11 years ago

I'm not entirely sure what is doing the reformatting to 16-bit but yes, it is outputting as that.

I left it this way because I had some problems originally getting players to recognize the stream if I gave it to them in floating point format--I guess 16 bit is more common. I could try it again, do you know if floating point will give more accuracy? -r

taqattack commented 11 years ago

Well, the issue isn't really floating point. I'm just wondering why pData only ranges from 0 to 255 if it's supposed to be 16-bit.

rdp commented 11 years ago

I did look into it today some and it appears the default "floating point" output from GetMixFormat is type wFormatTag==WAVE_FORMAT_EXTENSIBLE and subFormat is KSDATAFORMAT_SUBTYPE_IEEE_FLOAT and bits per sample is 32, so I suppose you are right we are losing precision there by converting to 16 bit PCM.
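To make the precision point concrete, here is a minimal sketch of a float-to-16-bit conversion. It is not the code in loopback-capture.cpp; `floatToPcm16` and `convertBuffer` are hypothetical helpers, and the sketch assumes the float samples are normalized to the range [-1.0, 1.0].

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the project): convert one normalized float
// sample, assumed to lie in [-1.0, 1.0], to signed 16-bit PCM.
inline std::int16_t floatToPcm16(float sample) {
    // Clamp so out-of-range input cannot overflow the 16-bit result.
    const float clamped = std::max(-1.0f, std::min(1.0f, sample));
    // Scale to the 16-bit range and round to the nearest integer step.
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
}

// Hypothetical helper: convert a whole buffer of float samples.
inline std::vector<std::int16_t> convertBuffer(const std::vector<float>& in) {
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (float s : in) {
        out.push_back(floatToPcm16(s));
    }
    assert(out.size() == in.size());
    return out;
}
```

With this scaling, two float samples that differ by less than one 16-bit step (for example 0.25 and 0.250001) end up as the same 16-bit value, which is the precision loss being discussed.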

But, to answer your question, pData is a "byte array" because that's how directshow's frames are--it's an array of bytes that you stuff with some "appropriate" data. In this case, we're stuffing 16-bit audio data into it two bytes at a time, so a new sample starts at every other byte, if that makes sense.
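As a rough illustration of reading such a buffer: each individual byte can only hold 0 to 255, but a pair of bytes combines into one signed value from -32768 to 32767. The sketch below assumes signed little-endian 16-bit samples; `readPcm16LE` and `decodePcm16LE` are hypothetical helpers, not functions from this project.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: combine two bytes (low byte first, i.e. little endian)
// into one signed 16-bit sample.
inline std::int16_t readPcm16LE(const std::uint8_t* bytes) {
    const int raw = bytes[0] | (bytes[1] << 8);  // unsigned value, 0..65535
    // Values 0x8000 and above represent negative samples (two's complement).
    return static_cast<std::int16_t>(raw >= 0x8000 ? raw - 0x10000 : raw);
}

// Hypothetical helper: decode a byte buffer of 16-bit PCM into floats in
// [-1.0, 1.0). A trailing odd byte, if any, is ignored.
inline std::vector<float> decodePcm16LE(const std::uint8_t* data,
                                        std::size_t byteCount) {
    std::vector<float> samples;
    samples.reserve(byteCount / 2);
    for (std::size_t i = 0; i + 1 < byteCount; i += 2) {
        samples.push_back(readPcm16LE(data + i) / 32768.0f);
    }
    return samples;
}
```

Under that assumption, the byte pair 0x00 0x00 is silence (0), 0xFF 0x7F is the most positive sample, and 0x00 0x80 is the most negative one. For stereo data the decoded samples would alternate between the left and right channels.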

I did try returning it KSDATAFORMAT_SUBTYPE_IEEE_FLOAT directly, as the directshow advertised stream, which resulted in FFmpeg:

Could not connect pins (VFW_E_NO_ACCEPTABLE_TYPES was the response message).

Even just switching it innocently to

pwfex->wFormatTag = WAVE_FORMAT_EXTENSIBLE;

instead of pwfex->wFormatTag = WAVE_FORMAT_PCM;

immediately causes FFmpeg to reject the stream, or actually, to not be able to build the graph. My guess is that IFilterGraph::DirectConnect, which is the method that is failing, is somehow not compatible with WAVE_FORMAT_EXTENSIBLE. What made this really weird is that if I run FFmpeg within visual studio (as a debugger), then DirectConnect miraculously works fine, but running it from the command line, it doesn't work with that error message. I think http://stackoverflow.com/questions/2347562/program-crashes-only-in-release-mode-outside-debugger explains why but I didn't get as far as testing it with windbg.

I do have one other idea to try to get it to work with floating point output, but I'm not sure if it'll work.

Another option would be to fix FFmpeg so that it can avoid using DirectConnect, but that is a whole different task.

HTH. -r

taqattack commented 11 years ago

Sorry Roger, I think I may have given the wrong idea in my last post. And yes, FFmpeg isn't able to receive 32-bit floats from the dshow interface.

I just realized IAudioCaptureClient->GetBuffer only outputs to a byte array, which is totally fine. My concern is trying to map the bytes of 16-bit PCM in pData to numerical values. So for example 0x00000000 would map to the most negative value in a float/int variable. But it's only outputting between 0 and 255.

rdp commented 11 years ago

yeah I'm not sure there. My guess would be little endian based on http://wiki.multimedia.cx/index.php?title=PCM . FFmpeg also seems to prefer pcm_s16le which is little endian, so little endian is probably right. Thanks to this discussion I think I have figured out how to capture "up to 32 bit" audio from the sound card, though, see the 32bit branch. I have no idea how to test this, but it seems like a good idea somehow. If there's no complaint I'll probably merge it in about a week. Cheers and GL! -r

taqattack commented 11 years ago

32-bit does sound amazing! Too bad FLVs can only support up to 16-bit, 44 kHz for RTMP streaming. On a side note, I was finally able to understand how the pData byte samples work. I'm gonna play around a bit to make the microphone loop through this filter. If you have any pointers, let me know!

rdp commented 11 years ago

If I were you I would just have the mic as a separate dshow input to FFmpeg, then use amerge to join the two. That sounds easier to me than writing a merging filter. Or were you referring to just having it capture "from mic" instead of stereo mix? Yeah, most things only use 16-bit, but it might be an improvement for somebody somewhere, and hey, it might be useful some time or other :)

taqattack commented 11 years ago

Yeah, I could do that, but FFmpeg isn't compatible with some microphones that don't have a standard dshow interface. It seems like I need to use WASAPI to loop that through the vac filter.

On a side note: some users seem to be having audio desync issues with virtual-audio-capture. They experience an audio speedup, i.e., the audio starts to play faster than the video. It seems like it's running a bit faster than other devices. I've tested with Virtual Audio Cable and other microphones, which stay in sync with the video over long periods of time. I'm guessing it has to do with how often "FillBuffer" is called, but I'm not sure.

rdp commented 11 years ago

Are the users with problems using 48000 Hz?

rdp commented 11 years ago

Also how are timestamps generated using your capture filter? Do users get the same problem with screen-capture-recorder? Is there a place on the forum I could/should go to try and debug this with users? -r
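For context on why the sample rate and the timestamps matter here, below is a minimal sketch of deriving timestamps from a running frame count, in the 100-nanosecond units DirectShow uses for REFERENCE_TIME. It is not how this filter actually generates timestamps, and nothing in this thread confirms that a rate mismatch is the cause of the desync; `framesToReferenceTime` and `driftMs` are hypothetical helpers for illustration only.

```cpp
#include <cstdint>

// Number of 100 ns units in one second (the DirectShow REFERENCE_TIME unit).
constexpr std::int64_t kUnitsPerSecond = 10000000;

// Hypothetical helper: timestamp, in 100 ns units, of the frame at index
// `frames` in a stream that runs at `sampleRate` frames per second.
inline std::int64_t framesToReferenceTime(std::int64_t frames,
                                          std::int64_t sampleRate) {
    return frames * kUnitsPerSecond / sampleRate;
}

// Hypothetical helper: how far (in milliseconds) the stamped audio timeline
// diverges from real time after `seconds` of capture, if the device really
// delivers `actualRate` frames per second but timestamps are computed as if
// the rate were `assumedRate`.
inline double driftMs(double seconds, double actualRate, double assumedRate) {
    const double framesDelivered = seconds * actualRate;
    const double stampedSeconds = framesDelivered / assumedRate;
    return (stampedSeconds - seconds) * 1000.0;
}
```

For example, `driftMs(60.0, 48000.0, 44100.0)` comes out to a bit over five seconds per minute, so a 48000 versus 44100 mix-up would be very noticeable, while matching rates give zero drift in this model.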

rdp commented 11 years ago

Also https://github.com/rdp/virtual-audio-capture-grabber-device/commit/cdd6f9e4e2562b07803ad08c52e547fc2d5bbb57 "might help" with the problem but I'd prefer to understand it before doing any real code changes. Cheers! -r