Open abettke opened 9 months ago
Hello, the SDK currently doesn't support receiving unmixed audio streams, but if you're interested in capturing the unmixed streams, we recently launched support for publishing them to Kinesis Video Streams using Amazon Chime SDK media pipelines. You can find the developer guide for this feature here.
Please feel free to ask any questions.
I have a question to add to this: how do I know where the audio is, and how do I get it? The docs point to using the GetMedia
API, but that returns the chunks of data as they are received. Does KVS concatenate the chunks like the CreateMediaConcatenationPipeline
API does, or do we need to set up KVS to point to an S3 bucket?
BTW thanks for the links to the docs, very helpful 😄
Hello @euro-bYte, we publish events to your EventBus when we start/stop streaming an attendee's audio to KVS. The "Using Event Bridge notifications" section of the documentation points to the sample events for your reference.
We currently don't have an API to concatenate the chunks from your KVS stream. KVS also has no built-in option to write directly to an S3 bucket, but they do provide sample libraries that you can refer to. Here's one for Python: https://github.com/aws-samples/amazon-kinesis-video-streams-consumer-library-for-python
If you're only interested in processing the audio after the meeting ends, you can just listen for the Amazon Chime Media Stream Pipeline Kinesis Video Stream End event and use the GetMediaForFragmentList API. Note that you first need to get the list of fragments using the ListFragments API. All the parameters required for this are present in the event payload we publish.
@avinashmidathada Thank you for the quick answer!
What are you trying to do?
We are looking for a way to additionally capture the audio streams before they are mixed into the main meeting audio channel, so that each speaker's audio can be recorded in isolation from other attendees, similar to how Chime does this with live transcription. Is this officially supported by the SDK? If not, is there a place in the source code where we can hook in to grab the individual attendees' audio streams manually?
How can the documentation be improved to help your use case?
What documentation have you looked at so far?