alexa / avs-device-sdk

An SDK for commercial device makers to integrate Alexa directly into connected products.
https://developer.amazon.com/alexa/alexa-voice-service
Apache License 2.0

Issue with avs-device sdk portaudio callback function #864

Closed HHG123 closed 6 years ago

HHG123 commented 6 years ago

I have set up the avs-device-sdk on my Ubuntu laptop. Here is the code for the PortAudio callback function in avs-device-sdk:

int PortAudioMicrophoneWrapper::PortAudioCallback(
    const void* inputBuffer,
    void* outputBuffer,
    unsigned long numSamples,
    const PaStreamCallbackTimeInfo* timeInfo,
    PaStreamCallbackFlags statusFlags,
    void* userData) {
    PortAudioMicrophoneWrapper* wrapper = static_cast<PortAudioMicrophoneWrapper*>(userData);
    ssize_t returnCode = wrapper->m_writer->write(inputBuffer, numSamples);
    if (returnCode <= 0) {
        ACSDK_CRITICAL(LX("Failed to write to stream."));
        return paAbort;
    }
    return paContinue;
}

I know that the audio samples appear in the memory location pointed to by the inputBuffer pointer, and that numSamples holds the number of samples captured at each PortAudio callback.

My first doubt:

1. When I give a 1 kHz tone as input to the microphone and then print the inputBuffer values, only the first 70 values seem to be valid, even though numSamples takes values like 139, 133, 4, etc. at different callbacks. (I plotted the first 70 samples and they gave me a 1 kHz tone; the rest of the samples were junk data.)

2. Even though the specified sample rate is 16 kHz, after plotting the samples in inputBuffer the rate seems to be 8 kHz. Why is that?
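
One possible explanation for both observations, assuming the stream is opened with paInt16 as in the code posted further down: each sample is 16 bits wide, so reading the buffer through an int* (32 bits on typical Linux builds) packs two samples into every value. A callback with numSamples around 139 would then yield only about 70 values inside the valid region, and a tone plotted from them would appear at half the real rate. The sketch below is standalone and does not call PortAudio; asInt16 and asInt32 are illustrative names, not part of PortAudio or the SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Read a raw capture buffer as 16-bit samples (the layout paInt16 implies).
std::vector<int16_t> asInt16(const void* buffer, std::size_t numSamples) {
    std::vector<int16_t> out(numSamples);
    std::memcpy(out.data(), buffer, numSamples * sizeof(int16_t));
    return out;
}

// Read the same bytes as 32-bit values, the way a cast to int* would.
// Only numSamples / 2 such values fit inside the valid region; anything
// read beyond that lies outside the data delivered by the callback.
std::vector<int32_t> asInt32(const void* buffer, std::size_t numSamples) {
    std::vector<int32_t> out(numSamples * sizeof(int16_t) / sizeof(int32_t));
    std::memcpy(out.data(), buffer, out.size() * sizeof(int32_t));
    return out;
}
```

Calling asInt32 on a 140-sample buffer returns 70 values, which matches the count of "valid" values described above. Whether this is what happened here depends on how the values were printed, so treat it as a hypothesis to check.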

Second Problem:

I want to replace the values in the input buffer with other values and feed them to the call ssize_t returnCode = wrapper->m_writer->write(inputBuffer, numSamples);

(The other values come from another program that captures audio through the ALSA libraries and shares them with PortAudioMicrophoneWrapper.cpp using interprocess communication.)

So I replaced the first 70 values in inputBuffer with the values I obtain through interprocess communication (IPC). The values obtained are correct; I verified this by plotting them.

But for some reason, after I modify inputBuffer with these values, Alexa listens but it doesn't speak!
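
A possible cause, offered as an assumption rather than a diagnosis: if the IPC values are 32-bit ints while the stream expects 16-bit samples, writing them through an int* changes the audio layout and also modifies memory that PortAudio hands over as const. One alternative is to convert the IPC values into a separate 16-bit buffer and pass that buffer to the writer. In the sketch below, toInt16Samples is a hypothetical helper, not an SDK function, and the clamping policy is only illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: convert 32-bit values received over IPC into
// 16-bit PCM samples, clamping anything outside the int16_t range.
// `stride` is the step between consecutive values in the source
// (the posted code advances its read pointer by 2 each time).
std::vector<int16_t> toInt16Samples(const int32_t* source, std::size_t count, std::size_t stride) {
    const int32_t lo = std::numeric_limits<int16_t>::min();
    const int32_t hi = std::numeric_limits<int16_t>::max();
    std::vector<int16_t> samples;
    samples.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const int32_t value = source[i * stride];
        samples.push_back(static_cast<int16_t>(std::min(hi, std::max(lo, value))));
    }
    return samples;
}
```

Inside the callback, the result could then be passed as wrapper->m_writer->write(samples.data(), samples.size()), assuming the writer counts 16-bit words as in the stock sample app. Allocating a vector inside a real-time audio callback is not ideal, so a preallocated buffer would be the more careful choice.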

Here I am attaching the code for the same:

#include <cstring>
#include <string>

#include <rapidjson/document.h>

#include <AVSCommon/Utils/Configuration/ConfigurationNode.h>
#include <AVSCommon/Utils/Logger/Logger.h>
#include "SampleApp/PortAudioMicrophoneWrapper.h"
#include "SampleApp/ConsolePrinter.h"
#include <alsa/asoundlib.h>
#include <cstdio>
#include <cstdlib>
#include <sys/types.h>  // Shared memory libraries
#include <sys/ipc.h>
#include <sys/shm.h>

#define SHMSZ ((240000 + 1) * 4)  // Shared memory size for buffers
#define SHMSZ3 4

int m=0;

//int act_buf = 0; //void* x = NULL;

namespace alexaClientSDK {
namespace sampleApp {

using avsCommon::avs::AudioInputStream;

static const int NUM_INPUT_CHANNELS = 1;
static const int NUM_OUTPUT_CHANNELS = 0;
static const double SAMPLE_RATE = 16000;
static const unsigned long PREFERRED_SAMPLES_PER_CALLBACK = paFramesPerBufferUnspecified;

#define NUM_PER_EXCHANGE 70 // added by hari and hameem
#define DELAY_BEFORE_EXCHANGE 50

//int curr_buf;

int shmid1, shmid2, shmid3;
key_t key1 =10, key2 = 20, key3 = 30;
int *shm2,*shm3,*s2=NULL,*s3=NULL;
int m=0,act_buf,t=0;
int *shm1,*s1=NULL; 

void *x1 =  malloc((NUM_PER_EXCHANGE+1)*4);
//int write_mem_point1=0;
//int read_mem_point1=1;

    void *x = x1;

static const std::string SAMPLE_APP_CONFIG_ROOT_KEY("sampleApp1");
static const std::string PORTAUDIO_CONFIG_ROOT_KEY("portAudio");
static const std::string PORTAUDIO_CONFIG_SUGGESTED_LATENCY_KEY("suggestedLatency");

/// String to identify log entries originating from this file.
static const std::string TAG("PortAudioMicrophoneWrapper");

/// Create a LogEntry using this file's TAG and the specified event string.
#define LX(event) alexaClientSDK::avsCommon::utils::logger::LogEntry(TAG, event)

std::unique_ptr<PortAudioMicrophoneWrapper> PortAudioMicrophoneWrapper::create(
    std::shared_ptr<AudioInputStream> stream) {
    if (!stream) {
        ACSDK_CRITICAL(LX("Invalid stream passed to PortAudioMicrophoneWrapper"));
        return nullptr;
    }
    std::unique_ptr<PortAudioMicrophoneWrapper> portAudioMicrophoneWrapper(new PortAudioMicrophoneWrapper(stream));
    if (!portAudioMicrophoneWrapper->initialize()) {
        ACSDK_CRITICAL(LX("Failed to initialize PortAudioMicrophoneWrapper"));
        return nullptr;
    }
    return portAudioMicrophoneWrapper;
}

PortAudioMicrophoneWrapper::PortAudioMicrophoneWrapper(std::shared_ptr<AudioInputStream> stream) :
        m_audioInputStream{stream},
        m_paStream{nullptr} {
}

PortAudioMicrophoneWrapper::~PortAudioMicrophoneWrapper() {
    Pa_StopStream(m_paStream);
    Pa_CloseStream(m_paStream);
    Pa_Terminate();
}

bool PortAudioMicrophoneWrapper::initialize() {
    ConsolePrinter::prettyPrint("mic_initialize");
    m_writer = m_audioInputStream->createWriter(AudioInputStream::Writer::Policy::NONBLOCKABLE);
    if (!m_writer) {
        ACSDK_CRITICAL(LX("Failed to create stream writer"));
        return false;

{

 if ((shm2 = (int *)shmat(shmid2, NULL, 0)) == (int *) -1) {
    perror("shmat2");
    exit(1);
}

if ((shm3 = (int *)shmat(shmid3, NULL, 0)) == (int *) -1) {
    perror("shmat3");
    exit(1);
ConsolePrinter::prettyPrint("Shared memory created");
}
//s1=shm1;

}

}
PaError err;
err = Pa_Initialize();
if (err != paNoError) {
    ACSDK_CRITICAL(LX("Failed to initialize PortAudio"));
    return false;
}

PaTime suggestedLatency;
bool latencyInConfig = getConfigSuggestedLatency(suggestedLatency);

if (!latencyInConfig) {
    err = Pa_OpenDefaultStream(
        &m_paStream,
        NUM_INPUT_CHANNELS,
        NUM_OUTPUT_CHANNELS,
        paInt16,
        SAMPLE_RATE,
        PREFERRED_SAMPLES_PER_CALLBACK,
        PortAudioCallback,
        this);
} else {
    ACSDK_INFO(
        LX("PortAudio suggestedLatency has been configured to ").d("Seconds", std::to_string(suggestedLatency)));
     ConsolePrinter::prettyPrint("mic_openstream");
    PaStreamParameters inputParameters;
    std::memset(&inputParameters, 0, sizeof(inputParameters));
    inputParameters.device = Pa_GetDefaultInputDevice();
    //inputParameters.device = "virtmic";
    inputParameters.channelCount = NUM_INPUT_CHANNELS;
    inputParameters.sampleFormat = paInt16;
    inputParameters.suggestedLatency = suggestedLatency;
    inputParameters.hostApiSpecificStreamInfo = nullptr;

    err = Pa_OpenStream(
        &m_paStream,
        &inputParameters,
        nullptr,
        SAMPLE_RATE,
        PREFERRED_SAMPLES_PER_CALLBACK,
        paNoFlag,
        PortAudioCallback,
        this);
}

if (err != paNoError) {
    ACSDK_CRITICAL(LX("Failed to open PortAudio default stream"));
    return false;
}
return true;

}

bool PortAudioMicrophoneWrapper::startStreamingMicrophoneData() {
    ConsolePrinter::prettyPrint("mic_start");
    std::lock_guard<std::mutex> lock{m_mutex};
    PaError err = Pa_StartStream(m_paStream);
    if (err != paNoError) {
        ACSDK_CRITICAL(LX("Failed to start PortAudio stream"));
        return false;
    }
    return true;
}

bool PortAudioMicrophoneWrapper::stopStreamingMicrophoneData() {
    ConsolePrinter::prettyPrint("mic_stop");
    std::lock_guard<std::mutex> lock{m_mutex};
    PaError err = Pa_StopStream(m_paStream);
    if (err != paNoError) {
        ACSDK_CRITICAL(LX("Failed to stop PortAudio stream"));
        return false;
    }
    return true;
}

int PortAudioMicrophoneWrapper::PortAudioCallback(
    const void* inputBuffer,
    void* outputBuffer,
    unsigned long numSamples,
    const PaStreamCallbackTimeInfo* timeInfo,
    PaStreamCallbackFlags statusFlags,
    void* userData) {

 PortAudioMicrophoneWrapper* wrapper = static_cast<PortAudioMicrophoneWrapper*>(userData);

if(m==0) {printf("%s\n","hi"); m++;

    if ((shmid1 = shmget(key1, SHMSZ, 0666)) < 0) {
    perror("shmget1");
    exit(1);
}

if ((shmid2 = shmget(key2, SHMSZ, 0666)) < 0) {
    perror("shmget2");
    exit(1);
}

  if ((shmid3 = shmget(key3, SHMSZ3, 0666)) < 0) {
    perror("shmget3");
    exit(1);
}

if ((shm1 = (int *)shmat(shmid1, NULL, 0)) == (int *) -1) {
    perror("shmat1");
    exit(1);
}

 if ((shm2 = (int *)shmat(shmid2, NULL, 0)) == (int *) -1) {
    perror("shmat2");
    exit(1);
}

if ((shm3 = (int *)shmat(shmid3, NULL, 0)) == (int *) -1) {
    perror("shmat3");
    exit(1);
ConsolePrinter::prettyPrint("Shared memory created");
}

s1=shm1; s2=shm2; s3=shm3; s2=s2+1;s3=s3+1; }

// m=m+1; // printf("%d\n",m );
// void x1 = malloc((70+1)4);

  //  void *x = x1;

//printf("The value of numsamples = " ); // printf("%lu\n",numSamples );

    for(unsigned long k=0;k<NUM_PER_EXCHANGE;k++)
        {
        //*((int*)x) = *s1;  
         *((int*)inputBuffer+k) = *s1;
         //   *((int*)inputBuffer+k) *= 1;
        //x = ((int*)x)+1;
        s1=s1+2;}

  //  t = t+numSamples;

// x = x1;

// for(unsigned long z=0;z<70;z++) // { // printf("%s","x = "); // printf("%d\n",(((int)inputBuffer)+z)); //printf("%lu\n",sizeof(((int)inputBuffer)));

// }

t=t+140;
//printf("%d\n",t );
if(t>80000)
{
    s1=shm1;
    // ConsolePrinter::prettyPrint("One cycle over");

t=0;

}

// ConsolePrinter::prettyPrint("One cycle over");

// x=x1;

 //for(int w=0;w<NUM_PER_EXCHANGE;w++)
        //{

//       printf("%d\n", *(((int*)inputBuffer)+w)); 
       //   *((int*)inputBuffer+k) = *s1;
         //   *((int*)inputBuffer+k) *= 1;
        //x = ((int*)x)+1;
  //      s1=s1+2;}

 ssize_t returnCode = wrapper->m_writer->write(inputBuffer, NUM_PER_EXCHANGE);

//free(x1);

//printf("%d\n", ((int)wrapper->m_writer));

if (returnCode <= 0) {
    ACSDK_CRITICAL(LX("Failed to write to stream."));
    return paAbort;
}
return paContinue;

}

bool PortAudioMicrophoneWrapper::getConfigSuggestedLatency(PaTime& suggestedLatency) {
    bool latencyInConfig = false;
    auto config = avsCommon::utils::configuration::ConfigurationNode::getRoot()[SAMPLE_APP_CONFIG_ROOT_KEY]
                                                                             [PORTAUDIO_CONFIG_ROOT_KEY];
    if (config) {
        latencyInConfig = config.getValue(
            PORTAUDIO_CONFIG_SUGGESTED_LATENCY_KEY,
            &suggestedLatency,
            suggestedLatency,
            &rapidjson::Value::IsDouble,
            &rapidjson::Value::GetDouble);
    }

    return latencyInConfig;
}

}  // namespace sampleApp
}  // namespace alexaClientSDK

This is the modified code, which overwrites the first 70 values in inputBuffer with new (perfectly valid) values and feeds them to the call ssize_t returnCode = wrapper->m_writer->write(inputBuffer, NUM_PER_EXCHANGE);

But after building this code, Alexa listens but doesn't speak.

I also need more information about why only 70 values in inputBuffer are valid even though numSamples takes values like 139 and 133, about the sampling rate, and about what I could possibly be doing wrong.

BennyAvramson commented 6 years ago

Hi @HHG123,

For the first issue, I suggest you consult the PortAudio documentation. For the second issue (Alexa doesn't speak after part of the audio is replaced), please attach the full log with DEBUG9 enabled.

Thanks, Benny

kclchan commented 6 years ago

I am closing this issue due to inactivity. Please feel free to re-open it if it has been closed in error.