IBM / watson-banking-chatbot

A chatbot for banking that uses the Watson Assistant, Discovery, Natural Language Understanding and Tone Analyzer services.
https://developer.ibm.com/patterns/create-cognitive-banking-chatbot/
Apache License 2.0

How can I implement speech to text? #73

Closed DanielGTC closed 5 years ago

DanielGTC commented 6 years ago

I am using Chrome's WebKit speech recognition since it works much better for my purposes. But how do I integrate it correctly? My text does not get analysed by Watson when I do it like this:

 if(finalTranscripts.includes('?')){
     API.sendRequest(finalTranscripts, this);
 }

Can someone help me out?

ptrikkur commented 6 years ago

Daniel, can you mail me some more details on your code?

DanielGTC commented 6 years ago

Hi Ptrikkur, thanks for your reply!

This is my index.html file. I changed it for my purposes and added Chrome's WebKit speech recognition. The important part is between line 96 and 162. I assume that I need to do something with "conversationPanel", which is defined in conversation.js, but I am not sure how or what. I have already tried many things but I just can't get it right. Speech recognition picks up my voice very well, but I can't get the recognized text into the chatbox and have it processed by the Watson API.

I have been struggling with this problem for quite a while. I would be grateful for any kind of help!

<!--
 * Copyright 2017 IBM Corp. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the 'License'); you may not
 * use this file except in compliance with the License. You may obtain a copy of
 * the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an 'AS IS' BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations under
 * the License.
-->
<html>
<head>
  <base href="/">
  <title>Open Financial Cognitive Conversation</title>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <meta property="og:image" content="conversation.svg" />
  <meta property="og:title" content="Conversation Chat Simple" />
  <meta property="og:description" content="Sample application that shows how to use the Conversation API to identify user intents" />
  <link rel="shortcut icon" href="favicon.ico" type="image/x-icon">
  <link href="https://netdna.bootstrapcdn.com/bootstrap/3.1.0/css/bootstrap.min.css"  rel="stylesheet">
  <link rel="stylesheet" href="css/app.css">
  <style type="text/css">

.stepwizard-step p {
    margin-top: 10px;
}

.stepwizard-row {
    display: table-row;
}

.stepwizard {
    display: table;
    width: 100%;
    position: relative;
}

.stepwizard-step button[disabled] {
    opacity: 1 !important;
    filter: alpha(opacity = 100) !important;
}

.stepwizard-step {
    display: table-cell;
    text-align: center;
    position: relative;
}

.btn-circle {
    width: 30px;
    height: 30px;
    text-align: center;
    padding: 6px 0;
    font-size: 12px;
    line-height: 1.428571429;
    border-radius: 15px;
}

.modal-header {
    padding-bottom: 5px;
}

.nextBtn {
    padding: 10px 16px;
    font-size: 18px;
}

</style>
<script src="https://code.jquery.com/jquery-1.10.2.min.js"></script>
<script src="https://netdna.bootstrapcdn.com/bootstrap/3.1.0/js/bootstrap.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jspdf/1.3.2/jspdf.debug.js"></script>
</head>

  <body class="tutorial" onLoad="initPage()">

<body>
  <div id="contentParent" class="responsive-columns-wrapper">
    <div id="chat-column-holder" class="responsive-column content-column">
      <div class="chat-column">
        <div id="scrollingChat"></div>

        <!--  <label for="textInput" class="inputOutline">
          <input id="textInput" class="input responsive-column"
            placeholder="Type something" type="text" value=""
            onkeydown="/*globals CanvasJS */
            ConversationPanel.inputKeyDown(event, this)">
        </label> -->
<label for="textInput" class="inputOutline">
<div id="textInput" class="input responsive-column" type="text" placeholder="Type something" ></div></label>

<button onclick="startConverting();"><i class="fa fa-microphone"></i></button>
<script type="text/javascript">

    var r = document.getElementById('textInput');
    function startConverting (){

        if('webkitSpeechRecognition' in window){
    var speechRecognizer = new webkitSpeechRecognition();
    speechRecognizer.continuous = true;
    speechRecognizer.interimResults = true;
    speechRecognizer.lang = 'en-IN';
    speechRecognizer.start();

    var finalTranscripts = '';

    speechRecognizer.onresult = function(event){
          var interimTranscripts = '';
          for(var i = event.resultIndex; i < event.results.length; i++){
            var transcript = event.results[i][0].transcript;
            // replace() returns a new string, so the result must be assigned back
            transcript = transcript.replace(/\n/g, "<br>");
            if(event.results[i].isFinal){
                finalTranscripts += transcript;
            }else{
                interimTranscripts += transcript;
            }}

          if(finalTranscripts.includes('two')){
            window.alert('It works');
          }

          if(finalTranscripts.includes('ask')){
            Api.sendRequest(finalTranscripts, this);
          }

          r.innerHTML = finalTranscripts + '<span style="color:#999">' + interimTranscripts + '</span>';
    };

    speechRecognizer.onerror = function (event) {

    };

}else {
    r.innerHTML = 'please update google chrome ';
}

    }

</script>

        <!--        <button id="speak-btn">Speak</button>-->
        <audio autoPlay="true" id="audio"
               className="audio" 
               controls="controls">
              Your browser does not support the audio element.
        </audio>

      </div>
    </div>
</div>

<!-- line modal -->
<div class="modal fade" id="squarespaceModal" tabindex="-1" role="dialog" aria-labelledby="modalLabel" aria-hidden="true">
  <div class="modal-dialog">
    <div class="modal-content">
        <div class="modal-header">
            <button type="button" class="close" data-dismiss="modal"><span aria-hidden="true">×</span><span class="sr-only">Close</span></button>
            <h3 class="modal-title" id="lineModalLabel">Home Loan Application</h3>
        </div>
        <div class="modal-body">

            <!-- content goes here -->
            <div class="stepwizard">
                <div class="stepwizard-row setup-panel">
                    <div class="stepwizard-step">
                        <a href="#step-1" type="button" class="btn btn-primary btn-circle">1</a>
                        <p>Applicant Details</p>
                    </div>
                    <div class="stepwizard-step">
                        <a href="#step-2" type="button" class="btn btn-default btn-circle" disabled="disabled">2</a>
                        <p>Loan Details</p>
                    </div>
                    <div class="stepwizard-step">
                        <a href="#step-3" type="button" class="btn btn-default btn-circle" disabled="disabled">3</a>
                        <p>Aadhaar eKYC</p>
                    </div>
                    <div class="stepwizard-step">
                        <a href="#step-4" type="button" class="btn btn-default btn-circle" disabled="disabled">4</a>
                        <p>eSign Application form</p>
                    </div>
                </div>
            </div>
            <form role="form" id="form1">
            <div class="row setup-content" id="step-1">
                <div class="col-xs-12">
                    <div class="col-md-12">
                        <h3>Applicant Details</h3>
                        <div class="form-group">
                            <label class="control-label">Full Name</label>
                            <input maxlength="50" type="text" id="nameInput" required="required" class="form-control" placeholder="Enter Full Name" />
                        </div>
                        <div class="form-group">
                            <label class="control-label">PAN number (10 character alphanumeric)</label>
                            <input maxlength="10" minlength="10" type="text" required="required" class="form-control" placeholder="Enter PAN number" pattern="[a-zA-Z0-9]+"/>
                        </div>
                        <div class="form-group">
                            <label class="control-label">Aadhaar number (12 digit)</label>
                            <input min="100000000000" max="999999999999" type="number" id="aadharInput" required="required" class="form-control" placeholder="Enter Aadhaar number" />
                        </div>
                        <div class="form-group">
                            <label class="control-label">Mobile number (10 digit)</label>
                            <input min="1000000000" max="9999999999" type="number" id="phoneInput" required="required" class="form-control" placeholder="Enter Mobile number" />
                        </div>
                        <button class="btn btn-primary nextBtn btn-lg pull-right" type="button">Next</button>
                    </div>
                </div>
            </div>
            <div class="row setup-content" id="step-2">
                <div class="col-xs-12">
                    <div class="col-md-12">
                        <h3>Loan Details</h3>
                        <div class="form-group">
                            <label class="control-label">Loan Amount (in the range 10 to 40 lakhs)</label>
                            <input min="1000000" max="4000000" type="number" required="required" class="form-control" placeholder="Amount in Rs" />
                        </div>
                        <div class="form-group">
                            <label class="control-label">Loan tenure (in the range 5 to 20 years)</label>
                            <input min="5" max="20" type="number" required="required" class="form-control" placeholder="Tenure in years" />
                        </div>
                        <div class="form-group">
                            <label class="control-label">Interest rate</label>
                            <input maxlength="100" type="text" class="form-control" placeholder="8.25%" disabled="disabled"/>
                        </div>
                        <button class="btn btn-primary nextBtn btn-lg pull-right" type="button">Next</button>
                    </div>
                </div>
            </div>
            <div class="row setup-content" id="step-3">
                <div class="col-xs-12">
                    <div class="col-md-12">
                        <h3>Aadhaar eKYC</h3>
                        <div class="form-group">
                            <label class="control-label">Aadhaar No.</label>
                            <input type="text" class="form-control aadhaarBox" placeholder="Aadhaar No." readonly="readonly"/>
                            <button class="btn btn-primary" type="button" onclick="eKYCOTP()">Get OTP</button>
                        </div>
                        <div class="form-group">
                            <label class="control-label">OTP (6 digit)</label>
                            <input min="100000" max="999999" type="number" class="form-control" id="ekycOtp" placeholder="OTP" required="required" readonly="readonly"/>
                        </div>
                        <hr>
                        <label class="control-label">CONSENT FOR AADHAAR AUTHENTICATION</label>
                        <div class="checkbox">
                          <label><input type="checkbox" required="required">I am the holder of above Aadhaar Number. I hereby agree to authenticate myself using Aadhaar through Open Financial Bank, and provide my consent to collect my Aadhaar and OTP, to retrieve my details along with my email ID / mobile number (if available) from UIDAI. I have understood Open Financial Bank's declaration that, my identity information will only be used for subscribing the service. I have understood that, my biometrics / OTP is encrypted and will not be stored / shared and will be submitted to UIDAI (CIDR) only for the purpose of this transaction.</label>
                        </div>
                        <button class="btn btn-primary nextBtn btn-lg pull-right" type="button">Next</button>
                    </div>
                </div>
            </div>
            <div class="row setup-content" id="step-4">
                <div class="col-xs-12">
                    <div class="col-md-12">
                        <h3>Review Application form and eSign</h3>
                        <object type="application/pdf" data="" width="98%" height="500px" id="embedPdf" download='FileName'>No Support
                        </object>
                        <hr>
                        <div class="form-group">
                            <label class="control-label">Aadhaar No.</label>
                            <input type="text" class="form-control aadhaarBox" placeholder="Aadhaar No." readonly="readonly"/>
                            <button class="btn btn-primary" type="button" onclick="eSignOTP()">Get OTP</button>
                        </div>
                        <div class="form-group">
                            <label class="control-label">OTP (6 digit)</label>
                            <input min="100000" max="999999" type="number" class="form-control" id="esignOtp" placeholder="OTP" required="required" readonly="readonly"/>
                        </div>
                        <hr>
                        <label class="control-label">CONSENT FOR AADHAAR AUTHENTICATION</label>
                        <div class="checkbox">
                          <label><input type="checkbox" required="required">I am the holder of above Aadhaar Number. I hereby agree to authenticate myself using Aadhaar through Open Financial Bank, and provide my consent to collect my Aadhaar and OTP, to retrieve my details along with my email ID / mobile number (if available) from UIDAI. I have understood Open Financial Bank's declaration that, my identity information will only be used for subscribing the service. I have understood that, my biometrics / OTP is encrypted and will not be stored / shared and will be submitted to UIDAI (CIDR) only for the purpose of this transaction.</label>
                        </div>

                        <button class="btn btn-success btn-lg pull-right" id="finalBtn" type="button">eSign</button>
                    </div>
                </div>
            </div>
        </form>

        </div>
    </div>
  </div>
</div>

<script
  src="https://code.jquery.com/jquery-3.1.1.min.js"
  integrity="sha256-hVVnYaiADRTO2PzUGmuLJr8BLUSjGIZsDYGmIJLv2b8="
  crossorigin="anonymous"></script>
  <script src="js/jquery-ajax-native.js"></script>

  <script src="js/modal.js"></script>
  <script src="js/common.js"></script>
  <script src="js/api.js"></script>
  <script src="js/conversation.js"></script>
  <script src="js/global.js"></script>
  <script src="js/tts-custom.js"></script>
  <script src="js/z2c-speech.js"></script>
  <script src="js/watson-speech.js"></script>
    <script src="js/jquery-3.1.0.min.js"></script>

</body>
</html>

ptrikkur commented 6 years ago

Are you having trouble with the Conversation Service or with Speech Recognition?

DanielGTC commented 6 years ago

Speech recognition works fine. This is how the standard program works with normal text input via the keyboard:

<!--  <label for="textInput" class="inputOutline">
          <input id="textInput" class="input responsive-column"
            placeholder="Type something" type="text" value=""
            onkeydown="/*globals CanvasJS */
            ConversationPanel.inputKeyDown(event, this)">
        </label> -->

I type something, press Enter, and the text goes to the chatbox and gets analysed by Watson.

Now I am trying to do the same thing with speech recognition. On the word "ask", the text should be sent to the chatbox and analysed by Watson, exactly like before. The only change is that "type text and press Enter to send" becomes "speak and say 'ask' to send". As I said, speech recognition works fine; only the handling of the text after saying the word "ask" does not work.

ptrikkur commented 6 years ago

Is it a specific requirement to add 'ask' to indicate end of the sentence?

DanielGTC commented 6 years ago

No, it's not. It could be a pause or something else as well.

ptrikkur commented 6 years ago

Daniel, it is not possible to detect 'ask' until the audio is recognized and the text goes into the conversation service. Here is what I do with Watson STT; you should be able to find a similar solution with webkitSpeechRecognition.

1) I use a WebSocket connection to send the audio to the server.
2) I use the onMessage event to receive the transcript (partial or interim). There is no need to wait for an end-of-sentence word like 'ask'. It would be cumbersome for the users too, and prone to error.
3) I send the message to the conversation service to be analyzed.
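
With webkitSpeechRecognition, one way to approximate the steps above is to forward an utterance as soon as the browser marks it final (typically after a short pause), rather than waiting for a keyword. The following is only a minimal sketch, not code from this pattern: `collectFinalTranscript` and `makeResultHandler` are hypothetical helper names, and the commented wiring assumes that api.js exposes `Api.getResponsePayload()` and that the second argument of `Api.sendRequest` is the conversation context rather than the recognizer object.

```javascript
// Gather the text of every finalized result in a SpeechRecognition-style event.
// Interim (not yet final) results are skipped so nothing is sent mid-sentence.
function collectFinalTranscript(event) {
  var text = '';
  for (var i = event.resultIndex; i < event.results.length; i++) {
    var result = event.results[i];
    if (result.isFinal) {
      text += result[0].transcript;
    }
  }
  return text.trim();
}

// Build an onresult handler. `send` is a hypothetical callback that receives
// each finished, non-empty utterance.
function makeResultHandler(send) {
  return function (event) {
    var utterance = collectFinalTranscript(event);
    if (utterance) {
      send(utterance);
    }
  };
}

// Possible wiring in the page (assumptions: Api.getResponsePayload() exists and
// Api.sendRequest(text, context) expects the previous response's context):
// speechRecognizer.onresult = makeResultHandler(function (text) {
//   var latest = Api.getResponsePayload();
//   Api.sendRequest(text, latest ? latest.context : undefined);
// });
```

Keeping the transcript handling separate from the call into the chatbot makes it possible to check the recognition logic on its own before connecting it to the conversation service.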

DanielGTC commented 6 years ago

Great, thank you, that's already a lot of help. Would you mind showing me some of your code for this issue? The 3rd point in particular is giving me a hard time.

markstur commented 5 years ago

Closing this. It looks like some questions were answered, but this is not an STT pattern, so let's not leave this open forever. Maybe refer to https://www.ibm.com/watson/services/speech-to-text/