RoboTutorLLC / RoboTutor_2019

Main code for RoboTutor. Uploaded 11/20/2018 to XPRIZE from RoboTutorLLC/RoboTutor.

1.8.9.1 record entire session audio #299

Open JackMostow opened 6 years ago

JackMostow commented 6 years ago

To capture audio during oral reading we need to record it in RoboTutor itself rather than in AZ.

  1. Provide an easy way to enable or disable recording by a given .apk -- maybe in an .ini file in the top folder?
  2. For now, start recording at session start and end recording at session end. We might later decide to save space by recording only each ASR activity, or even each sentence.
  3. Use the same filename as the session logs, but with audio extension.
  4. Record in the same audio quality used for ASR input and narrations, at least for .wav files -- 16 kHz 16-bit sampling.
  5. Record as .mp3 because it's half the size of .wav.
  6. If feasible, include meta-information such as tablet id, session id, tutor id, duration, anything useful.
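
To put the space concern in items 2 and 5 into numbers, here is a rough size estimate for the format in item 4. It is only a sketch in Python (chosen for illustration), and it assumes mono audio, which the issue does not state; `pcm_bytes` is a hypothetical helper name.

```python
def pcm_bytes(seconds, sample_rate_hz=16_000, bits_per_sample=16, channels=1):
    """Size in bytes of uncompressed PCM audio of the given duration."""
    # channels=1 (mono) is an assumption, not something the issue specifies.
    return int(seconds * sample_rate_hz * (bits_per_sample // 8) * channels)


one_second = pcm_bytes(1)       # 32000 bytes
one_hour = pcm_bytes(3600)      # 115200000 bytes, roughly 115 MB
half_size = one_hour // 2       # 57600000 bytes if .mp3 really is half the size

print(one_second, one_hour, half_size)
```

At that rate an hour-long session takes on the order of 100 MB uncompressed, which is why recording only during ASR activities is worth keeping as an option.
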
JackMostow commented 6 years ago

@shivenmian - @octavpo found this task harder than expected. You have more experience with audio in Android Studio, so you're better equipped than I am to understand the problem and know possible solutions. Can you discuss it with him to see if you can give him useful advice for how to solve it quickly and robustly?

octavpo commented 6 years ago

I finally have code that solves this issue; it's committed as the 'audio_logging' branch. Besides all kinds of issues related to my unfamiliarity with the code and the Android environment, the main problem was finding a way to provide the audio reliably to the speech recognizer while continuing to record it to the log at the same time. After a few false starts I ended up using a large circular buffer, to which the logging stores all audio it reads from the recorder, and from which the speech recognizer takes audio input as if it came directly from the recorder. In my not-so-extensive testing it seems to be working fine, but it definitely needs more testing.

JackMostow commented 6 years ago

Good!

  1. How much latency does this solution add to the time from when an utterance begins to when the ASR starts to recognize it? Presumably negligible, but if the buffer is quantized into frames of some duration, it would have to receive a completed frame before passing it to the ASR.
  2. How large a buffer, as measured by the duration of input audio it can store?
  3. The ASR should never fall behind real-time, but if it does, which speech does the buffer retain?

    a. What it already contains, i.e. it discards subsequent speech until the ASR catches up.

    b. The newest speech, i.e. it discards speech in the buffer to make room for new input.

    c. Should we even care which, especially if the situation should never arise anyway?

Thanks. - Jack

octavpo commented 6 years ago
  1. The latency should be negligible, the only extra steps are that one thread is writing to the buffer and the other one is reading from it instead of reading directly from the recorder. It could be that the data is already in the buffer by the time the speech recognizer is ready to read, since the logging adds new data continuously, in 1/10 s increments. There are no frames, it just writes the data as it gets it.
  2. Right now the buffer is 1s, which seemed to work fine in my testing, but we can easily make it larger if needed.
  3. Yes we should care which, that was the main problem, that I had to do it so that the logging never discards the latest audio signal, only the oldest one, as otherwise the speech recognizer was losing data occasionally. That's what the circular buffer does, it basically implements a FIFO audio buffer, so the latest 1s is always available. And like I said we can increase that.
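
The behavior described in these answers (a 1 s buffer, written in 1/10 s increments, that overwrites the oldest audio when the reader falls behind) can be sketched as follows. This is an illustrative Python model, not the code on the branch: the class name `AudioRingBuffer`, the lock, and the mono 16 kHz 16-bit sizes are all assumptions.

```python
import threading


class AudioRingBuffer:
    """Fixed-capacity byte FIFO that overwrites the oldest audio when full."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.start = 0                  # index of the oldest unread byte
        self.size = 0                   # number of unread bytes
        self.lock = threading.Lock()    # writer and reader run on different threads

    def write(self, data):
        """Store data; if it does not fit, drop the oldest bytes to make room."""
        data = bytes(data)[-self.capacity:]     # more than capacity can never fit
        with self.lock:
            overflow = self.size + len(data) - self.capacity
            if overflow > 0:                    # discard the oldest audio
                self.start = (self.start + overflow) % self.capacity
                self.size -= overflow
            end = (self.start + self.size) % self.capacity
            first = min(len(data), self.capacity - end)
            self.buf[end:end + first] = data[:first]
            self.buf[0:len(data) - first] = data[first:]    # wrap-around part
            self.size += len(data)

    def read(self, n):
        """Return up to n of the oldest unread bytes, in FIFO order."""
        with self.lock:
            n = min(n, self.size)
            first = min(n, self.capacity - self.start)
            out = bytes(self.buf[self.start:self.start + first])
            out += bytes(self.buf[0:n - first])             # wrap-around part
            self.start = (self.start + n) % self.capacity
            self.size -= n
            return out


SAMPLE_RATE = 16_000                                      # Hz, assumed
BYTES_PER_SAMPLE = 2                                      # 16-bit, assumed mono
ring = AudioRingBuffer(SAMPLE_RATE * BYTES_PER_SAMPLE)    # 1 s of audio
chunk = SAMPLE_RATE * BYTES_PER_SAMPLE // 10              # 1/10 s of audio

# The logger writes 1.2 s of audio while the recognizer is not reading.
for i in range(12):
    ring.write(bytes([i]) * chunk)

oldest = ring.read(chunk)
print(oldest[0])    # 2: chunks 0 and 1 were overwritten, the newest 1 s survived
```

With this policy a stalled reader loses the oldest audio rather than the newest, so the most recent second is always available, which is option (b) in the question above.
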
JackMostow commented 6 years ago

@octavpo -

  1. Please test running recording-enabled RoboTutor together with audio-DISabled AZ Screen Recorder.
  2. If necessary it's ok to generate an apk for recording-enabled RoboTutor, but eventually I'd prefer to use a small separate configuration file for all such settings so that we don't need multiple apks except for truly different versions of the source code.
octavpo commented 6 years ago

I have no idea how to run this audio-DISabled AZ Screen Recorder. Who does, where can I find it?

JackMostow commented 6 years ago
  1. Get AZ Screen Recorder from the Google App store. The andrewcarnegie account already paid for it.

  2. Install and launch AZ Screen Recorder on an Android.

  3. In its settings, turn off audio recording. It's probably turned off already. It claims that audio recording is no longer allowed, but it works when turned on.

  4. Start recording by tapping the camcorder icon.

octavpo commented 6 years ago

I've tested it, it's working fine.

JackMostow commented 6 years ago

@octavpo - Great! Please implement #311 ASAP so as to turn session audio recording on or off in a configuration file. @kevindeland - Session audio recording is a capability I want to include in the upcoming experimental deployment.

JackMostow commented 6 years ago

@octavpo and @kevindeland - To get screen capture video in the beta test without disabling ASR, we need to turn off audio capture in AZ Screen Recorder and record silent screen video. If we want audio, we need to enable the session audio recording that @octavpo implemented.

@octavpo -

  1. Does the apk you just posted with word-skipping include session audio recording?
  2. If so, please send a config file that enables it, or is it enabled by default?
  3. If not, did you (long ago) do a pull request for session audio recording?

@kevindeland - Did you fulfill that pull request? If so, do you have a config file that enables screen audio recording?

Thanks. - Jack

octavpo commented 6 years ago

Yes, the apk does include my session audio recording code. But like I just said in the other issue, it looks like it was disabled, so I re-enabled it. Right now I don't think it can be enabled through a config file; at least I haven't worked on it, and I'm not sure if anybody else did.

If we want that, we need to only control saving the signal to a file, not the recording itself, as otherwise speech recognition doesn't have a signal anymore. Does this have higher priority than the remaining issues in #348?

JackMostow commented 6 years ago

@octavpo (and @kevindeland) - In order to audio-enable screen recording, we omitted story.read in order to QA the new activities. But we can't pull that trick when we beta-test RoboTutor in its entirety, and then we'll need session audio.

  1. Why did the recorded session audio I listened to sound so sped-up?
  2. (How) can it be played back to sync with screen video recordings?
octavpo commented 6 years ago

As I was saying in a response last week, I haven't seen this speed-up in my testing on my branch. I could try again with the current development branch if I can get a current config file. Might be something specific to your tablet, could you also try on a different one? Or maybe after restarting it?

If it's still happening and we can't find the cause, I guess the audio could be slowed down if we could figure out the speed-up rate. I see Audacity does have a facility to vary the playback speed.
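
One way to estimate the speed-up rate and slow a recording down without Audacity is sketched below in Python (chosen only for illustration). It assumes the session's real duration is known from the session log, and that the speed-up comes from a sample-rate mismatch rather than from dropped audio; if samples were dropped, this corrects the pitch but cannot restore what is missing. Both function names are hypothetical.

```python
import wave


def speedup_factor(wav_path, session_seconds):
    """Ratio of the real session length to the length the file claims to have."""
    with wave.open(wav_path, "rb") as w:
        audio_seconds = w.getnframes() / w.getframerate()
    return session_seconds / audio_seconds


def rewrite_with_rate(src_path, dst_path, factor):
    """Copy the samples unchanged but declare a sample rate lowered by factor."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.setframerate(round(params.framerate / factor))
        dst.writeframes(frames)
```

For example, a file that claims to last 60 s for a session that actually lasted 120 s has a factor of 2.0, and rewriting it declares half the original sample rate so that players stretch it back to 120 s.
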

JackMostow commented 6 years ago

@octavpo -- They're in QA - Alternative config files under CodeDrop2 QA.

Yes, please clone the current development branch as your starting point. It's 452 commits ahead of master. Your audio_targets branch (if that's where you implemented session audio recording) is only 22 commits ahead of master. I believe the development branch has been serving as de facto master.

@kevindeland -

  1. Is this advice correct?
  2. Did you mention changing the recording to save space? Using what method, where in the code?

Thanks. - Jack

octavpo commented 6 years ago

I've retested with the current development branch and there's still no speed-up on my tablet. The tests last week were done on the reading_modes branch, not the audio_targets branch, so they had also been updated to development, but had my extra code included. It seems it doesn't make a difference, as expected.

JackMostow commented 6 years ago

@kevindeland - Do session audio recordings on your tablet sound sped-up? - Jack

JackMostow commented 6 years ago

@octavpo - Kevin hears the same problem I do, so the problem is not tablet-specific. Please post a brief session audio recording so we can listen to it.

octavpo commented 6 years ago

I put a wav file in the Downloads from any team member folder.

JackMostow commented 6 years ago

@octavpo - I agree that RoboTutor_debug.release_dbg_000002_2018.10.12.20.52.48_6113001086.wav sounds fine. I'm not sure why it has breath noise. Does it record mic input whether ASR is running or not?

@kevindeland - It sounds fine both in whatever the default audio app is on my PC, and in Music Player for Google Drive. In order to test it properly, we need:

  1. To turn on the feature. Can you add the switch for it to the config file?
  2. To run it with ASR running, to make sure that's not what's causing the speedup. But we got rid of the READ activity so that we could enable audio recording in AZ Screen Recorder. What's the easiest way to bring it back? I can test an apk but not with Android Studio.
JackMostow commented 6 years ago

@judithodili - It's great to see screen videos coming in from Bagamoyo for the beta test -- but why are they silent? I suppose it's ok because:

  1. They're still informative.
  2. We'll need audio recording disabled in AZ Screen Recorder for the integrated-matrix version with ASR.
JackMostow commented 6 years ago

@octavpo - Still puzzling why your recording is good and ours weren't. Did you run your apk from Android Studio? @kevindeland - I was going to ask Octav to record using the apk now being used at our beta sites -- but I realized that without the config switch for audio recording, he can't.

octavpo commented 6 years ago

> I'm not sure why it has breath noise. Does it record mic input whether ASR is running or not?

Yes it's always recording from the mic.

> Still puzzling why your recording is good and ours weren't. Did you run your apk from Android Studio?

Yes, but last week I also ran an installed apk and I was getting the same result. And I think you were getting the speed-up when you ran it. I could try your test apk. I'd also need your config file.

JackMostow commented 6 years ago

The apk and two config files (with and without the debugger menu) are at Share with VMC - CD2, which I just gave you read access to. However, until the apk is modified to control recording via the config file, I don't see how you can get it to record. @kevindeland, please take note.

JackMostow commented 6 years ago

@octavpo - AZ Screen Recorder gave the error "Cannot record sound. Microphone is busy." shown in [Bagamoyo Beta Testing](https://docs.google.com/document/d/1_xqxg2iDaCzyxdKnf14Nj8e1kDIYGkYIIxm99ZhECpc/edit?ts=5bc63306) when Fortunatus tried to enable its audio recording. Would your implementation of session audio recording cause this error because it routes audio input even when not using it?

@judithodili and @kevindeland - I assume the answer is yes, in which case even though the current beta omits all activities that use ASR, screen video must be silent. This was our plan anyway, except that we planned to use RoboTutor's session audio recording. So far it works for Octav but captures sped-up audio for us, for reasons that will be hard to diagnose until Kevin implements a configuration variable to control whether session audio is saved to a file.

octavpo commented 6 years ago

No my session audio recording shouldn't be an issue when not in use. Not sure why you're thinking it's not currently in use though. I don't know what configuration Kevin has deployed for site testing, but if it's exactly what's on the development branch, I see audio recording is enabled. So the sites should be recording audio and storing it to files, which would explain why AZ Screen Recorder can't do it. I'll try later with the apk you've provided above, I'm on a plane now.

Maybe the confusion is between session audio recording and ASR. If you remember our discussion a long time ago when I was working on the feature, the two are independent but share the same audio signal, which is provided by the session audio recording. So we can have audio recording even if you're not using ASR. But we can't use ASR if audio recording is turned off, because it wouldn't get any audio signal. However, we could disable saving the audio signal to a file if we want to use ASR but not clog servers with our audio files. So we might need two different config parameters. The only reason to disable audio recording would be to let an outside program record it, like the AZ Screen Recorder.

JackMostow commented 6 years ago

@octavpo and @kevindeland -- By "recording" in RoboTutor, I mean saving to a file the audio signal that it is already capturing to make available for ASR when needed. Android lets only one apk control the mic, which RoboTutor is doing even when not saving audio to a file. We just need a config file variable that controls whether to save the audio to a file, because we can't always afford to do so. We don't (at present) need one to control whether to capture audio in the first place.
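
The distinction drawn here (capture always, save only when configured) could look roughly like the sketch below. It is written in Python purely for illustration; the JSON format, the key name `save_session_audio`, and the class `SessionAudio` are hypothetical and not taken from RoboTutor's actual config file.

```python
import io
import json

DEFAULTS = {"save_session_audio": False}    # hypothetical key name


def load_settings(text):
    """Parse config text, falling back to defaults for missing keys."""
    settings = dict(DEFAULTS)
    settings.update(json.loads(text))
    return settings


class SessionAudio:
    """Always captures audio for ASR; writes it to the log only if enabled."""

    def __init__(self, settings, sink):
        self.save = bool(settings["save_session_audio"])
        self.sink = sink        # file-like object standing in for the audio log
        self.for_asr = []       # stand-in for the buffer shared with the ASR

    def on_audio(self, chunk):
        self.for_asr.append(chunk)      # capture is unconditional
        if self.save:
            self.sink.write(chunk)      # saving is the only thing the flag controls


log_off = io.BytesIO()
off = SessionAudio(load_settings("{}"), log_off)
off.on_audio(b"\x01\x02")

log_on = io.BytesIO()
on = SessionAudio(load_settings('{"save_session_audio": true}'), log_on)
on.on_audio(b"\x01\x02")

print(len(log_off.getvalue()), len(log_on.getvalue()))    # 0 2
```

In both cases the ASR side still receives the audio, so turning the flag off saves storage without breaking speech recognition.
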

octavpo commented 6 years ago

I have run the apk from Share with VMC - CD2 with the accompanying libraries. The apk does record audio and I still don't get any speed-up, see the new wave file in the Download area.

JackMostow commented 6 years ago

@octavpo - I didn't realize that 2.6.3.1 writes audio files to the "RoboTutor Audio" folder, initially as .raw, then when RoboTutor exits normally, converted to .wav (presumably by prepending header information).

  1. @judithodili - This means that the current deployed version is recording audio at both beta sites.

    a. If it's good, we may want it.

    b. To tell if it's good (I suspect not), I'll ask Fortunatus to upload one.

    c. Either way, they're taking up space on the tablets.

  2. I just compared one of yours with one of mine:

    a. Yours sounds OK. It's at RoboTutor_release.release_sw_000003_2018.10.16.23.11.11_6113001086.wav.

    b. Mine sounds sped-up. It's at RoboTutor_release.release_dbg_000037_2018.10.17.09.25.44_6105001158.wav.

  3. Both have Bit rate 256kbps as their only Audio information under file properties.

    a. Yours (RoboTutor_release.release_sw_000003_2018.10.16.23.11.11_6113001086.wav) starts: "RIFF¤ WAVEfmt    €> }   data€"

    b. Mine (RoboTutor_release.release_dbg_000037_2018.10.17.09.25.44_6105001158.wav) starts: "RIFF¤«$ WAVEfmt    €> }   data€"

    c. Their headers start identically except for the "«$" in mine. To test whether this difference matters, I replaced "RIFF¤«$" with "RIFF¤" in mine, but it still sounded sped-up.

    d. I can't tell if they have the same encoding (mu law vs. linear), but their bodies look very similar.

  4. Both were recorded on Pixel tablet model C1502W, according to Who has which tablet(s):

    a. 2018-07-19 | Octav | Kevin Willows XPRIZE loaner (+ keyboard) |   |   | silver 64GB Google Pixel C1502W + keyboard | SKU GA3A00219-A24 | 6113001086 | Android 6.0.1

    b. 2018-07-19 | Jack (previously Iris) | Jack Pixel |   |   | C1502W 32GB Google Pixel C + keyboard |   | 6105001158 | Android 6.0.1

  5. Mine now runs Android 7.1.2; does yours?

All - Besides asking Fortunatus for a sample, what should we try next to resolve the mystery?
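
For reference, the .raw to .wav step mentioned at the top of this comment amounts to wrapping the PCM samples in a header. A minimal Python sketch, assuming mono 16 kHz 16-bit audio (the function name `raw_to_wav` and its defaults are illustrative, not RoboTutor's code):

```python
import wave


def raw_to_wav(raw_path, wav_path, sample_rate=16_000, sample_width=2, channels=1):
    """Wrap headerless PCM samples in a WAV container."""
    with open(raw_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(channels)        # assumed mono
        w.setsampwidth(sample_width)    # bytes per sample: 2 means 16-bit
        w.setframerate(sample_rate)     # the rate that players will trust
        w.writeframes(pcm)              # the header sizes are filled in on close
```

Players rely on the sample rate declared in the header, so it has to match the rate at which the audio was actually captured for playback to run at normal speed.
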

octavpo commented 6 years ago

The different word in the header is the file length, so it doesn't matter.
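
The header fields can also be decoded rather than compared by eye, which would settle the mu-law versus linear question too. The Python sketch below assumes the canonical 44-byte layout with a 16-byte fmt chunk and builds a sample header in memory instead of reading the actual files; `parse_wav_header` is a hypothetical helper. If the headers quoted above were displayed as Windows-1252 text, the "€>" and "}" in both would be consistent with a 16000 Hz sample rate and 32000 bytes per second.

```python
import struct


def parse_wav_header(data):
    """Decode the fixed fields of a canonical 44-byte RIFF/WAVE header."""
    riff, riff_size, wave_id = struct.unpack_from("<4sI4s", data, 0)
    (fmt_id, fmt_size, fmt_tag, channels,
     rate, byte_rate, block_align, bits) = struct.unpack_from("<4sIHHIIHH", data, 12)
    if riff != b"RIFF" or wave_id != b"WAVE" or fmt_id != b"fmt ":
        raise ValueError("not a canonical WAV header")
    encodings = {1: "linear PCM", 6: "A-law", 7: "mu-law"}
    return {
        "riff_size": riff_size,     # file length minus 8; differs between files
        "encoding": encodings.get(fmt_tag, "other (%d)" % fmt_tag),
        "channels": channels,
        "sample_rate": rate,
        "byte_rate": byte_rate,
        "bits_per_sample": bits,
    }


# Sample header for 2 s of mono 16 kHz 16-bit linear PCM (64000 data bytes).
header = struct.pack(
    "<4sI4s4sIHHIIHH4sI",
    b"RIFF", 36 + 64000, b"WAVE",
    b"fmt ", 16, 1, 1, 16000, 32000, 2, 16,
    b"data", 64000,
)
info = parse_wav_header(header)
print(info["encoding"], info["sample_rate"], info["byte_rate"])

# Little-endian bytes of the two rates, as they would sit in the file.
print(struct.pack("<I", 16000))    # b'\x80>\x00\x00'
print(struct.pack("<I", 32000))    # b'\x00}\x00\x00'
```

Running the parser on the first 44 bytes of each recording would show whether the two files differ in anything other than the length field.
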

My tablet runs Android 8.1.0. I doubt that's the issue, but it's possible.

I was suggesting before that you could try to restart the tablet, or to reset it to factory settings.

Or maybe make sure you don't have any extra zips in the Download folder besides what's in the CD2 folder and reinstall RoboTutor.

JackMostow commented 6 years ago

@octavpo - Thanks, but the fact that both Kevin and I encountered this problem, along with the fact that restarting didn't fix the problem, and the absence of a plausible reason to expect any of the other steps to work, suggests that they would just waste time. Do you have a 7.1.2 tablet you can test on? If so, please do ASAP.

@kevindeland - Do you have an Android 8 tablet you can test on? I don't think we should switch to Android 8, but at least it would narrow down the problem.

Both - Any other ideas?

octavpo commented 6 years ago

No, I don't have a 7.1.2 tablet. Even if I get one I'd still need to do the steps above. It's quite probable there's some setting on you guys' tablets that makes it do that, which could come from running some app or some older version of RoboTutor that I haven't run. Or something in some extra zip file that I'm not loading.

JackMostow commented 6 years ago

@judithodili - URGENT BUT QUICK: Please have all your testers with tablets take ~1 minute ASAP to:

  1. Find the "RoboTutor Audio" folder on the tablet.
  2. Find a .wav file in it.
  3. Listen to enough of it to tell whether it sounds sped-up or normal. It should be apparent immediately.
  4. Tell us which, and which tablet(s) you checked.

Thanks! - Jack