mdn / ai-feedback

Feedback about AI Help

[AI Help] Question: How can I record audio using the LINEAR16 encoding? #26

Open stevehs17 opened 1 year ago

stevehs17 commented 1 year ago

What is the issue with the answer?

The code in the answer defines an options object, in which the mimeType is specified. However, the options object isn't connected to the MediaRecorder object that gets created. The options object should apparently be passed as an argument to the MediaRecorder constructor. This isn't done in the code provided.

How would you categorize this issue?

What did you ask?

  1. How can I record audio using the LINEAR16 encoding?

What did AI Help answer?

To record audio using the LINEAR16 encoding, you can utilize the MediaStream Recording API. This API allows you to easily record audio and/or video streams. When used in conjunction with the MediaRecorder interface, you can record audio from the user's input devices and use the resulting audio in web applications.

Here is an example of how you can use the MediaStream Recording API to record audio:

// Get the audio stream from the user's input device
navigator.mediaDevices.getUserMedia({ audio: true })
  .then(function(stream) {
    // Create a new MediaRecorder instance with the audio stream
    const mediaRecorder = new MediaRecorder(stream);

    // Set the desired audio encoding format to LINEAR16
    const options = { mimeType: 'audio/wav' };

    // Start recording
    mediaRecorder.start();

    // Event listener for when data is available
    mediaRecorder.addEventListener('dataavailable', function(event) {
      // Handle the recorded audio data here
      const audioData = event.data;
      // ...
    });

    // Event listener for when recording is stopped
    mediaRecorder.addEventListener('stop', function() {
      // Handle any cleanup or finalization here
      // ...
    });

    // Stop recording after a certain duration
    setTimeout(function() {
      mediaRecorder.stop();
    }, 5000); // Stop recording after 5 seconds
  })
  .catch(function(error) {
    // Handle any errors that occur during the getUserMedia() call
    console.error('Error accessing user media:', error);
  });

In the above example, we first use the getUserMedia() method to obtain the audio stream from the user's input device. Then, we create a new MediaRecorder instance with the audio stream. We set the desired audio encoding format to LINEAR16 by specifying the MIME type as 'audio/wav'. Finally, we start recording and handle the recorded audio data in the dataavailable event listener.

Please note that the specific encoding options for LINEAR16 may vary depending on the codec and tools you are using for encoding. The documentation does not provide specific details on LINEAR16 encoding parameters.
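A minimal sketch of the fix described in the report above: the options object has to be passed as the second argument to the MediaRecorder constructor, otherwise the mimeType is never applied. The helper name createRecorder and its injectable Recorder parameter are illustrative assumptions, not something from the thread. The thread does not confirm that any browser accepts 'audio/wav' as a recording type, so the sketch asks MediaRecorder.isTypeSupported() first and falls back to the browser default when nothing matches.

```javascript
// Hypothetical helper: build a recorder with the first supported MIME type.
// `Recorder` defaults to the browser's MediaRecorder; it is a parameter only
// so the function can be exercised outside a browser with a stand-in class.
function createRecorder(stream, preferredTypes, Recorder = globalThis.MediaRecorder) {
  // isTypeSupported() is a static method on MediaRecorder.
  const mimeType = preferredTypes.find((type) => Recorder.isTypeSupported(type));

  // Only set mimeType when a candidate is supported; an unsupported value
  // would make the constructor throw.
  const options = mimeType ? { mimeType } : {};

  // The options object must be handed to the constructor to have any effect.
  return new Recorder(stream, options);
}
```

In a browser this might be called as createRecorder(stream, ['audio/wav', 'audio/webm']) inside the getUserMedia() callback; the recorder's mimeType property then reports what is actually being used.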

What MDN pages did AI Help consult?

caugner commented 1 year ago

Thank you for flagging this issue. 👍

It looks like AI Help gave an overall helpful answer, but one that contained an error in the code example.

We're looking into improvements to avoid this issue.

stevehs17 commented 12 months ago

Thank you for your response, Claas.

In my opinion, the answer AI Help provided was not helpful. That's why I checked the box for "The answer is not helpful." In fact, there were so many problems with the answer that it would take much longer to try to separate fact from fiction in the answer than to look for a non-AI generated answer, which is what I ended up doing.

I've had the same kinds of problems trying to use ChatGPT to answer these kinds of questions, and gave up on ChatGPT a while back because using it was more trouble than it was worth. It's much more efficient to use Stack Overflow, which provides much better answers, even if I have to separate them out from ones that aren't very good.

If AI Help relies entirely on LLMs, and doesn't also make use of model-based AI, I'm skeptical that there will ever be any significant improvement in the quality of the answers AI Help provides. (For criticism of the view that LLMs are sufficient to avoid this sort of issue, see the writings of Gary Marcus, like his Substack.)

I'm curious why you think AI Help "gave an overall helpful answer," and what sort of improvements to it you think will avoid this issue.

caugner commented 11 months ago

@stevehs17 Thanks again for reporting this issue. 👍

As we're working on an iteration, I would like to share with you what AI Help 2.0 might answer.

Can you please check how the following answer performs with regard to the issue you reported above? 🙏

To record audio using the LINEAR16 encoding (also known as PCM encoding), you can use the MediaRecorder API in conjunction with the getUserMedia API to capture audio from the user's microphone. However, the MediaRecorder API does not allow you to specify the audio encoding directly. Instead, it uses the default encoding provided by the browser, which is typically audio/webm for Chrome and Firefox.

If you need to ensure that the audio is recorded in the LINEAR16 format, you would have to record the audio using the default encoding first, and then convert it to LINEAR16 using JavaScript or server-side processing.

Here's a basic example of how to record audio using the MediaRecorder API:

// Check if getUserMedia is supported
if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
  // Request the audio stream from the user's microphone
  navigator.mediaDevices.getUserMedia({ audio: true })
    .then(function(stream) {
      // Create a new MediaRecorder instance
      const mediaRecorder = new MediaRecorder(stream);

      // Array to store the audio data chunks
      let chunks = [];

      // When audio data is available, push it to the chunks array
      mediaRecorder.ondataavailable = function(event) {
        chunks.push(event.data);
      };

      // When recording stops, process the audio data
      mediaRecorder.onstop = function() {
        // Create a Blob from the chunks array
        const blob = new Blob(chunks, { type: 'audio/webm' });
        chunks = [];

        // TODO: Convert the Blob to LINEAR16 format
        // This step requires additional processing, which is not directly supported by the MediaRecorder API.
        // You may need to use a library like libwebm.js or send the Blob to a server for conversion.
      };

      // Start recording
      mediaRecorder.start();

      // Stop recording after 5 seconds
      setTimeout(function() {
        mediaRecorder.stop();
      }, 5000);
    })
    .catch(function(err) {
      console.error('The following getUserMedia error occurred:', err);
    });
} else {
  console.error('getUserMedia not supported on your browser!');
}

In the example above, the onstop event handler is where you would need to implement the conversion to LINEAR16. Since this conversion is not trivial and cannot be done directly within the MediaRecorder API, you might need to use additional libraries like libwebm.js for client-side conversion or send the audio data to a server that can handle the conversion process.

For server-side conversion, you could use tools like ffmpeg to convert the audio to the desired format. If you choose to go this route, you would upload the recorded audio Blob to the server, and then use ffmpeg to convert it to LINEAR16 PCM.

Please note that the conversion process is beyond the scope of the MediaRecorder API and would require additional implementation effort.
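As a rough illustration of the client-side conversion step mentioned above, the sketch below turns decoded audio samples into LINEAR16 (16-bit signed little-endian PCM) and wraps them in a WAV container. It assumes the samples are already available as Float32 values in the range [-1, 1]; the function names floatToInt16 and encodeWav are hypothetical, and the 44-byte header is the conventional minimal PCM WAV layout rather than anything specified in this thread.

```javascript
// Clamp one float sample to [-1, 1] and scale it to a signed 16-bit integer.
function floatToInt16(sample) {
  const s = Math.max(-1, Math.min(1, sample));
  return s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
}

// Encode Float32 samples (interleaved if multi-channel) as a 16-bit PCM WAV file.
// Returns an ArrayBuffer: 44-byte header followed by little-endian samples.
function encodeWav(samples, sampleRate, numChannels = 1) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeString = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeString(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeString(8, 'WAVE');
  writeString(12, 'fmt ');
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * numChannels * bytesPerSample, true); // byte rate
  view.setUint16(32, numChannels * bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeString(36, 'data');
  view.setUint32(40, dataSize, true);

  // Write each sample as little-endian signed 16-bit.
  for (let i = 0; i < samples.length; i++) {
    view.setInt16(44 + i * bytesPerSample, floatToInt16(samples[i]), true);
  }
  return buffer;
}
```

In a browser, one plausible way to obtain the Float32 samples is to decode the recorded Blob with AudioContext.decodeAudioData() and read a channel with AudioBuffer.getChannelData(); that step is not shown here and would need testing in the target browsers.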

stevehs17 commented 10 months ago

Thank you for getting back to me, Claas. If the code is correct, I would prefer this response to the one I originally received. (I am not in a position to evaluate the code's correctness; that would need to be done by someone well-versed in JavaScript front-end programming.) That said, the code would be much more useful if my question were about how to record audio rather than how to record audio in a specific format, LINEAR16. I think this Stack Overflow post and this code sample are more along the lines of what I was looking for.

On a related note, I am finding that ChatGPT (I'm using 3.5) works much better for very basic questions than for more esoteric ones. For instance, when I asked it to modify a shell script to find the elapsed time between the beginning and the end of the script, it generated code that worked. However, when I asked it to find the elapsed time in milliseconds, the code it generated didn't work. I know now not to ask ChatGPT questions that are too difficult, and to rely on it only for very basic things. I would find ChatGPT more useful if it could distinguish cases where the answer is likely to be correct from cases where that's less likely. In fact, I'd prefer that it say it doesn't know the answer to a question rather than give me an incorrect response, or that it give the probability that an answer is correct along with the answer.

Similarly, I would expect your AI to do a better job of answering the question "How can I record audio (in a web browser)?" than the question "How can I record audio (in a web browser) using the LINEAR16 encoding?" If you are able to determine that the answer it would give to the second question has a substantially lower probability of being correct than the answer it would give to the first, I would prefer that it not even attempt to answer the second question, and that it say that it can't answer it. That would let me know that I need to look elsewhere, rather than spending time following up on an answer that's unlikely to be correct. I was dissatisfied with the original answer to my question about recording in LINEAR16 because it was so rife with errors that sorting them out took considerable time. In that case, I would have saved time by never asking the AI that question in the first place.