zugbug007 / opentx

Automatically exported from code.google.com/p/opentx

WAV file processing not properly skipping RIFF sections #192

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
Which board (stock / gruvin9x / sky9x / Taranis) are you using?  Taranis

What is your openTx FW version? opentx-r2834

What is your openTx EEPROM version? 215

What steps will reproduce the problem?
1.  Use the say command on Mac OS X to create a wave file.  For example:
say -o ~/lei1632.wav --data-format=LEI16@32000 --channels=1 -v Tom "Changed to: Loiter Mode."

2. Try to use the generated sound file - notice that it does not play

3. Examine the RIFF structure of the WAV file (I use RIFFPad to view them) and 
notice that there is a "FLLR" section between the "fmt " and "data" sections.  
This should be skipped.

What is the expected output? What do you see instead?  I expect the sound to 
play.  No sound is generated because the function int AudioQueue::mixWav() 
exits when a section other than "data" is encountered.  It should skip that 
section.

Please provide any additional information below.

Apple software often creates WAVE files with a non-standard (but "spec" 
conformant) "FLLR" subchunk after the "fmt " subchunk and before the "data" 
subchunk. I assume "FLLR" stands for "filler", and the purpose of the subchunk 
is to enable some sort of data alignment optimization. The subchunk is usually 
about 4000 bytes long, but its actual length can vary depending on the length 
of the data preceding it.

Adding arbitrary subchunks to WAVE files is generally considered 
spec-conformant because WAVE is a subset of RIFF, and the common practice in 
RIFF file processing is to ignore chunks and subchunks which have an 
unrecognized identifier. The identifier "FLLR" is "non-standard" and so should 
be ignored by any software which encounters it.

There is a fair amount of software out there that treats the WAVE format much 
more rigidly than it ought to, and I suspect the library you're using may be 
one of those pieces of software. For example, I have seen software that assumes 
that the audio bytes always begin at offset 44 -- this is an incorrect 
assumption.

In fact, finding the audio bytes in a WAVE file must be done by finding the 
location and size of the "data" subchunk within the RIFF; this is the correct 
way to locate the audio bytes within a WAVE file.

Reading WAVE files properly must really begin as an exercise in locating and 
identifying RIFF subchunks. RIFF subchunks have an 8-byte header: 4 bytes for 
an identifier/name field which is traditionally filled with human-readable 
ASCII characters (e.g. "fmt "), and a 4-byte little-endian unsigned integer 
specifying the number of bytes in the subchunk's data payload -- the subchunk's 
data payload follows immediately after its 8-byte header.
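
As a rough illustration of the header layout described above, this Python sketch unpacks one 8-byte subchunk header. The function name parse_chunk_header is made up for this example and is not part of OpenTX or of any library.

```python
import struct


def parse_chunk_header(header):
    """Split an 8-byte RIFF subchunk header into (identifier, payload size)."""
    if len(header) != 8:
        raise ValueError("a RIFF subchunk header is exactly 8 bytes")
    # "<4sI": 4 raw identifier bytes, then a little-endian unsigned 32-bit size.
    chunk_id, size = struct.unpack("<4sI", header)
    return chunk_id.decode("ascii", errors="replace"), size


# A "fmt " header announcing a 16-byte payload.
print(parse_chunk_header(b"fmt \x10\x00\x00\x00"))  # ('fmt ', 16)
```

The payload itself starts right after these 8 bytes, so a reader only needs the size to know where the next subchunk header begins.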

The WAVE file format reserves certain subchunk identifiers (or "names") as 
being meaningful to the WAVE format. There are a minimum of two subchunks that 
must always appear in every WAVE file:

"fmt " - the subchunk with this identifier has a payload which describes the 
basic information about the audio's format: sample rate, bit depth, etc.

"data" - the subchunk with this identifier has the actual audio bytes in its 
payload.

"fact" is the next most common subchunk identifier. It is only valid in WAVE 
files that use a compressed codec, such as μ-law (as opposed to PCM, which is 
not compressed). See http://www.sonicspot.com/guide/wavefiles.html for more 
information about some of the various subchunk identifiers in use today in the 
wild, and information about their payload structure.
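
For illustration, the "fmt " payload mentioned above can be decoded along these lines. This is a minimal sketch that assumes the common layout of the first 16 payload bytes (format tag, channel count, sample rate, byte rate, block align, bits per sample) and ignores any extension bytes that follow. The names WavFormat and parse_fmt_payload are hypothetical.

```python
import struct
from collections import namedtuple

# Field names are chosen for this example only.
WavFormat = namedtuple(
    "WavFormat",
    "format_tag channels sample_rate byte_rate block_align bits_per_sample",
)


def parse_fmt_payload(payload):
    """Decode the first 16 bytes of a "fmt " subchunk payload."""
    if len(payload) < 16:
        raise ValueError('"fmt " payload must be at least 16 bytes')
    # Two 16-bit fields, two 32-bit fields, two 16-bit fields, little-endian.
    return WavFormat(*struct.unpack("<HHIIHH", payload[:16]))


# Payload for 16-bit mono PCM at 32 kHz, like the file in this report.
sample = struct.pack("<HHIIHH", 1, 1, 32000, 64000, 2, 16)
fmt = parse_fmt_payload(sample)
print(fmt.sample_rate, fmt.bits_per_sample, fmt.channels)  # 32000 16 1
```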

From a purely RIFF perspective, subchunks need not appear in any particular 
order in the file, or at any particular fixed offset. In practice however, 
almost all software expects the "fmt " subchunk to be the first subchunk. This 
is a concession to practicality: it is convenient to know early in the data 
stream what format of audio the WAVE contains -- this makes it easier to play a 
wave file from a network stream, for example. If the WAVE file uses a 
compressed format, such as μ-law, it is usually assumed that the "fact" 
subchunk will appear directly after "fmt ".

After the format-specifying chunks are out of the way, assumptions about the 
location, ordering, and naming of subchunks should be abandoned. At this point, 
the software should locate expected subchunks by name only (e.g. "data"). If 
subchunks are encountered that have unrecognized names (e.g. "FLLR"), those 
subchunks should simply be skipped over and ignored. Skipping a subchunk 
requires reading its length so that you can skip over the correct number of 
bytes.
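
The skip-by-length approach above could look roughly like this in Python. It is a sketch, not the OpenTX implementation: find_chunk, the chunk helper and the sample file built below are assumptions for illustration. It also applies one RIFF rule the text above does not mention: an odd-sized payload is followed by a single pad byte.

```python
import io
import struct


def find_chunk(stream, wanted):
    """Return (payload offset, payload size) of the first subchunk named
    `wanted`, or None if the stream ends before one is found."""
    preamble = stream.read(12)
    if len(preamble) < 12:
        raise ValueError("too short to be a RIFF/WAVE stream")
    riff, _riff_size, form = struct.unpack("<4sI4s", preamble)
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return None  # ran out of subchunks
        chunk_id, size = struct.unpack("<4sI", header)
        if chunk_id == wanted:
            return stream.tell(), size
        # Unrecognized or unwanted subchunk: skip its payload, plus the
        # pad byte that follows an odd-sized payload.
        stream.seek(size + (size & 1), io.SEEK_CUR)


def chunk(chunk_id, payload):
    """Build one subchunk (test helper for this example)."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


# Made-up file: "fmt ", then a 4044-byte "FLLR" filler, then "data".
body = (b"WAVE" + chunk(b"fmt ", bytes(16)) + chunk(b"FLLR", bytes(4044))
        + chunk(b"data", b"\x01\x02\x03\x04"))
wav = b"RIFF" + struct.pack("<I", len(body)) + body

print(find_chunk(io.BytesIO(wav), b"data"))  # (4096, 4)
```

With a filler of this size the audio payload starts at offset 4096 rather than 44, which is consistent with the guess that "FLLR" exists for alignment.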

What Apple has done with the "FLLR" subchunk is slightly unusual, and I'm not 
surprised that some software is tripped up by it. I suspect that the library 
you are using is simply unprepared to deal with the presence of the "FLLR" 
subchunk. I would consider this a defect in the library. The mistake the 
library authors have made is probably something like:

They may be expecting the "data" subchunk to appear within the first N bytes of 
the beginning of the file, where N is something less than ~4kB. They may give 
up looking if they have to scan too far into the file. The Apple "FLLR" 
subchunk pushes the "data" subchunk to a position >~4kB into the file.

They may be expecting the "data" subchunk to have a specific ordinal subchunk 
position or byte offset within the RIFF. Perhaps they expect "data" to appear 
immediately after "fmt ". This is an incorrect way to process a RIFF file, 
though. The ordinal position and/or offset position of the "data" subchunk 
should not be assumed.

As long as we're talking about correct WAVE file processing, I might as well 
remind everyone that the audio bytes (the data subchunk's payload) may not run 
exactly to the end of the file. It is allowable to insert subchunks after the 
data payload. Some programs use this to store a textual "comment" field at the 
end of the file. If you read blindly from the start of the data payload until 
the EOF, you may pull in some metadata subchunks as audio, which will sound 
like a "click" at the end of playback. You need to honor the length field of 
the data subchunk and stop reading audio once you've consumed the entire data 
payload -- not stop when you hit EOF.
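
A minimal sketch of reading audio bounded by the data subchunk's length field, assuming the stream is already positioned at the start of the data payload. The name iter_audio_blocks and the trailing "LIST" bytes are invented for this example.

```python
import io


def iter_audio_blocks(stream, data_size, block_size=512):
    """Yield audio blocks until exactly data_size bytes have been read."""
    remaining = data_size
    while remaining > 0:
        block = stream.read(min(block_size, remaining))
        if not block:
            break  # file is truncated; nothing more to play
        remaining -= len(block)
        yield block


# Ten audio bytes followed by a metadata subchunk that must not be played.
audio = bytes(range(10))
trailer = b"LIST" + (4).to_bytes(4, "little") + b"INFO"
stream = io.BytesIO(audio + trailer)

played = b"".join(iter_audio_blocks(stream, len(audio), block_size=4))
print(played == audio)  # True
```

After the loop the stream sits at the start of the trailing subchunk, so the metadata is never fed to the audio output.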

(from http://stackoverflow.com/questions/6284651/avaudiorecorder-doesnt-write-out-proper-wav-file-header)

Original issue reported on code.google.com by xtrm...@gmail.com on 2 Dec 2013 at 7:12

GoogleCodeExporter commented 8 years ago
Yes, all sections are skipped. But your file has a sample rate of 32 kHz, which 
is not allowed. Give it a try at 16000 Hz!

Original comment by bson...@gmail.com on 2 Dec 2013 at 8:24

GoogleCodeExporter commented 8 years ago
OK, I'll give that a try.

If 32kHz is not allowed, the manual should be updated:

https://code.google.com/p/opentx/wiki/OpenTx_FrSky_EN#Audio

Original comment by xtrm...@gmail.com on 3 Dec 2013 at 6:36

GoogleCodeExporter commented 8 years ago
If you wish to create your own files, the required format is:

   - WAV, 8 or 16 bit, Mono
   - 8, 16 or 32kHz sample rate
   - PCM, u-law or a-law compression

The stock sounds above use the best available quality, i.e. 16bit, 32kHz
and PCM.
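
For what it's worth, the requirements quoted above can be written down as a small check. This sketch mirrors only the manual text, not the firmware code, and the name format_problems is hypothetical. The format tag values are the standard WAVE codes: 1 for PCM, 6 for a-law, 7 for u-law.

```python
# Limits taken from the manual text quoted above.
ALLOWED_FORMAT_TAGS = {1: "PCM", 6: "a-law", 7: "u-law"}
ALLOWED_SAMPLE_RATES = {8000, 16000, 32000}
ALLOWED_BIT_DEPTHS = {8, 16}


def format_problems(format_tag, channels, sample_rate, bits_per_sample):
    """Return a list of reasons the format falls outside the quoted limits."""
    problems = []
    if format_tag not in ALLOWED_FORMAT_TAGS:
        problems.append("unsupported format tag %d" % format_tag)
    if channels != 1:
        problems.append("must be mono, got %d channels" % channels)
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        problems.append("unsupported sample rate %d Hz" % sample_rate)
    if bits_per_sample not in ALLOWED_BIT_DEPTHS:
        problems.append("unsupported bit depth %d" % bits_per_sample)
    return problems


# The file from this report: 16-bit mono PCM at 32 kHz.
print(format_problems(1, 1, 32000, 16))  # []
```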

Original comment by xtrm...@gmail.com on 3 Dec 2013 at 6:36

GoogleCodeExporter commented 8 years ago
Actually I just checked what I did and you are right, I allowed up to 32kHz as 
a sample rate.

Would you please attach one of your non-working files here?

Original comment by bson...@gmail.com on 4 Dec 2013 at 6:56

GoogleCodeExporter commented 8 years ago
Fixed in SVN. Will be in the next minor release.

Original comment by bson...@gmail.com on 30 Dec 2013 at 8:13

GoogleCodeExporter commented 8 years ago
Issue 206 has been merged into this issue.

Original comment by bson...@gmail.com on 30 Dec 2013 at 8:13