dancasimiro / WAV.jl

Julia package for working with WAV files

Request: wavread() returns compression #24

Closed Whistler7 closed 9 years ago

Whistler7 commented 9 years ago

I prefer to use wavread() with format="double". Sometimes, my processing depends on the compression format of the WAV file read. nbits provides this information for most cases, but not for the 32-bit case. This is because PCM and floating-point are both possible for 32 bits.

I request that wavread() returns compression as an additional value. For backwards compatibility, it may be best to return it as the last value so it can be ignored by the parent code. The returned compression value may be useful for a subsequent call to wavwrite().

dancasimiro commented 9 years ago

Maybe I can return an object. I will probably export a function with a different name. Do you have any suggestions?

Whistler7 commented 9 years ago

While I am very experienced with signal processing, I have just started learning Julia and porting code from Python. I am not familiar with all the options available to solve this issue while maintaining backwards compatibility. By object, do you mean a custom composite type instance? I can imagine defining a composite type to hold the attributes of the WAV file, except I would leave the audio samples as a standard array outside the composite type. This composite type could be used by new wavread() and wavwrite() methods. A constructor method could define default values for attributes within a new composite type instance to be used by wavwrite(). Maybe this is what you were already thinking.

dancasimiro commented 9 years ago

I could expose the internal WAVFormat type. The wavread function already returns a fourth parameter that I always set to None. Does this contain all of the extra information that you need?

# Required WAV Chunk; The format chunk describes how the waveform data is stored
type WAVFormat
    compression_code::UInt16
    nchannels::UInt16
    sample_rate::UInt32
    bps::UInt32 # average bytes per second
    block_align::UInt16
    nbits::UInt16
    extra_bytes::Array{UInt8, 1}

    data_length::UInt32

    WAVFormat() = new(0, 0, 0, 0, 0, 0, [], 0)
    WAVFormat(comp, chan, fs, bytes, ba, nbits) = new(comp, chan, fs, bytes, ba, nbits, [], 0)
end

extra_bytes is a little tricky; it's determined by the extension. I have a helper type to extract data from it: WAVFormatExtension

# used by WAVE_FORMAT_EXTENSIBLE
type WAVFormatExtension
    valid_bits_per_sample::UInt16
    channel_mask::UInt32
    sub_format::Array{UInt8, 1} # 16 byte GUID

    WAVFormatExtension() = new(0, 0, b"")
    WAVFormatExtension(vbsp, cm, sb) = new(vbsp, cm, sb)
end
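
As a rough illustration of what such a helper has to do, the sketch below splits the extension bytes into the three fields above. It assumes extra_bytes holds the 22 bytes that follow the extension size field, stored little-endian; the function name parse_extension and the sub_code field are hypothetical and are not WAV.jl's own API.

```julia
# Hypothetical helper (not part of WAV.jl): decode the 22 extension bytes
# used by WAVE_FORMAT_EXTENSIBLE into their three fields.
function parse_extension(extra_bytes::Vector{UInt8})
    length(extra_bytes) >= 22 || error("expected at least 22 extension bytes")
    io = IOBuffer(extra_bytes)
    valid_bits   = ltoh(read(io, UInt16))   # valid bits per sample
    channel_mask = ltoh(read(io, UInt32))   # speaker position bit mask
    sub_format   = read(io, 16)             # 16-byte GUID
    # For the standard sub-format GUIDs, the first two bytes carry the
    # ordinary format code (1 = PCM, 3 = IEEE float).
    sub_code = UInt16(sub_format[1]) | (UInt16(sub_format[2]) << 8)
    return (valid_bits = valid_bits, channel_mask = channel_mask,
            sub_format = sub_format, sub_code = sub_code)
end

# Example bytes: 24 valid bits, front-left + front-right, IEEE float GUID.
example = UInt8[0x18, 0x00,
                0x03, 0x00, 0x00, 0x00,
                0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00,
                0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71]
ext = parse_extension(example)
```

Here ext.sub_code is the value a caller would compare against the PCM or float code when the top-level compression_code only says "extensible".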

Whistler7 commented 9 years ago

Yes, the internal WAVFormat type has the compression_code that I am looking for, so returning that would work.

Does WAV.jl presently handle Wave64/RF64 files? My digital audio workstation software uses this when the file size exceeds 4GB, so I request this feature. See https://en.wikipedia.org/wiki/RF64

My WAV files contain metadata such as artist, album, title, track, disc and composer. I suspect it is in the BWF ("bext") chunk. When I process a WAV file, I would like wavread() to return this info, and wavwrite() to use this info. Is this contained in extra_bytes? Does WAV.jl support this scenario?

dancasimiro commented 9 years ago

RF64

Sorry, WAV.jl does not support Wave64/RF64 files.

Metadata

Your comment makes me think that the metadata is in a different chunk. It is not part of the extra_bytes field. I could return a dictionary with all of the extra chunks from the file, including fmt. The library could return the raw bytes when it doesn't know how to decode the bits.
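
The dictionary idea can be sketched roughly as follows, assuming the standard RIFF layout (a 4-byte chunk ID, a little-endian 32-bit size, the payload, and a pad byte after odd-sized payloads). The function name read_chunks is hypothetical and this is not how WAV.jl is implemented; for simplicity it also loads the data chunk into memory and keeps only the last chunk when an ID repeats.

```julia
# Hypothetical sketch (not WAV.jl code): collect every chunk of a RIFF/WAVE
# stream as raw bytes, keyed by chunk ID.
function read_chunks(io::IO)
    String(read(io, 4)) == "RIFF" || error("not a RIFF file")
    read(io, UInt32)                          # overall RIFF size, unused here
    String(read(io, 4)) == "WAVE" || error("not a WAVE file")
    chunks = Dict{Symbol, Vector{UInt8}}()
    while !eof(io)
        id = Symbol(strip(String(read(io, 4))))   # "fmt " becomes :fmt
        nbytes = Int(ltoh(read(io, UInt32)))
        chunks[id] = read(io, nbytes)             # undecoded payload
        if isodd(nbytes) && !eof(io)
            read(io, UInt8)                       # skip the pad byte
        end
    end
    return chunks
end

# Build a tiny in-memory file with a 2-byte fmt chunk and a 3-byte LIST chunk.
buf = IOBuffer()
write(buf, "RIFF"); write(buf, htol(UInt32(0))); write(buf, "WAVE")
write(buf, "fmt "); write(buf, htol(UInt32(2))); write(buf, UInt8[1, 2])
write(buf, "LIST"); write(buf, htol(UInt32(3))); write(buf, UInt8[9, 8, 7])
write(buf, 0x00)                                  # pad byte after odd size
seekstart(buf)
chunks = read_chunks(buf)
```

A caller could then decode the chunks it understands, such as chunks[:fmt], and pass the rest through untouched.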

Whistler7 commented 9 years ago

I think I don't really need Wave64/RF64 support. I computed that the WAV file limit of 4GB provides 23 minutes of 192-kHz 64-bit stereo samples. I think I need to break up longer recordings into songs or movements before processing.
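
The 23-minute figure can be checked with a short calculation, taking the limit as 2^32 bytes (the largest value a 32-bit size field can hold) and ignoring the small header overhead:

```julia
sample_rate      = 192_000        # samples per second per channel
bytes_per_sample = 8              # 64-bit samples
channels         = 2              # stereo
bytes_per_second = sample_rate * bytes_per_sample * channels

limit_bytes = 2^32                # 4 GiB addressable by a 32-bit size field
minutes = limit_bytes / bytes_per_second / 60   # a little over 23 minutes
```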

Metadata can be in the bext chunk. See https://en.wikipedia.org/wiki/Broadcast_Wave_Format Metadata can also be in a 'LIST' chunk with a HeaderID of 'INFO'. See https://en.wikipedia.org/wiki/WAV If you are enthusiastic about supporting metadata, then I would like to read and write dictionaries containing the metadata. I understand, however, that this is a big request for a volunteer developer, especially if you won't be using this feature. I won't feel bad if you decline. Better for me to fork WAV.jl than have you spend time on a partial solution.

I have been learning more about Julia. Unlike Python and MATLAB, Julia doesn't throw an exception or change the function behavior if some of the tuple return values from a function aren't assigned. Therefore, one way to satisfy this PR is to add compression (compression_code from WAVFormat object) to the end of the tuple returned by wavread().
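
A small example of that destructuring behavior, using a stand-in function (fake_wavread is hypothetical and only mimics a reader that returns a fourth value):

```julia
# Stand-in for a reader that returns an extra trailing value.
fake_wavread() = ([0.0, 0.5], 44100, 16, :compression)

# Existing callers that unpack only three values keep working;
# the fourth value is silently dropped.
data, fs, nbits = fake_wavread()

# New callers can pick up the extra value when they want it.
data2, fs2, nbits2, compression = fake_wavread()
```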

Perhaps a better way to satisfy this PR is to create a new function wavinfo(). The argument would be a filename or IO stream. It would provide all of the WAVFormat info on a WAV file, without having to read any audio data. So, in addition to the compression code that I want, it would provide the audio data length, which could be handy. This may also be useful for exception handling based on the number of channels in the WAV file. I propose that wavinfo() simply return the WAVFormat object. The parent code can then access the desired fields by name. The WAVFormat type may have new fields in the future, so accessing them by name ensures future compatibility.
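
To make the proposal concrete, here is a minimal sketch of such a function. It is hypothetical: wavinfo_sketch is not part of WAV.jl, it returns a Dict rather than a WAVFormat object so that the example is self-contained, and it assumes the standard 16-byte layout at the start of the fmt chunk. The audio payload is skipped rather than read.

```julia
# Hypothetical sketch of the proposed wavinfo(): report the fmt fields and
# the data length without reading any audio samples.
function wavinfo_sketch(io::IO)
    String(read(io, 4)) == "RIFF" || error("not a RIFF file")
    skip(io, 4)                                   # overall RIFF size
    String(read(io, 4)) == "WAVE" || error("not a WAVE file")
    info = Dict{Symbol, Any}()
    while !eof(io)
        id = String(read(io, 4))
        nbytes = Int(ltoh(read(io, UInt32)))
        if id == "fmt "
            nbytes >= 16 || error("fmt chunk too short")
            info[:compression_code] = ltoh(read(io, UInt16))
            info[:nchannels]        = ltoh(read(io, UInt16))
            info[:sample_rate]      = ltoh(read(io, UInt32))
            info[:bps]              = ltoh(read(io, UInt32))
            info[:block_align]      = ltoh(read(io, UInt16))
            info[:nbits]            = ltoh(read(io, UInt16))
            skip(io, nbytes - 16)                 # extension bytes, if any
        else
            if id == "data"
                info[:data_length] = nbytes
            end
            skip(io, nbytes)                      # payload is skipped, not read
        end
        if isodd(nbytes) && !eof(io)
            skip(io, 1)                           # pad byte after odd sizes
        end
    end
    return info
end

# In-memory example: 32-bit float stereo at 48 kHz with 8 bytes of audio.
buf = IOBuffer()
write(buf, "RIFF"); write(buf, htol(UInt32(0))); write(buf, "WAVE")
write(buf, "fmt "); write(buf, htol(UInt32(16)))
write(buf, htol(UInt16(3))); write(buf, htol(UInt16(2)))
write(buf, htol(UInt32(48000))); write(buf, htol(UInt32(384000)))
write(buf, htol(UInt16(8))); write(buf, htol(UInt16(32)))
write(buf, "data"); write(buf, htol(UInt32(8))); write(buf, zeros(UInt8, 8))
seekstart(buf)
info = wavinfo_sketch(buf)
```

Accessing values by key, as with info[:nchannels], keeps calling code unaffected if more entries are added later, which is the same future-compatibility argument made above for field names.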

dancasimiro commented 9 years ago

I just pushed a commit to master that returns the fmt chunk, as an instance of type WAVFormat, within a Dict{Symbol, Any} from wavread. You can use it as follows:

using WAV
data, fs, nbits, opt = wavread("myfile.wav")
opt[:fmt].compression_code

Your code will have to handle the WAVE_FORMAT_EXTENSIBLE code because I didn't export a function to do that. It might be a good addition to the library though. What do you think?

Whistler7 commented 9 years ago

I like the concept of using a dictionary to provide a future-compatible way of passing optional variables. Would I literally put opt[:fmt].compression_code in my code? Otherwise, I don't understand what to use for the symbol :fmt.

I am not familiar enough with the WAV file specification to understand when I would encounter WAVE_FORMAT_EXTENSIBLE. All I really want is a flag indicating whether the returned data is PCM or floating-point, so I can handle the different cases for nbits=32. I was expecting that compression_code would be WAVE_FORMAT_PCM, WAVE_FORMAT_IEEE_FLOAT, WAVE_FORMAT_ALAW or WAVE_FORMAT_MULAW. I would then only need to compare compression_code to WAVE_FORMAT_IEEE_FLOAT to get my flag.
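
For context, WAVE_FORMAT_EXTENSIBLE (code 0xFFFE) is commonly written for files with more than two channels or more than 16 bits per sample, in which case the real format code sits inside the extension bytes. A minimal sketch of the PCM-or-float flag under that assumption follows. The constants are local definitions of the values from the WAV specification, the name is_float_data is hypothetical, and extra_bytes is assumed to hold the extension bytes that follow the size field, so the sub-format GUID starts at byte 7.

```julia
# Format codes from the WAV specification, defined locally for this sketch.
const CODE_PCM        = 0x0001
const CODE_IEEE_FLOAT = 0x0003
const CODE_EXTENSIBLE = 0xfffe

# Hypothetical helper: true when the samples are IEEE floating-point.
function is_float_data(compression_code::Integer, extra_bytes::Vector{UInt8})
    code = compression_code
    if code == CODE_EXTENSIBLE
        length(extra_bytes) >= 8 || error("extension bytes too short")
        # The first two bytes of the sub-format GUID hold the real code.
        code = UInt16(extra_bytes[7]) | (UInt16(extra_bytes[8]) << 8)
    end
    return code == CODE_IEEE_FLOAT
end

# Plain 32-bit PCM and plain 32-bit float differ only in the code.
pcm_flag   = is_float_data(CODE_PCM, UInt8[])
float_flag = is_float_data(CODE_IEEE_FLOAT, UInt8[])

# Extensible file whose sub-format says IEEE float.
ext_bytes = UInt8[0x20, 0x00, 0x03, 0x00, 0x00, 0x00,
                  0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00,
                  0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71]
ext_flag = is_float_data(CODE_EXTENSIBLE, ext_bytes)
```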

dancasimiro commented 9 years ago

I think that this is ready now. You can use the isformat function to test the type of compression used. Something like:

fmt = opt[:fmt]  # the WAVFormat returned by wavread, as in the earlier example
if isformat(fmt, WAVE_FORMAT_IEEE_FLOAT)
    # do something
end