go-audio / wav

Battle tested Wav decoder/encoder
Apache License 2.0
300 stars 46 forks source link

Write() only accepts an IntBuffer? #20

Closed asciifaceman closed 1 year ago

asciifaceman commented 5 years ago

Having only a truncated IntBuffer feels extremely limiting, writing wav should accept a Float buffer - otherwise why are you using a lossless format.

mattetti commented 5 years ago

I agree that we should support writing a float buffer, at the time I wrote the API that wasn't my main concern since I mainly deal with int PCM data and couldn't find an elegant API. I am more than open to suggestions, as a matter of fact we have/had an issue for that. Your input would be greatly appreciated.

However, I think your argument about lossless vs lossy format is bit off. A lossy format would compress data resulting in signal loss. That's not the same problem as ints vs floats which offer more resolution/dynamic and only when the source is a float PCM.

asciifaceman commented 5 years ago

I suppose my argument was that the truncation of the float intrinsically affects the resultant waveform since the rounding effect changes (to a factor of a whole integer vs. a floating sub whole number round) - but poorly worded. Loss and increased noise floor can come from rounding errors without encoding, and floats contain a higher dynamic range inherently (a higher noise floor can be introduced by a bad int conversion).

I've been thinking about this, doing a lot of playing with the azul3d audio engine. Their floating model involves (typically) a PCM range from -1 to 1, however I've seen this broken on the occasional wav file. However, they are silently truncating to int16 and was unfinished / left in a half state.

I suppose an intermediary step would be to do similarly, accept a buffer of any (acceptable) types and then silently convert to int in the Writer to at least allow support to start - then investigate if they can be written raw.

var pi float64 = math.Pi
binary.Write(buf, binary.LittleEndian, pi)

binary should be able to LE float64s too , according to godoc

mattetti commented 5 years ago

Yes we also have a transform that converts the format but I agree that we should support f32 wav/aiff files. It's more a matter of API than anything else.

mattetti commented 5 years ago

Here is a similar conversation with more details about how to convert https://github.com/go-audio/audio/issues/18

asciifaceman commented 5 years ago

I'm kind of working on the interim solution in a fork, and there might be a change required to the upstream audio package.

My idea is to default to float64 and a new Interface for Data and using getter/setters so you only have to deal with its actual type once you get to the switch statements for adding buffer. Truncation only has to happen if you leave float64. I can whip up some introductory PRs, starting with the changes to the audio library so you can see what I mean (I realize this would be cascading changes, just want to hopefully show the idea)

mattetti commented 5 years ago

One thing to keep in mind is that float64 is extremely rare in audio and it might be annoying/weird to ask to copy a float32 buffer to float64 just because of the API. I'd love to see what you have I in mind, also be careful about performance. If most is cases rely on ints, we want to be careful about not using a much slower data type (especially on devices like the Pi)

asciifaceman commented 5 years ago

Hmm, I would wonder if float32 default would be better then.

I would argue that float32 is considered faster for processing audio, which is why most DAWs (including audacity / protools / ableton etc) has migrated to Float PCM data from fixed integers over the years.

Obviously I look forward to testing this, vs appealing to authority.

asciifaceman commented 5 years ago

Also it could mean collapsing the multiple buffers into a single buffer that just accepts Data interface

asciifaceman commented 5 years ago

Realized quickly how uphill of a battle it would be to refactor, so I'm going to instead try to spike an alternative as an example to show you sometime in the near future when it's done

chfanghr commented 5 years ago

I had made some changes in package aiff but it didn't work well . After encoding I heard lots of noise. Do you have any ideas? https://github.com/chfanghr/aiff/blob/master/encoderf32.go

mattetti commented 5 years ago

I don't see you using the PCMScale transform defined here: https://github.com/go-audio/transforms/blob/master/pcm_scale.go#L27

You have values that are between -1 and +1 as floats and you want to encode them as int based on the bitdepth set on the encoder. You therefore need to convert/transform your buffer to scale it to the int range. That's what this function does.

asciifaceman commented 5 years ago

Addendum:

the more I think about it, I still think the interface should be generalized to accept any of the options. While I personally believe float should be the base format the lib operates in, to maintain parity with modern audio software, at least accepting and silently dealing with f32 would be good so the operator could choose what they were working with at least. Changing the underlying structure to f32 would be daunting (I've spent my time reading through and seeing just how wide of an impact that would be and it basically rewrites your entire library)

Conclusion (imho):

Write() should accept a buffer interface, discover type and convert silently as needed - preferably starting from float since it has the highest precision factoring for rounding errors. I opened a PR against go-audio/audio implementing a Type() method on the Buffer interface so that packages such as Wav can discover the type to prevent needless conversions. Then, within wav it would be implemented something like:

// Write encodes and writes the passed buffer to the underlying writer.
// Don't forger to Close() the encoder or the file won't be valid.
func (e *Encoder) Write(buf audio.Buffer) error {
    if !e.wroteHeader {
        if err := e.writeHeader(); err != nil {
            return err
        }
    }

    var intBuf audio.IntBuffer
    // Quietly convert to int
    if buf.Type() != reflect.Int {
        intBuf := buf.AsIntBuffer()
    } else {
        intBuf = buf
    }

    if !e.pcmChunkStarted {
        // sound header
        if err := e.AddLE(riff.DataFormatID); err != nil {
            return fmt.Errorf("error encoding sound header %v", err)
        }
        e.pcmChunkStarted = true

        // write a temporary chunksize
        e.pcmChunkSizePos = e.WrittenBytes
        if err := e.AddLE(uint32(42)); err != nil {
            return fmt.Errorf("%v when writing wav data chunk size header", err)
        }
    }

    return e.addBuffer(intBuf)
}

What do you think Matt? I tried to consider the least amount of changes to at least support Write() accepting FloatBuffer without actually changing much - I suppose a first step in the right direction

chfanghr commented 5 years ago

@mattetti I think you misunderstood what I try to do. According to aiff documentation aiff can store pcm data in float32 . So what I tried to do is to directly store float32 pcm data into a aiff file implemented in encodef32.go as I mentioned above.

segur7g commented 1 year ago

Best regards, can I ask a simple question here? Why is my WAV not processing?

`package main

import ( "fmt" "log" "os"

"github.com/go-audio/audio"
"github.com/go-audio/wav"

)

func main() {

file, err := os.Open("test2.wav")
if err != nil {
    log.Fatal(err)
}
defer file.Close()

wavDecoder := wav.NewDecoder(file)
_, err = file.Seek(0, 0)
if err != nil {
    fmt.Println("Error opening the WAV file:", err)
    return
}

buffer := audio.IntBuffer{}
samples, err := wavDecoder.PCMBuffer(&buffer)
if err != nil {
    fmt.Println(err)
    return
}

floats := make([]float64, len(buffer.Data))
for i := 0; i < len(buffer.Data); i++ {
    floats[i] = float64(buffer.Data[i])
}
fmt.Printf("Is this file valid: %t\n", wavDecoder.IsValidFile())
fmt.Printf("PCM data was accessed: %t\n", wavDecoder.WasPCMAccessed())
fmt.Println(samples)
fmt.Println(len(buffer.Data))
fmt.Println(floats)

} ` output: Is this file valid: true PCM data was accessed: true 0 0 []

asciifaceman commented 1 year ago

I haven't touched this issue in a long time and it's being misused now so I'm going to close it.

Since I opened this issue go generics have come out with 1.18 which would be a much better way to go about what was being discussed here than anything I came up with back then. I might revisit in the future and if I do it will be in another issue that is more targeted.