Problems reading opus files

snakers4 commented 4 years ago

Hi @Zuzu-Typ ,

Many thanks for your library, it seems to be working, but I am facing some issues. I managed to successfully load the library on Ubuntu-18.04 after running these commands (some of them may be redundant):

pip install PyOgg
conda install -c anaconda libopus
apt install libopus-dev
apt install libopusfile0

After that I could open and listen to an opus file like this:

import pyogg
import numpy as np
from IPython.display import Audio

file = pyogg.OpusFile("detodos.opus")

wav = []
c = 0

for _ in file.buffer:
    wav.append(_)
    c+=1
    if c > file.buffer_length:
        break

wav = np.array(wav)
Audio(wav, rate=file.frequency)

Looks like there are a few problems:

If I just iterate over the buffer without breaking, there seems to be an infinite loop somewhere.
The audio should be ~2.5s long, but when I listen to it, it is ~5s long and the second half is filled with some loud artefacts.

Please tell me if I am doing something wrong!

Zuzu-Typ commented 4 years ago

Hello @snakers4,

Unfortunately, what you're trying to do is quite difficult. Internally, the buffer is just an array of shorts (a short is a small integer) with every even index being the left channel audio data and every odd index being the right channel audio data (thus what you hear is stereo). That's why it appears to be 5 seconds long if you read it as a whole.

file.buffer is basically just a pointer (i.e. an address in memory) to the first short of that array. That's why you need to stop once the end of the buffer has been reached, because otherwise you'll simply continue reading values from memory.

I think the easiest way to solve this would be as follows: Firstly, to improve performance, you'll need ctypes, Python's built-in way of using values that come from C or C++ (such as pointers). No worries, it comes with Python, so there is no need to install anything.

Next you'll need numpy, which you already have.

The first step will be to turn the pointer into a numpy array. The following code does the trick:

import ctypes, numpy, pyogg

[...]

file = pyogg.OpusFile("detodos.opus")

target_datatype = ctypes.c_short * file.buffer_length
buffer_as_array = ctypes.cast(file.buffer, ctypes.POINTER(target_datatype)).contents
buffer_as_numpy_array = numpy.array(buffer_as_array)

Now we need to reorganize the numpy array to a 2d array as requested in the documentation I found.

left_data = buffer_as_numpy_array[0::2] # starting from 0, every second value
right_data = buffer_as_numpy_array[1::2]
final_data = numpy.array((left_data, right_data))

I think that should do it.

I hope the code doesn't contain any typos.

And I hope it helps you!
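
The pointer behaviour described above can be reproduced without PyOgg. The following is a minimal sketch, assuming a hand-built ctypes array stands in for file.buffer; the names fake_samples, buffer_ptr and buffer_length are illustrative only and are not taken from PyOgg.

```python
import ctypes

# Stand-in for a decoded stereo buffer: interleaved L, R, L, R, ... shorts.
# (Hypothetical data; with PyOgg the pointer would come from file.buffer.)
fake_samples = (ctypes.c_short * 6)(10, -10, 20, -20, 30, -30)
buffer_length = len(fake_samples)

# A bare pointer to the first short carries no length information,
# so a plain "for" loop over it has no natural stopping point.
buffer_ptr = ctypes.cast(fake_samples, ctypes.POINTER(ctypes.c_short))

# Slicing the pointer with an explicit bound reads exactly buffer_length values.
interleaved = buffer_ptr[:buffer_length]

# De-interleave: even indices are the left channel, odd indices the right.
left = interleaved[0::2]
right = interleaved[1::2]

print(left)   # [10, 20, 30]
print(right)  # [-10, -20, -30]
```

The explicit upper bound in the slice is what replaces the manual counter and break in the original loop.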

snakers4 commented 4 years ago

Hi,

Many thanks for your replies!

(i)

with every even index being the left channel audio data and every odd index being the right channel audio data

import pyogg
file = pyogg.OpusFile("detodos.opus")
print(file.channels)

It is weird: when I run this, I get 1 channel for this file. On the other hand, file.buffer_length "says" that it is 2 channels (judging by sample rate * duration). Is this some Opus artefact?

(ii)

I think that should do it.

Many thanks for your example, I tried it. When I listened to the result, it sounded really weird and sped up. Then I tried some stereo files, and it looks like this code snippet is the solution. It looks a bit weird to me. I checked by listening to the wavs I could decode.

import ctypes, numpy, pyogg
from IPython.display import Audio

# file = pyogg.OpusFile("detodos.opus")  # mono
file = pyogg.OpusFile("ehren-paper_lights-64.opus")  # stereo

target_datatype = ctypes.c_short * (file.buffer_length // 2)  # always divide by 2 for some reason
buffer_as_array = ctypes.cast(file.buffer,
                              ctypes.POINTER(target_datatype)).contents
samples = numpy.array(buffer_as_array)  # copy the C buffer into a numpy array
if file.channels == 1:
    wav = samples
elif file.channels == 2:
    # de-interleave: even indices are left, odd indices are right
    wav = numpy.array((samples[0::2],
                       samples[1::2]))
else:
    raise NotImplementedError()
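
The channel branches above could be generalised to any channel count. This is a minimal sketch, assuming the samples are interleaved frame by frame and have already been copied into a 1-D NumPy array; the helper name deinterleave is hypothetical and not part of PyOgg.

```python
import numpy as np

def deinterleave(samples, channels):
    """Split interleaved PCM into an array of shape (channels, frames)."""
    samples = np.asarray(samples)
    if channels < 1:
        raise ValueError("channels must be at least 1")
    if samples.size % channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    # Each row of the reshaped array is one frame;
    # transposing gives one row per channel instead.
    return samples.reshape(-1, channels).T

# Illustrative data: three stereo frames, interleaved L, R, L, R, L, R.
stereo = np.array([1, -1, 2, -2, 3, -3], dtype=np.int16)
print(deinterleave(stereo, 2))
print(deinterleave(stereo, 1).shape)  # (1, 6)
```

For mono input the result has shape (1, frames), so row 0 gives the plain 1-D signal.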

(iii) Since we are here, a couple of questions

Can I encode files using your library? If so, do I need to use the OpusEncoder class and some class like this?
"# audio frequency (always 48000)": can this be changed somehow? We are working on speech applications, and we are currently selecting a codec to store vast amounts of data. Our colleagues said that Opus encoding actually improves (!) performance on downstream tasks. But for speech, 48 kHz seems very excessive. Of course I can resample downstream using some fast method, but why store 3x the data?

(iv) Maybe it is worth adding the above python example along with codec installation scripts to wiki / README.md so that it would be easier to use the library? I have an ML themed telegram channel with 2k people reading it, I could tell people about opus and how they can easily work with it in python!

Zuzu-Typ commented 4 years ago

Hi again,

Okay. (i) You're totally right. I didn't realize the file was mono. The reason the buffer_length is always twice the actual length is that it's multiplied by two in the code (I don't know why, and I don't know how PyOpenAL still plays my Opus files loaded by PyOgg just fine ...).

(ii) That's also why you needed to divide it here.
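
The effect of the doubled length on playback time can be checked with simple arithmetic. This is a minimal sketch using illustrative numbers for a 2.5 s mono clip at 48 kHz; duration_seconds is a hypothetical helper, not a PyOgg function.

```python
def duration_seconds(sample_count, channels, sample_rate):
    """Playback time for sample_count interleaved PCM values."""
    frames = sample_count / channels
    return frames / sample_rate

sample_rate = 48000                  # decoded Opus audio is 48 kHz
true_samples = 120000                # 2.5 s of mono audio: 2.5 * 48000
reported_length = 2 * true_samples   # the doubled buffer_length described above

print(duration_seconds(true_samples, 1, sample_rate))          # 2.5
print(duration_seconds(reported_length, 1, sample_rate))       # 5.0
print(duration_seconds(reported_length // 2, 1, sample_rate))  # 2.5
```

Reading the full reported length of a mono file therefore yields about twice the real duration, which matches the ~5 s playback reported earlier in the thread.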

(iii)

Technically, yes, you could encode files using PyOgg, but you would have to use the raw bindings to C code, which is a little cumbersome to deal with.
Unfortunately no. The 48000 Hz is a decoder restriction. You should be able to decode files with lower frequencies nonetheless, though. They're simply converted to 48 kHz.

(iv) Thank you for the offer :) Though at the current state using PyOgg is way too complicated and error prone. I originally created this library for my own needs and I barely knew what I was doing. I'm definitely willing to give this library a cleanup and improve its functionality, but I suppose that will take some time. When I've got the time I'll push a quick fix for the buffer length, though that isn't really a "solution".

Cheers, --Zuzu_Typ--

snakers4 commented 4 years ago

Thank you for the offer :)

I think I will cover the available options when I write a post. The channel is located here, btw.

Though at the current state using PyOgg is way too complicated and error prone.

Correct me if I am wrong, but it looks like there is no proper in-memory library to work with opus files (?). There is pysoundfile, which is really nice, but it is built on top of libsndfile, which does not support opus (it supports vorbis, though).

Technically it does support it, but there are no binaries available, etc

I tried using the packaged version in 18.04, but there is still no support.

I'm definitely willing to give this library a cleanup and improve its functionality, but I suppose that will take some time. When I've got the time I'll push a quick fix for the buffer length, though that isn't really a "solution".

Are you planning on adding the write functionality?

Though at the current state using PyOgg is way too complicated and error prone.

Do you think that even if there was a class for writing files, your library is not suitable for production usage, i.e. there may be memory leaks?

Zuzu-Typ commented 4 years ago

Correct me if I am wrong, but it looks like there is no proper in-memory library to work with opus files (?)

I don't really know any other libraries that don't have massive overhead in terms of unnecessary frameworks and functionality.

Are you planning on adding the write functionality?

Yes, that should be part of a library that claims to give access to Ogg, FLAC and Opus' functionality.

Do you think that even if there was a class for writing files, your library is not suitable for production usage, i.e. there may be memory leaks?

If I take the time and care, I'm pretty certain that I can make it production ready. Of course, there may always be memory leaks, but none that can't be fixed.

snakers4 commented 4 years ago

Gave your library a shout-out here https://t.me/snakers4/2385

Keep up the good work! =)

mattgwwalker commented 4 years ago

Closing this issue. This repository now includes an example of how to read and play Opus-encoded audio using PyOgg (see the file examples/01-play-opus-simpleaudio.py). There is also an example of how to write Opus-encoded audio (see examples/03-write-ogg-opus.py). Both can now be achieved with no requirements for the user to be even aware of the ctypes interface.

TeamPyOgg / PyOgg

Problems reading opus files #19