Closed snakers4 closed 4 years ago
Hello @snakers4,
Unfortunately, what you're trying to achieve is quite difficult. Internally, the buffer is just an array of shorts (a short is a 16-bit signed integer), with every even index holding left-channel audio data and every odd index holding right-channel audio data (so what you hear is stereo). That's why it appears to be 5 seconds long if you read it as a whole.
file.buffer is basically just a pointer (i.e. an address in memory) to the first short of that array.
That's why you need to stop once the end of the buffer has been reached; otherwise you'll simply continue reading whatever values happen to follow it in memory.
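To make the pointer-plus-length idea concrete, here is a minimal sketch that uses only ctypes. The sample values and the helper name read_shorts are made up for illustration and are not part of PyOgg; the point is only that a bare pointer carries no length, so the caller has to supply one.

```python
import ctypes

# Made-up stand-in for a decoded stereo buffer: L0, R0, L1, R1, L2, R2.
samples = (ctypes.c_short * 6)(10, -10, 20, -20, 30, -30)

# A bare pointer to the first sample; it carries no length information.
first_sample = ctypes.cast(samples, ctypes.POINTER(ctypes.c_short))


def read_shorts(pointer, count):
    """Copy exactly `count` shorts starting at `pointer` into a list.

    Hypothetical helper: reading more than the real length would return
    whatever happens to sit in memory after the buffer.
    """
    return [pointer[i] for i in range(count)]


interleaved = read_shorts(first_sample, len(samples))
left = interleaved[0::2]   # even indices
right = interleaved[1::2]  # odd indices
print(left, right)
```

Passing the wrong count is exactly the failure mode described above, which is why the length has to match the real number of shorts in the buffer.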
I think the easiest way to solve this would be as follows: firstly, to improve performance, you'll need ctypes, Python's built-in way of working with values that come from C or C++ (such as pointers). No worries, it ships with Python, so there is nothing to install.
Next you'll need numpy, which you already have.
The first step will be to turn the pointer into a numpy array. The following code does the trick:
import ctypes, numpy, pyogg
[...]
file = pyogg.OpusFile("detodos.opus")
target_datatype = ctypes.c_short * file.buffer_length
buffer_as_array = ctypes.cast(file.buffer, ctypes.POINTER(target_datatype)).contents
buffer_as_numpy_array = numpy.array(buffer_as_array)
Now we need to reorganize the numpy array into a 2D array, as requested in the documentation I found.
left_data = buffer_as_numpy_array[0::2] # starting from 0, every second value
right_data = buffer_as_numpy_array[1::2]
final_data = numpy.array((left_data, right_data))
I think that should do it.
I hope the code doesn't contain any typos.
And I hope it helps you!
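As a side note, the slicing step for stereo data can also be written as a reshape followed by a transpose. The sketch below uses made-up sample values rather than a real file and assumes the buffer really is interleaved stereo.

```python
import numpy

# Made-up interleaved stereo samples: L0, R0, L1, R1, L2, R2.
interleaved = numpy.array([10, -10, 20, -20, 30, -30], dtype=numpy.int16)

# De-interleave by slicing: every second value, starting at 0 and at 1.
by_slicing = numpy.array((interleaved[0::2], interleaved[1::2]))

# Same result: view the data as (frames, channels), then transpose
# to (channels, frames).
by_reshape = interleaved.reshape(-1, 2).T

print(by_reshape.shape)
```

Both forms give one row per channel; the reshape variant raises an error if the number of samples is odd, which can be a useful sanity check.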
Hi,
Many thanks for your replies!
(i)
with every even index being the left channel audio data and every odd index being the right channel audio data
import pyogg
file = pyogg.OpusFile("detodos.opus")
print(file.channels)
It is weird: when I run this, I get 1 channel for this file. But on the other hand, file.buffer_length "says" that there are 2 channels (sample rate * duration). Is this some opus artefact?
(ii)
I think that should do it.
Many thanks for your example, I tried it. When I listened to the result, it sounded really weird and sped up. Then I tried some stereo files, and it looks like the code snippet below is the solution. It looks a bit weird to me, but I checked by listening to the wavs I could decode.
import ctypes, numpy, pyogg
from IPython.display import Audio
# file = pyogg.OpusFile("detodos.opus") # mono
file = pyogg.OpusFile("ehren-paper_lights-64.opus") # stereo
target_datatype = ctypes.c_short * (file.buffer_length // 2) # always divide by 2 for some reason
buffer_as_array = ctypes.cast(file.buffer,
                              ctypes.POINTER(target_datatype)).contents
wav = numpy.array(buffer_as_array)  # flat array of interleaved samples
if file.channels == 2:
    # split the interleaved samples into (left, right) rows
    wav = numpy.array((wav[0::2],
                       wav[1::2]))
elif file.channels != 1:
    raise NotImplementedError()
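One plausible reading of the "sped up" result: if a mono buffer is split as though it were stereo, each resulting channel keeps only every second sample, so at the same sample rate it plays back in half the time. The arithmetic can be sketched as follows; duration_seconds is a hypothetical helper and the numbers are illustrative.

```python
def duration_seconds(total_samples, channels, sample_rate=48000):
    """Playback length of an interleaved buffer of `total_samples` values."""
    frames = total_samples // channels  # one frame = one sample per channel
    return frames / sample_rate


total = 5 * 48000  # five seconds of mono audio at 48 kHz

as_mono = duration_seconds(total, channels=1)
as_stereo = duration_seconds(total, channels=2)  # mono misread as stereo
print(as_mono, as_stereo)
```

The same data is half as long when misread as two channels, which matches audio that sounds roughly twice as fast.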
(iii) Since we are here, a couple of questions:
OpusEncoder class and some class like this?
# audio frequency (always 48000) - can this be changed somehow? We are working on speech applications, and we are currently selecting a codec to store vast amounts of data. Our colleagues said that opus encoding actually improves (!) performance on downstream tasks. But for speech, 48 kHz seems very excessive. Of course I can resample downstream using some fast method, but why store 3x the data?
(iv)
Maybe it is worth adding the above Python example, along with codec installation scripts, to the wiki / README.md so that the library would be easier to use?
I have an ML-themed Telegram channel with 2k readers; I could tell people about opus and how they can easily work with it in Python!
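On the downstream resampling mentioned in (iii): a very crude way to go from 48 kHz to 16 kHz is to average each run of three samples. The sketch below is illustrative only. decimate_by_mean is a hypothetical helper, block averaging is a weak low-pass filter, and a proper resampler would be a better choice for real speech data.

```python
def decimate_by_mean(samples, factor=3):
    """Average each run of `factor` samples into one output sample.

    Crude stand-in for a real resampler; trailing samples that do not
    fill a whole run are dropped.
    """
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[i:i + factor]) // factor
        for i in range(0, usable, factor)
    ]


one_second_48k = [0, 3, 6] * 16000  # 48000 made-up mono samples
one_second_16k = decimate_by_mean(one_second_48k)
print(len(one_second_48k), len(one_second_16k))
```

For stereo data the same step would be applied to each channel separately, after de-interleaving.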
Hi again,
okay.
(i)
You're totally right. I didn't realize the file was mono.
The reason the buffer_length is always twice the actual length is that it's multiplied by two in the code (I don't know why, and I still don't know how PyOpenAL manages to play my opus files loaded by PyOgg just fine ...).
(ii) That's also why you needed to divide it here.
(iii)
(iv) Thank you for the offer :) Though at the current state using PyOgg is way too complicated and error prone. I originally created this library for my own needs and I barely knew what I was doing. I'm definitely willing to give this library a cleanup and improve its functionality, but I suppose that will take some time. When I've got the time I'll push a quickfix for the buffer length - though that isn't really a "solution".
Cheers, --Zuzu_Typ--
Thank you for the offer :)
I think I will cover the available options when I write a post. The channel is located here, btw.
Though at the current state using PyOgg is way too complicated and error prone.
Correct me if I am wrong, but it looks like there is no proper in-memory library to work with opus files (?). There is pysoundfile, which is really nice, but it is built on top of libsndfile, which does not support opus (it supports vorbis, though).
Technically it does support it, but there are no binaries available, etc.
I tried using the packaged version in Ubuntu 18.04, but there is still no support.
I'm definitely willing to give this library a cleanup and improve its functionality, but I suppose that will take some time. When I've got the time I'll push a quickfix for the buffer length - though that isn't really a "solution".
Are you planning on adding the write functionality?
Though at the current state using PyOgg is way too complicated and error prone.
Do you think that even if there was a class for writing files, your library is not suitable for production usage, i.e. there may be memory leaks?
Correct me if I am wrong, but it looks like there is no proper in-memory library to work with opus files (?)
I don't really know any other libraries that don't have massive overhead in terms of unnecessary frameworks and functionality.
Are you planning on adding the write functionality?
Yes, that should be part of a library that claims to give access to Ogg, FLAC and Opus' functionality.
Do you think that even if there was a class for writing files, your library is not suitable for production usage, i.e. there may be memory leaks?
If I take the time and care, I'm pretty certain that I can make it production ready. Of course, there may always be memory leaks, but none that can't be fixed.
Gave your library a shout-out here https://t.me/snakers4/2385
Keep up the good work! =)
Closing this issue. This repository now includes an example of how to read and play Opus-encoded audio using PyOgg (see the file examples/01-play-opus-simpleaudio.py). There is also an example of how to write Opus-encoded audio (see examples/03-write-ogg-opus.py). Both can now be achieved without the user even needing to be aware of the ctypes interface.
Hi @Zuzu-Typ ,
Many thanks for your library. It seems to be working, but I am facing some issues. I managed to successfully load the library on Ubuntu-18.04 after running these commands (some of them may be redundant):
After that I could open and listen to an opus file like this:
Looks like there are a few problems:
Please tell me if I am doing something wrong!