roc-streaming / roc-toolkit

Real-time audio streaming over the network.
https://roc-streaming.org
Mozilla Public License 2.0
1.04k stars 207 forks source link

libsndfile source and sink #246

Closed gavv closed 4 months ago

gavv commented 5 years ago

Last revised: Oct 2023

Update: Issue is still relevant. Updated to reflect nowadays codebase.

Intro

libsndfile is a minimalistic cross-platform C library allowing to read and write audio files in several audio formats.

Currently roc-send can read files using SoxSource and roc-recv can write files using SoxSink. These source and sink implementation use SoX, which can work both with audio files and devices.

However, SoX has some disadvantages:

In other issues like #231 and #232 we will implement sources and sinks for ALSA, PulseAudio, CoreAudio, and Android.

It would be also nice to be able to read and write files using libsndfile instead of SoX:

Implementation

So in this issue, we should:

Implementation notes

Build system

We should also add sndfile support to scons scripts, basically just duplicate what we're doing for sox and adjust it for sndfile, specifically:

Unit tests

We already have unit tests for SoxSource and SoxSink. It would be nice to reuse them for SndfileSource and SndfileSink.

This would be easy: we already have ISource and ISink interfaces, so we should just make every test to run on two backends (sox and sndfile).

We already have an example of such tests -- here every test has a for-loop iterating over all supported FEC schemes.

gavv commented 5 years ago

This task is not very small, but it is quite straightforward and detaily explained, so I think it should be good enough for newcomers.

gavv commented 5 years ago

Here you can find an example of reading a file with libsndfile: 1, 2.

GwenTheKween commented 4 years ago

Hello. I'm a student in Computer Science, in my fourth year of study and I'd like to help resolve this issue.

This semester I need to choose a project to contribute to, and the help of a maintainer of the project, to ensure that my push requests will not just be ignored, or reviewed after the deadlines of the university. It shouldn't be much work to the maintainer, just adding constructive feedback on requests and such. Could you be this helper for the project?

Thanks in advance

gavv commented 4 years ago

@billionai sure, you're welcome!

gavv commented 4 years ago

See also #251. It provides an overview of roc_sndio concepts.

GwenTheKween commented 4 years ago

Thank you! I'm at a little bit of a loss as to how your sox classes work, so that'll help a lot!

GwenTheKween commented 4 years ago

I'm having some trouble finding the dependencies for libsndfile. In this page I found (http://www.linuxfromscratch.org/blfs/view/svn/multimedia/libsndfile.html) it only lists flac, ogg and vorbis, and says alsa and SQlite (?) are optional dependencies. Are those the ones used in SContruct?

gavv commented 4 years ago

it only lists flac, ogg and vorbis, and says alsa and SQlite (?) are optional dependencies

That's right.

Actually all these dependencies are optional. If libsndfile was built with libFLAC, libogg, and libvorbis, it will use them to open corresponding file formats. Otherwise these formats will not be supported by libsndfile.

ALSA and sqlite are needed only for libsndfile tools and tests. The library itself does not depend on them.

Roc gives you two modes for every dependency:

The first mode is the default and it is generally preferred and recommended. The second mode is a kind of fallback that can be useful if you can't or don't want to build the dependencies by yourself and want to get something working and portable as fast as possible.

It is particularly useful when cross-compiling. It's quite time-consuming to setup a sysroot and cross-compile dependencies, and by employing --build-3rdparty option you can get a minimal working version of Roc tools with statically linked dependencies (most of them), so that you can just copy these binaries into the target system and run them.

So, answering to your question. When the user will specify --build-3rdparty=...libsndfile... or --build-3rdparty=all, Roc will have to build a minimal static version of libsndfile without dependencies. Otherwise, Roc will have to search for the system version of libsndfile and link with it. On most distros libsndfile will be built with all dependencies enabled.

gavv commented 4 years ago

Our 3rdparty.py script builds libsndfile with --disable-external-libs which means that flac, ogg, and vorbis are disabled.

https://github.com/roc-project/roc/blob/master/scripts/3rdparty.py#L472-L474

https://github.com/erikd/libsndfile/blob/6978654319aacb625c0f41eb8457b86000748c73/configure.ac#L152-L153

GwenTheKween commented 4 years ago

Hey, I was coding the sink for sndfile and noticed that I need to pass a number as a format, that would be derived from the driver passed (I'm guessing). What formats should be supported, and what their strings will look like? ("wav" or "wavefile"?)

This is the only thing left to do for the SndfileSink, and I can start working on the source. Sorry about the delay, the mixture of inexperience and difficult documentation along with university made the progress be much slower than I expected.

gavv commented 4 years ago

Hey, I was coding the sink for sndfile and noticed that I need to pass a number as a format, that would be derived from the driver passed (I'm guessing).

Yep.

What formats should be supported,

It would be great if we will support all formats supported by libsndfile.

and what their strings will look like? ("wav" or "wavefile"?)

For formats that are also supported by SoX, we should use the same string name as SoX uses, so that the --format command-line option will work the same for both backends.

For formats that are not supported by SoX (if any) we can use the (common) extension name as the format name.

You can see the list of currently supported formats by running roc-recv -L or roc-send -L .

This is the only thing left to do for the SndfileSink, and I can start working on the source.

Cool!

Sorry about the delay, the mixture of inexperience and difficult documentation along with university made the progress be much slower than I expected.

That's alright :)

GwenTheKween commented 4 years ago

I have a couple of questions before I can move to the backend:

First, about the 'isfile' member: Sndfile doesn't have a clean way to check if the opened descriptor is a device or a file, but it can tell you if the opened descriptor is seekable. I think we could use this to determine whether it is a file or not, but another problem emerges. In libsndfile version 1.0.26, there is a known issue that sometimes the seekable parameter is incorrectly set, and that was fixed in version 1.0.27. Should I use this and leave that as a known issue, or try to find another way, maybe by parsing the 'input/output' string and checking if it start with "/dev"?

Second, most of the drivers from sox aren't format the way sndfile recognizes them, just wav and core audio. Should I just ignore the rest and use sndfile's extensions, or am I missing something?

gavv commented 4 years ago

Sndfile doesn't have a clean way to check if the opened descriptor is a device or a file, but it can tell you if the opened descriptor is seekable.

I don't quite understand what devices are you talking about.

In SoX backend, input or output may be either a path on filesystem (e.g. .wav file) or a device name specific to some sound system, e.g. ALSA device or PulseAudio sink.

If input or output is a device, it's not a path and it usually does not correspond to any file in /dev (though the second may be true for some sound systems, e.g. OSS). SoX uses sound system-specific API (e.g. libasound or libpulse) to access these devices.

Generally, this has nothing to do with UNIX device files that may be found in /dev. is_file_ is true for all files, which are usually regular files or pipes. is_file_ is false for sound system-specific devices.

AFAIK, libsndfile does not support any audio devices, only files. Does it?

If it doesn't, has_clock() should just always return false for sndfile sinks and sources.

Second, most of the drivers from sox aren't format the way sndfile recognizes them, just wav and core audio. Should I just ignore the rest and use sndfile's extensions, or am I missing something?

Do you mean driver names?

As for the names, I guess we should map the file format name to SF_FORMAT_ constant. We should use the same name as in SoX or common extension name if SoX doesn't support the format.

It seems there most formats are supported both by libsndfile and SoX, e.g.:

sox      libsndfile
--------------------------
"aiff"   SF_FORMAT_AIFF
"au"     SF_FORMAT_AU
"pvf"    SF_FORMAT_PVF
"flac"   SF_FORMAT_FLAC
"ogg"    SF_FORMAT_OGG
"s8"     SF_FORMAT_PCM_S8
...

Actually it's hard to find formats that are supported by libsndfile but not supported by SoX. But it seems there are some, e.g.:

libsndfile          common extension
----------------------------------------
SF_FORMAT_RF64      rf64
SF_FORMAT_G721_32   g721
...

Here is what SoX supports on my system:

$ roc-recv -L         
...

supported formats for audio files:
  .aifc .aiffc .aiff .aif .al .au .snd .avr .cdda .cdr .cvsd .cvs .cvu .dat
  .dvms .vms .f32 .f64 .gsrt .hcom .htk .ima .la .lu .maud .prc .s8 .s16
  .s24 .s32 .sf .ircam .sln .smp .sndr .sndt .sox .sph .nist .8svx .txw
  .u8 .sou .fssd .u16 .u24 .u32 .ul .voc .vox .wav .wavpcm .amb .wve .xa
  .flac .gsm .lpc10 .lpc .mp3 .mp2 .sds .caf .fap .mat4 .mat .mat5 .paf
  .pvf .sd2 .w64 .xi .vorbis .ogg

Maybe I misunderstood the question :)

GwenTheKween commented 4 years ago

AFAIK, libsndfile does not support any audio devices, only files. Does it?

As I'm used to linux, I figured that you'd handle a device in a similar way to a file with libsndfile, except it wouldn't be seekable. Otherwise I can't really see a reason for a music file to not be seekable, but maybe it's just inexperience. If libsndfile does, indeed, only support files I'll just set is_file_ to true.

Do you mean driver names?

Yes, that's what I meant. I had some trouble because roc-recv wasn't recognizing the option -L, but that was fixed by pulling from master :). Anyway, the problem was I was going solely by the code, and I thought what the driver_priorities was what I was looking for. I'll work on the mapping next.

That mapping feels like it should be part of the backend file, as both the source and sink should use it. Is that so?

Also, the simple mapping i'm thinking of (without using STL) would be 2 constant arrays, one of strings which contain the common extension or the same name as in SoX, and the other one with the driver code. Not necessarily the fastest method, but if we order the arrays by usage, it might be faster (on average) than using a binary search. Thoughts?

Thanks for the fast responses!

gavv commented 4 years ago

I can't really see a reason for a music file to not be seekable

I guess the only case when it's not seekable is a pipe, e.g.:

$ cat test.wav | roc-send -f wav -i file://- ...

If libsndfile does, indeed, only support files I'll just set isfile to true.

Yep.

Yes, that's what I meant. I had some trouble because roc-recv wasn't recognizing the option -L, but that was fixed by pulling from master :).

Yeah, it was added recently. BTW don't forget to rebase on fresh develop when preparing your PR.

You can also use:

$ sox -h | grep 'AUDIO FILE FORMATS'
AUDIO FILE FORMATS: 8svx aif aifc aiff aiffc al amb au avr cdda cdr cvs cvsd cvu dat dvms f32 f4 f64 f8 flac fssd gsm gsrt hcom htk ima ircam la lpc lpc10 lu maud mp2 mp3 nist ogg prc raw s1 s16 s2 s24 s3 s32 s4 s8 sb sf sl sln smp snd sndr sndt sou sox sph sw txw u1 u16 u2 u24 u3 u32 u4 u8 ub ul uw vms voc vorbis vox wav wavpcm wve xa

That mapping feels like it should be part of the backend file, as both the source and sink should use it. Is that so?

Agree. Backend can map string to constant and pass it to sink/source constructor.

Also, the simple mapping i'm thinking of (without using STL) would be 2 constant arrays, one of strings which contain the common extension or the same name as in SoX, and the other one with the driver code.

I think this is perfectly OK. This is not a critical path and there is no need to optimize it. Also, on small array sizes there will be very little difference between linear and binary search.

gavv commented 4 years ago

@billionai Hi, do you still have plans to finish this?

GwenTheKween commented 4 years ago

@gavv hi. Sorry, I think I bitten off more than I could chew. My lack of understanding of audio and the libsndfile documentation in general means I've made almost no progress other than copying the code from SoX and being confused about the functions that libsndfile doesn't have.

Thanks for your help, your guidance was really helpful and appreciated, but I won't be able to complete it in any acceptable time frame. You can open up the issue to someone else. Sorry for wasting your time

gavv commented 4 years ago

No problem, thanks for letting know! Maybe this discussion will help someone else. I'll unassign the issue so that somebody could pick it up.

gavv commented 11 months ago

We also have issue for adding FFmpeg, but this issue is still relevant because libsndfile is much more lightweight and may be preferred in many cases.

Hrick87 commented 9 months ago

Hi,

Is this issue still available and much needed? I'd like to try and make a bigger contribution with more meaningful impact and more complexity for my next contribution and this looks like a good issue for me. I may be a bit slow due to the size of issue, but I've brushed up on the docs and I believe I understand how to implement this. If that's all okay then I'd like to be assigned this issue.

Thanks.

gavv commented 9 months ago

Sure, thanks! Take your time.

gavv commented 9 months ago

@Hrick87 You may be interested in discussion in #576, I think details about rate & channels apply to sndfile backend too.

Hrick87 commented 9 months ago

@gavv While working on sndfile_source I came across an issue of data types and casting.

Inside the audio namespace many of the functions defined that are necessary for libsndfile's implementation return data types of size_t (long unsigned int). libsndfile's C API consistently wants to work with sf_count_t (signed int 64 bit) data types.

Right now I am casting between the two types in order to get things to compile but I worry about casting between a signed and an unsigned integer in the case of overflow or negative values being lost.

Do you have a suggested way to deal with this safely or is this actually not an issue at all?

gavv commented 9 months ago

It depends on context.

E.g. if you have size_t with frame size, you can be sure that it can't be huge (gigabytes), so you can just blindly cast to sf_count_t.

On the other hand, if you have sf_count_t value returned from library, and the library can return negative values (e.g. to indicate error), you need to check for negative values first, and only cast to size_t if it's positive.

Hrick87 commented 9 months ago

@gavv I've been working on sndfile_backend but something I don't understand is the purpose of the discover_drivers function in IBackend. Looking at the sox_backend implementation of it hasn't made it any clearer for me. Could I get an explanation? Thanks.

gavv commented 9 months ago

Each backend may support multiple drivers. One driver corresponds to one audio system or audio file format.

For example, drivers supported by sox backend are: pulse, alsa, wav, mp3, ...

Source or sink is identified by driver name + device or file name, e.g. driver is "alsa" and device/file is "hw:0", or driver is "wav" and device/file is "test.wav".

Devices are described by io uri like "alsa://hw:0". When backend dispatcher sees this uri, it selects backend that supports "alsa" driver and uses it.

Files are described by io uri in form "file://path". Which driver to use, i.e. what is the file format (wav, mp3, ...) is usually auto-detected from file extension. However, the user can also manually specify file format (driver) via separate cli option.

For sndfile backend, all drivers are file formats (no devices). So its discover_drivers method should return a list of supported file formats. See comment above: https://github.com/roc-streaming/roc-toolkit/issues/246#issuecomment-555446475

Hrick87 commented 9 months ago

Each backend may support multiple drivers. One driver corresponds to one audio system or audio file format.

For example, drivers supported by sox backend are: pulse, alsa, wav, mp3, ...

Source or sink is identified by driver name + device or file name, e.g. driver is "alsa" and device/file is "hw:0", or driver is "wav" and device/file is "test.wav".

Devices are described by io uri like "alsa://hw:0". When backend dispatcher sees this uri, it selects backend that supports "alsa" driver and uses it.

Files are described by io uri in form "file://path". Which driver to use, i.e. what is the file format (wav, mp3, ...) is usually auto-detected from file extension. However, the user can also manually specify file format (driver) via separate cli option.

For sndfile backend, all drivers are file formats (no devices). So its discover_drivers method should return a list of supported file formats. See comment above: #246 (comment)

Your response has immensely clarified things for me. The documentation states most of what you said, but the way you put it into your own words was so much easier to follow. Thank you for the fast response. I believe I understand now.

gavv commented 9 months ago

Great,

I've added documentation based on my comments above: https://github.com/roc-streaming/roc-toolkit/blob/develop/docs/sphinx/internals/audio_backends.rst

(it will appear on web site after next release)

Hrick87 commented 9 months ago

@gavv I have implemented the probe() function in sndfile_backend but I have some questions related to it.

1. Where should I be declaring ProbeFlag, where should I be defining it, and what determines if it is set?

I'm thinking it needs to be set if the driver_type is a file and it should be declared and defined somewhere inside backend_dispatcher before probe() is called.

2. When you say probe() should "fail" if the ProbeFile flag is not set, do you mean it should return false or roc_panic and terminate the program?

Right now I have probe() calling roc_log(debug) and returning false if ProbeFlag is not set.

3. Where should probe() be called?

I am thinking it must be called somewhere with a guarded if statement that determines if sndfile or sox will be used assuming both are present. If it returns true, then sndfile will be used. It is unclear to me where this would be, but my guess is somewhere in backend_dispatcher.cpp before an open_device() call.

Your guidance is appreciated, thank you.

gavv commented 9 months ago

@Hrick87 Oh, actually probe functionality was removed, and I forgot to remove it from task description, sorry!

Earlier, to decide which backend to use, backed dispatcher called probe() method on each backend, until one of them succeeds.

However later we realized that for most backends probe() will be implemented using open + check + close, so using probe() + open() leads to open + close + open again.

So probe logic was removed, and now backend dispatcher just tries to open file/device using each backend, until it succeeds.

gavv commented 9 months ago

So all you need is to implement open_device() method. If driver_type is DriverType_Device, it should just return NULL. If driver_type is DriverType_File, it should try to open file using libsndfile, and construct sink/source in case of success, or return NULL in case of failure.

Hrick87 commented 9 months ago

@Hrick87 Oh, actually probe functionality was removed, and I forgot to remove it from task description, sorry!

Earlier, to decide which backend to use, backed dispatcher called probe() method on each backend, until one of them succeeds.

However later we realized that for most backends probe() will be implemented using open + check + close, so using probe() + open() leads to open + close + open again.

So probe logic was removed, and now backend dispatcher just tries to open file/device using each backend, until it succeeds.

No worries! I thought this might be the case because as I was looking through commit history and saw that probe() used to be included in some of the backends, but wasn't anymore. I should've checked in before attempting to implement it.

Hrick87 commented 9 months ago

@gavv When running roc-recv using my sndfile implementation, watchdog reports all blank frames except for a single dropped frame (b's and a single i). My roc-send sndfile implementation works when roc-recv is using pulse audio as its backend. Thus I'm assuming two things, my roc-send implementation is correct, and my roc-recv implementation is done incorrectly.

I decided the best place to start debugging would be in my sndfile sink's write() function. I was using SoX's write() function as a reference and that's when I noticed I can't find a single instance of SoX's write() function being called anywhere. This leaves me completely confused how writing from the buffer to the newly created file or stream is occurring when using SoX.

Am I simply missing write()'s call somewhere in the code base?

gavv commented 9 months ago

write() is called from sndio::Pump, see also https://github.com/roc-streaming/roc-toolkit/blob/develop/docs%2Fsphinx%2Finternals%2Faudio_backends.rst

Watchdog monitors audio from network, before it goes to your source, so it's strange.

Note that some blank frames at the beginning of the session is normal.

Hrick87 commented 8 months ago

@gavv I have a full working implementation of sndfile and am nearly ready to make my PR, but the build fails on github actions for every single linux distro. I believe this is due to the fact that the Scons -Q command's build-3rdparty argument lacks "sndfile" on every docker image. I am quite new to github actions so maybe I am misunderstanding the issue and there is something I am doing wrong. Please let me know if it's on my end or something that needs to be fixed on your end.

Thank you!

EDIT: I see that the CI build tests are stored in the repository and that I could possibly go in and change each of them for you. I am happy to do this, or if because its beyond the scope of this issue/you want this solved a different way I won't change them.

gavv commented 8 months ago

@Hrick87 Indeed, I forgot about that. Yes, if you could add sndfile to docker images, it would be great.

We use github actions, but actual checks are run inside custom docker images, which are build from dockerfiles in this repo: https://github.com/roc-streaming/dockerfiles. You can even run those checks locally.

We'd need to add sndfile to all env-* images.

Here are some docs about CI: https://roc-streaming.org/toolkit/docs/development/continuous_integration.html

gavv commented 4 months ago

Landed.