nschlia / ffmpegfs

FUSE-based transcoding filesystem with video support from many formats to FLAC, MP4, TS, WebM, OGG, MP3, HLS, and others.
https://nschlia.github.io/ffmpegfs/
GNU General Public License v3.0

[FEATURE] Sample format for Audio files #101

Closed NyaomiDEV closed 2 years ago

NyaomiDEV commented 2 years ago

I have a bunch of 24 bit / >= 48 kHz FLAC files. I would like to transcode them to the standard CD-quality 16 bit / 44.1 kHz format, but as of now I don't see a way to specify the sample format in the man page. Therefore, I am requesting that a setting be added as follows:

--audiosamplefmt=SAMPLEFMT, -o audiosamplefmt=SAMPLEFMT
    Set sample format.  SAMPLEFMT can be:

    0, 8, 16, 24 or 32.

    SAMPLEFMT is defined in bits.

    Setting SAMPLEFMT to 0 means that the original sample format is kept.

    Default: 0

nschlia commented 2 years ago

Should be easy to implement. I will add that.

nschlia commented 2 years ago

I checked whether the parameter can be added. Actually, it is not that easy. The FFmpeg API supports these sample formats:

U8: unsigned 8 bits
S16: signed 16 bits
S32: signed 32 bits
FLT: float
DBL: double

U8P: unsigned 8 bits, planar
S16P: signed 16 bits, planar
S32P: signed 32 bits, planar
FLTP: float, planar
DBLP: double, planar
S64: signed 64 bits
S64P: signed 64 bits, planar

This does not mean that the target necessarily supports all these. And selecting a format does not necessarily create the expected result. For example, creating a flac with S32: ffmpeg -i 'in.wav' -sample_fmt s32 'out.flac' creates a 24 bit FLAC. s16 creates a 16 bit FLAC, while u8 fails.

For WAV, only s16 seems to be supported.
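
As a reading aid for the list above, here is a minimal Python sketch (not FFmpegfs code) that tabulates the FFmpeg sample format names with the storage width of one sample and whether the layout is planar. The names `SAMPLE_FORMATS` and `describe` are made up for illustration. Note that the width is the storage size only; as the FLAC example shows, it is not necessarily the bit depth of the resulting file.

```python
# Illustrative table of the FFmpeg sample format names listed above.
# Each entry: (storage bits per sample, kind, planar layout?)
SAMPLE_FORMATS = {
    "u8":   (8,  "unsigned int", False),
    "s16":  (16, "signed int",   False),
    "s32":  (32, "signed int",   False),
    "flt":  (32, "float",        False),
    "dbl":  (64, "float",        False),
    "u8p":  (8,  "unsigned int", True),
    "s16p": (16, "signed int",   True),
    "s32p": (32, "signed int",   True),
    "fltp": (32, "float",        True),
    "dblp": (64, "float",        True),
    "s64":  (64, "signed int",   False),
    "s64p": (64, "signed int",   True),
}


def describe(name):
    """Return a one-line description of a sample format name (hypothetical helper)."""
    key = name.lower()
    bits, kind, planar = SAMPLE_FORMATS[key]
    layout = "planar" if planar else "packed"
    return f"{key}: {bits} bit {kind}, {layout}"


print(describe("S32P"))
```

Keeping the planar flag separate from the width makes it easy to see that, for example, s16 and s16p differ only in memory layout, not in precision.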

NyaomiDEV commented 2 years ago

This does not mean that the target necessarily supports all these. And selecting a format does not necessarily create the expected result.

I was more or less aware of this, actually

nschlia commented 2 years ago

This does not mean that the target necessarily supports all these. And selecting a format does not necessarily create the expected result.

I was more or less aware of this, actually

I think I will add the parameter anyway, but it requires a bit more work than expected:

  1. I'll have to check the capabilities of the selected format (codec) and report an error if the desired sample format is unavailable. Otherwise transcoding would fail over and over again every time a file is opened. This would be useless.
  2. I can't implement it in the above manner. I can only allow selecting U8, S16, S32 and so on. It's up to the API what happens. So S32 would sometimes actually create 32 bit files, sometimes 24 bit or fail in other cases.

For example, with ffmpeg I could only create S16 WAV files successfully. Other formats failed... For FLAC, S16 and S32 worked. That created 16 and 24 bit files, respectively.

Well, as I said, I'll add it, but it will take some time :)

NyaomiDEV commented 2 years ago

This does not mean that the target necessarily supports all these. And selecting a format does not necessarily create the expected result. For example, creating a flac with S32: ffmpeg -i 'in.wav' -sample_fmt s32 'out.flac' creates a 24 bit FLAC.

I did a bit of research and actually when using s32 we do expect 24 bit files (padded to 32 bits) from the FLAC encoder

With ffmpeg I could only create S16 WAV files successfully.

Weird, I do have 24 bit and 32 bit WAV files on my library!

Well, as I said, I'll add it, but it will take some time :)

Sure thing!

nschlia commented 2 years ago

Weird, I do have 24 bit and 32 bit WAV files on my library!

Not weird, WAV can have 8 to 64 bits sample size:

8-bit, 16-bit, 20-bit, 24-bit, 32-bit, 64-bit

16.8 floating point, 24.0 floating point, 32-bit floating point, 64-bit floating point

To create them with FFmpeg, you have to select the correct codec, not the sample format. There is always only one sample format for each of these codecs, which is very confusing :)

Example:

    ffmpeg -i 'in.flac' -c:a pcm_s64le out-s64.wav

ffprobe reports this on the output file:

    Stream #0:0: Audio: pcm_s64le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s64, 5644 kb/s

So you can create 64 bit WAVs or 8 bit, or whatever.
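
The "pick the codec, not the sample format" idea can be sketched in Python as a lookup from the wanted WAV sample size to an FFmpeg PCM encoder name. This is only an illustration, not FFmpegfs code: the mapping `WAV_CODECS` and the helper `wav_command` are hypothetical, and the sketch only builds the argument list without running ffmpeg.

```python
import shlex

# Assumed mapping from a requested WAV sample size to an FFmpeg PCM encoder.
# WAV stores little-endian samples; 8 bit WAV is unsigned.
WAV_CODECS = {
    "8": "pcm_u8",
    "16": "pcm_s16le",
    "24": "pcm_s24le",
    "32": "pcm_s32le",
    "64": "pcm_s64le",
    "F32": "pcm_f32le",
    "F64": "pcm_f64le",
}


def wav_command(infile, outfile, size):
    """Build (but do not run) an ffmpeg argument list for the given sample size."""
    key = str(size).upper()
    if key not in WAV_CODECS:
        raise ValueError(f"no PCM codec known for sample size {size!r}")
    return ["ffmpeg", "-i", infile, "-c:a", WAV_CODECS[key], outfile]


# Show the command line for the 64 bit example from this comment.
print(shlex.join(wav_command("in.flac", "out-s64.wav", 64)))
```

Returning a list instead of a string keeps file names with spaces safe if the result is later passed to `subprocess.run`.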

nschlia commented 2 years ago

I have fiddled about with the sample format, and it turned out that setting the sample_fmt parameter does not really do what we would like it to do.

First of all, most formats only support flt.

Container Format   Sample Format
AIFF               s16
ALAC               s32p, s16p
MOV                fltp
MP3                s32p, fltp, s16p
MP4                fltp
OGG                fltp
Opus               s16, flt
ProRes             s16
TS                 fltp
WAV                s16
WebM               s16, flt

This makes things rather awkward. For WAV, I need to select the proper codec. For others, I have to select flt and probably set bits_per_raw_sample (I need to check if that works). For formats that do not support flt (ALAC), I would have to select s16, s32 etc. For ProRes, which only supports s16, there is probably no way to do anything.

This little option appears to be quite a big thing :)
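
The capability check mentioned earlier in the thread (report an error up front instead of failing every time a file is opened) could look roughly like this in Python. It is a sketch only: the data is copied from the table in this comment, and `SUPPORTED` and `check_sample_format` are hypothetical names, not part of FFmpegfs or the FFmpeg API.

```python
# Sample formats per destination, copied from the table in this comment.
SUPPORTED = {
    "AIFF":   ["s16"],
    "ALAC":   ["s32p", "s16p"],
    "MOV":    ["fltp"],
    "MP3":    ["s32p", "fltp", "s16p"],
    "MP4":    ["fltp"],
    "OGG":    ["fltp"],
    "OPUS":   ["s16", "flt"],
    "PRORES": ["s16"],
    "TS":     ["fltp"],
    "WAV":    ["s16"],
    "WEBM":   ["s16", "flt"],
}


def check_sample_format(dest, requested):
    """Return the requested format if dest supports it, else raise listing valid ones."""
    formats = SUPPORTED.get(dest.upper())
    if formats is None:
        raise ValueError(f"unknown destination type {dest!r}")
    fmt = requested.lower()
    if fmt not in formats:
        raise ValueError(
            f"{dest} does not support {fmt}; supported: {', '.join(formats)}"
        )
    return fmt


print(check_sample_format("mp3", "S16P"))
```

Doing the check once at start-up means a bad combination surfaces as a single clear error rather than as repeated transcoding failures.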

nschlia commented 2 years ago

Note:

I just created two mp3s with ffmpeg:

32 bit sample format:

    ffmpeg -i 'in.flac' -sample_fmt s32p out-s32p.mp3

This is what it reports during creation:

    Stream #0:1: Audio: mp3 (libmp3lame), 44100 Hz, stereo, s32p (16 bit)

16 bit sample format:

    ffmpeg -i 'in.flac' -sample_fmt s16p out-s16p.mp3

This is what it reports during creation:

    Stream #0:1: Audio: mp3 (libmp3lame), 44100 Hz, stereo, s16p

Both result files have the same size and are binary identical. Note the hint "s32p (16 bit)".

I am afraid the parameter could work on WAV only... Or probably on FLAC, but I'd have to add that format, though.

nschlia commented 2 years ago

Finally, I understood the whole problem. Bit depth, or "sample format", does not apply to lossy formats. The encoding process loses this information. You can feed s32 into the mp3 encoder, for example, but the encoded file does not contain this detail. In fact, encoding a 24 bit source can result in a better quality mp3 than a 16 bit original, but you won't be able to determine that from the encoded file.

Thus, only lossless formats (e.g. WAV, AIFF, ALAC and FLAC) keep that information.

This means that the current implementation in FFmpegfs will always create 16 bit AIFF or WAV, up- or downgrading the source if necessary. ALAC files will be 16 or 32 bit, depending on the source: a 32 or 16 bit source will create a 32 or 16 bit ALAC, respectively, and 8 bit will be upgraded to 16 bit as ALAC does not support 8 bit.

So for you, creating WAVs/AIFFs will always result in CD quality (16 bit) files (provided you limit the sample rate to 44.1 kHz with --audiosamplerate=44.1K).

I think I will add extra destination types, like WAV8, WAV32 or ALAC16/32 to allow selecting a bit width for these types (#106). Possibly I could add FLAC that at least allows 16 or 24 bit. See #105.

But for all lossy formats (MP3, MP4, WebM etc.), the option does not apply.

nschlia commented 2 years ago

After some research, I decided to add this parameter instead of #105. The possible settings are:

--audiosamplefmt=SAMPLEFMT, -o audiosamplefmt=SAMPLEFMT

    Set sample format. SAMPLEFMT can be:

    0 to use the predefined setting, 8, 16, 32, 64 for integer
    format, F16, F32, F64 for floating point

    Not all formats are supported by all destination types. Selecting
    an invalid format will be reported as an error, and a list of
    values will be printed. The target codec may choose a different value,
    e.g., setting 32 bit may create a 24 bit file in some cases.

    Default: 0 (The predefined format of the destination).

This option is only available for lossless formats, but not all formats support all options:

Container Format   Sample Format
AIFF               0, 16, 32
ALAC               0, 16, 24
WAV                0, 8, 16, 32, 64, F16, F32, F64
FLAC               0, 16, 24

Invalid combinations will be reported as a command line error.
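
To illustrate how such a combination check might behave, here is a small Python sketch built from the table above. It is not the FFmpegfs implementation: `ALLOWED` and `validate_audiosamplefmt` are hypothetical names, and the handling of destinations that are not in the table (accept only the default 0) is an assumption.

```python
# Allowed SAMPLEFMT values per destination, copied from the table above.
ALLOWED = {
    "AIFF": ["0", "16", "32"],
    "ALAC": ["0", "16", "24"],
    "WAV":  ["0", "8", "16", "32", "64", "F16", "F32", "F64"],
    "FLAC": ["0", "16", "24"],
}


def validate_audiosamplefmt(dest, value="0"):
    """Return the normalized SAMPLEFMT, or raise ValueError for an invalid combination."""
    fmt = str(value).upper()
    allowed = ALLOWED.get(dest.upper())
    if allowed is None:
        # Assumption: destinations outside the table only accept the default.
        if fmt == "0":
            return fmt
        raise ValueError(f"audiosamplefmt is not available for {dest}")
    if fmt not in allowed:
        raise ValueError(
            f"audiosamplefmt={fmt} is invalid for {dest}; valid: {', '.join(allowed)}"
        )
    return fmt


print(validate_audiosamplefmt("flac", 16))
```

For the original request (CD-quality output), a value of 16 passes for every destination in the table, while 24 is accepted only for ALAC and FLAC.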