Open devnoname120 opened 1 year ago
Phew, thank you for this huge and detailed investigation.
I (personally) do not have ANY use case for this - fading in does indeed modify the audio in a way I never would like to have it. Furthermore I don't think this is really an issue... more like a detailed guide to achieve something.
The --ffmpeg-param thing was a quick and dirty approach to provide some extended feature, but it was a really, REALLY bad idea. It causes more issues than it solves in my opinion.
What I should have done instead was to provide a small plugin API to modify commands before they get executed. Example:
    // my-plugin.php
    m4btool_register_command_plugin(function(array $command, CommandContext $context) {
        // leave anything that is not an ffmpeg command untouched
        if(!in_array("ffmpeg", $command, true)) {
            return $command;
        }
        // modify command as you wish
        // ....
        // then return it
        return $command;
    });
And then running
m4b-tool merge --command-plugin="my-plugin.php" ....
What do you think? Would this be better for your use case?
Hmm, I think that a plugin API would still have a learning curve and wouldn't be very convenient for one-off solutions. Just like with --ffmpeg-param, you would need to understand which commands m4b-tool runs and in which order. You'd additionally have to figure out how to patch the array, making sure that you only apply the changes at the right steps of the process.
A plugin API could definitely be useful if you plan on welcoming plugin contributions. But then it would require substantial effort to maintain these plugins, considering that they would patch the command (not necessarily in clean or future-proof ways).
In my case the hardest part was figuring out where/how the ffmpeg commands were built, discovering that --ffmpeg-param didn't behave the way I assumed it would, and finally deciding that it would just be less effort to add an echo right before the ffmpeg commands get executed so that I can just grab the commands and modify them manually.
I think that a great starting point would be to print the ffmpeg commands that m4b-tool runs (maybe by default to make them easier to discover). People who want custom behaviors could just use --dry-run, modify the ffmpeg commands, and manually run them. If they want to contribute the feature back to m4b-tool they can add a new option and do a PR.
What do you think?
Oh that is easy. Just use --debug. Maybe it would be nice to have ONLY the commands printed, so an option like --command-logfile or something may be the solution for this.
@sandreas Does it also work with FDK AAC? iirc the command was built differently but I'm not sure if the debug log works anyway or not.
I spent quite a bit of time and many attempts figuring out how to add a fade in/out effect between MP3s merged into an M4B. I share my solution here for future visitors. Note that this solution could easily be natively integrated in m4b-tool but my schedule is very busy and unfortunately I don't have the bandwidth to do a pull request.
My requirements:
ffmpeg -f concat -c copy run by m4b-tool is required to preserve them.
My solution:
Note: For the conversion step I directly use an FFmpeg command (ffmpeg -i {} -f lavfi -i […]) instead of m4b-tool for two reasons:
1) m4b-tool silently ignores the --ffmpeg-param for the Fraunhofer FDK AAC (libfdk_aac) codec (!) because m4b-tool directly runs ffmpeg instead of using the Ffmpeg.php executable abstraction. --ffmpeg-param is properly applied when using the native FFmpeg AAC Encoder (aac) codec. I use the Fraunhofer FDK AAC codec as it has a better encoding quality for a given bitrate compared to the native aac encoder.
2) The --ffmpeg-param option of m4b-tool indiscriminately applies to both the conversion step (when using the native FFmpeg AAC Encoder) and the merge step (no matter what). This is due to the fact that they both use the Ffmpeg.php executable abstraction. The options -f concat -c copy used by m4b-tool for the merge aren't compatible with FFmpeg filters. Removing these options would both force a re-encoding (which degrades the sound quality) and drop the individual metadata of each converted file (they are preserved thanks to the -f concat -c copy options).
Explanations: The interesting parts are the following options in the first line. They add a fade-in + fade-out effect during the conversion itself, without an extra re-encoding step, thanks to a filtergraph:
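Assembled from the three steps of the breakdown, the options would look roughly like the sketch here. The file names and the -c:a libfdk_aac encoder option are assumptions for illustration (the post only says that the FDK AAC codec was used, and its own command uses a {} placeholder for the input file). The script only prints the command so it can be reviewed before running it.

```shell
#!/bin/sh
# Filtergraph assembled from the three steps described in the breakdown:
# fade-in, trimmed silence, cross-fade into the silence.
FILTER='[0]afade=t=in:d=1:curve=tri[a];[1]atrim=0:0.7[t];[a][t]acrossfade=d=0.7:o=1:c1=tri:c2=nofade'

# Placeholder file names (assumptions, not from the original post).
INPUT="chapter01.mp3"
OUTPUT="chapter01.m4a"

# Encoder option is an assumption; it requires an ffmpeg build with libfdk_aac.
CMD="ffmpeg -i \"$INPUT\" -f lavfi -i anullsrc -filter_complex \"$FILTER\" -c:a libfdk_aac \"$OUTPUT\""

# Print instead of executing, so the command can be inspected or edited first.
printf '%s\n' "$CMD"
```

The unlabeled output of acrossfade is what ffmpeg writes to the output file, so no explicit -map is needed in this sketch.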
Detailed breakdown for the curious:

ffmpeg is provided with two stream inputs:
- The audio file to convert: -i {}.
- A virtual input (-f lavfi) that just inputs silent audio (-i anullsrc). Check Step 3 to see why we need it.

1. [0]afade=t=in:d=1:curve=tri[a] adds a fade-in effect at the start of the decoded file.
   - [0] is used as the input of the afade filter command. It corresponds to the first input passed to FFmpeg. Note that we can't use this filter for the fade-out at the end of the input stream as we would need to provide an absolute time offset in the stream, which we can't calculate within the filter pipeline. (Filter streams are non-rewindable and the afade filter command doesn't support time offsets relative to the end of the stream.)
   - t=in for a fade-in effect. Since no start time st is specified, the effect applies at the beginning of the file.
   - d=1 means that the fade-in effect has a total duration of 1 second.
   - curve=tri to select a triangular (linear) fade-in transition function.
   - [a] to direct the output of this step to a named stream a.
2. [1]atrim=0:0.7[t] cuts the anullsrc virtual silent stream to last 0.7 seconds. It needs to have the same duration as the one specified by the d parameter of acrossfade in the next filter.
   - [1] is the input of the atrim filter command. This corresponds to the second input passed to FFmpeg, here anullsrc.
   - 0:0.7 is the trim window. Here the trim will only keep the first 0.7 seconds of the silent anullsrc stream.
3. [a][t]acrossfade=d=0.7:o=1:c1=tri:c2=nofade adds a cross-fade effect at the end of the decoded stream [a] + start of the second stream [t]. I use a trick (detailed below) to make it only add a fade-out effect at the end of [a] without changing its duration.
   - [a] is used as the first input of acrossfade. It corresponds to the output of the first step, i.e. the decoded file stream with a fade-in effect at the start.
   - [t] is used as the second input of acrossfade. It corresponds to the output of the trim in the second step, i.e. a silent stream with a duration of 0.7 seconds.
   - d=0.7 is the duration of the fade-out effect. It's important for it to be equal to the atrim length of Step 2.
     - If it is longer than the atrim step then the effect won't be applied at all (the second input stream needs to have a duration that is at least as long as the crossfade effect).
     - If it is shorter than the atrim step, then a silence is added to the end of the output stream. We don't want to increase the duration of the stream and add a silent section at the end, but instead only add a fade-out effect.
   - o=1 means that the two streams should overlap during the cross-fade (fade out the first stream and fade in the second stream at the same time). This is the main trick of this filter_complex pipeline.
     - [a] fades out while [t] fades in at the same time.
     - When the [a] fade-out is done, the [t] silent stream fade-in is also over (because we trim it to 0.7s which is also the cross-fade duration).
     - [t] is silent so the fade-in of [t] doesn't affect the output stream (a no-op).
   - c1=tri to select a triangular (linear) fade-out transition for the first stream.
   - c2=nofade to select an identity curve for the fade-in transition of the second stream. The choice of this curve shouldn't matter as the [t] stream is silent anyway.
   - The output of acrossfade is the output of the whole filter pipeline.
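
Since the atrim length and the acrossfade duration have to stay equal, one way to avoid editing them separately is to generate the filtergraph from two numbers. This is only a sketch: build_fade_filter is a hypothetical helper name, not something m4b-tool or FFmpeg provides, and it does no validation of its arguments.

```shell
#!/bin/sh
# build_fade_filter FADE_IN FADE_OUT   (hypothetical helper, durations in seconds)
# Prints a filtergraph in which the silent stream's length (atrim) and the
# cross-fade duration (acrossfade d) always come from the same FADE_OUT value.
build_fade_filter() {
    fade_in="$1"
    fade_out="$2"
    # Step 1: fade-in on the first input.
    printf '[0]afade=t=in:d=%s:curve=tri[a];' "$fade_in"
    # Step 2: trim the silent second input to the fade-out length.
    printf '[1]atrim=0:%s[t];' "$fade_out"
    # Step 3: overlapping cross-fade into the silence, i.e. a plain fade-out.
    printf '[a][t]acrossfade=d=%s:o=1:c1=tri:c2=nofade\n' "$fade_out"
}

# With the values used in this post (1 s fade-in, 0.7 s fade-out):
FILTER="$(build_fade_filter 1 0.7)"
printf '%s\n' "$FILTER"
```

The printed string can then be passed as the -filter_complex argument in place of the hand-written one.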