faroit / stempeg

Python I/O for STEM audio files
https://faroit.github.io/stempeg
MIT License

Stems write - Format not recognised #19

Closed shoegazerstella closed 5 years ago

shoegazerstella commented 5 years ago

Hello,

As you state in the documentation, writing stems doesn't always work well. I am using this ffmpeg command to create a STEM file:

ffmpeg -i ~/mix.wav -i ~/drums.wav -i ~/vocals.wav -map 0 -map 1 -map 2 -c:a libfdk_aac -metadata:s:0 title=mix -metadata:s:1 title=drums -metadata:s:2 title=vocals ~/output.stem.mp4

I then read the file back with the musdb library and it works well. I was wondering whether this could be included in your library so that writing finally works properly.

Unfortunately I do not have much time to work on this further or to open a pull request, but I made a simple implementation in case it is of any help. Also check homebrew-ffmpeg if the right codecs are not available in the official ffmpeg distribution.
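For reference, the command above can be assembled programmatically. The following is only a sketch: `build_stem_command` is a hypothetical helper, not part of stempeg, and it uses only the ffmpeg flags that appear in the command above. It builds the argument list without running ffmpeg.

```python
from pathlib import Path


def build_stem_command(stems, output, codec="libfdk_aac"):
    """Return an ffmpeg argument list that maps each stem to its own stream.

    `stems` maps a stream title to an input file, in stream order.
    Hypothetical helper for illustration; not stempeg's API.
    """
    cmd = ["ffmpeg"]
    for path in stems.values():
        cmd += ["-i", str(Path(path).expanduser())]
    for index in range(len(stems)):
        cmd += ["-map", str(index)]
    cmd += ["-c:a", codec]
    for index, title in enumerate(stems):
        cmd += [f"-metadata:s:{index}", f"title={title}"]
    cmd.append(str(Path(output).expanduser()))
    return cmd


command = build_stem_command(
    {"mix": "mix.wav", "drums": "drums.wav", "vocals": "vocals.wav"},
    "output.stem.mp4",
)
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run(command, check=True)`, assuming an ffmpeg build with `libfdk_aac` is on the PATH.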

faroit commented 5 years ago

Thanks for your contribution. I don't really see a difference between the arguments you use and the ones used in write.py. Here is an example of what stempeg actually calls:

ffmpeg -y -f s16le -acodec pcm_s16le -ar 44100 -ac 2  -i mix.wav -i vocals.wav -i drums.wav -i bass.wav -i other.wav -map 0 -map 1 -map 2 -map 3 -map 4 -vn -acodec aac -ar 44100 -strict -2 -loglevel error -ab 256000 stems.mp4

Maybe you can point me to the relevant changes one would need to make to improve the write functions?

shoegazerstella commented 5 years ago

The difference should be in the codec. I'm trying this example, and there must be something wrong in the way I use stempeg, but I can't work out what:

import librosa
import numpy as np
import stempeg

y, sr = librosa.load('STEMS/mix.wav')

# CREATE STEMS
which_stem = 1

if which_stem == 1:

    # stems x samples x channels
    y_0 = y.reshape((-1, 1))
    y_1 = np.zeros((y.shape[0], 1))

else:
    # stems x channel x samples
    y_0 = y.reshape((1, -1))
    y_1 = np.zeros((1, y.shape[0]))

# CREATE TENSOR
S = np.array([y_0, y_1])

# WRITE STEMS
output_wav = 'STEMS/stem_OUT.mp4'
stempeg.write_stems(S, output_wav, rate=sr)

For some reason, with the stems x samples x channels ordering, the generated .mp4 file plays at twice the speed it should, although the individual .wav files are correct. The stems x channels x samples ordering, which should be the format required by stempeg.write, results in an error because the array has the wrong size.

Error opening '/var/folders/zk/xlbb09bs2j3_wrxkj0fwbhvc0000gn/T/tmp2z3t3wiv.wav': Format not recognised.

Do you see any error in what I am doing? Thanks a lot!

faroit commented 5 years ago

> The difference should be in the codec.

You mean the output codec (acodec)? I also check whether libfdk_aac is available and use it if so, so that doesn't look different.

https://github.com/faroit/stempeg/blob/ebbaec87ea440fcbb06423d708e7847749e63d38/stempeg/write.py#L79-L84

> For some reason, using stems x samples x channels, the generated .mp4 file runs twice the speed it should be, but the single .wav files are correct. Do you see any error in what I am doing?

Are you writing mono files? The STEMS format is stereo only, which is why it's hardcoded here. I might need to add a check for this to warn users...
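Such a check could look roughly like the sketch below. It is not stempeg's code: `check_stereo` is a hypothetical helper, and it assumes the channels sit on the last axis of a (stems, samples, channels) array.

```python
import warnings

import numpy as np


def check_stereo(stems):
    """Warn and return False unless `stems` has shape (stems, samples, 2).

    Hypothetical helper; assumes channels are on the last axis.
    """
    stems = np.asarray(stems)
    if stems.ndim != 3 or stems.shape[-1] != 2:
        warnings.warn(
            f"Expected shape (stems, samples, 2), got {stems.shape}; "
            "the STEMS format is stereo only."
        )
        return False
    return True


mono = np.zeros((2, 44100, 1))    # two mono stems
stereo = np.zeros((2, 44100, 2))  # two stereo stems

# Capture the warning so the example can inspect it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mono_ok = check_stereo(mono)
stereo_ok = check_stereo(stereo)
```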

shoegazerstella commented 5 years ago

Yes, in this case it is mono. I understand, thanks a lot for the clarification.
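One way to get a stereo stems array from a mono signal is to duplicate the single channel. This is a minimal sketch with placeholder audio; `mono_to_stereo_stem` is a hypothetical helper, and the stems x samples x channels layout is an assumption. Note that `librosa.load` returns a mono signal by default, which is how the example earlier in this thread ended up mono.

```python
import numpy as np


def mono_to_stereo_stem(y):
    """Turn a mono signal of shape (samples,) into a (samples, 2) array."""
    y = np.asarray(y)
    return np.stack([y, y], axis=-1)


sr = 44100
# One second of placeholder audio standing in for a loaded mono file.
y = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)

mix = mono_to_stereo_stem(y)
silence = np.zeros_like(mix)
S = np.stack([mix, silence])  # stems x samples x channels
print(S.shape)  # (2, 44100, 2)
```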

shoegazerstella commented 5 years ago

Hi, sorry for reopening the issue.

I am now saving the audio with 2 channels as follows:

from pydub import AudioSegment

def save_stereo(filename):
    # Re-export the file in place as a 2-channel WAV.
    sound = AudioSegment.from_wav(filename)
    sound = sound.set_channels(2)
    sound.export(filename, format="wav", bitrate="256k", parameters=["-ac", "2"])

Then I collect the stems and create S, but the error still pops up. I checked that nothing goes wrong in the numpy array by saving it and listening back, and it sounds okay.

mix: (2, 13340160) 
drums: (2, 13340160) 
bass: (2, 13340160) 
accompaniment: (2, 13340160) 
vocals: (2, 13340160)

S.shape: (5, 2, 13340160)
stempeg.write_stems(S, output_mp4, rate=sr_mix)

Error opening '/var/folders/zk/xlbb09bs2j3_wrxkj0fwbhvc0000gn/T/tmpwqt3zrer.wav': Format not recognised.

I also have these warnings. Could the error above be related to those?

/usr/local/lib/python3.7/site-packages/stempeg/write.py:84: UserWarning: For better quality, please install libfdc_aac
  warnings.warn("For better quality, please install libfdc_aac")
/usr/local/lib/python3.7/site-packages/stempeg/write.py:96: UserWarning: Number of samples does not divide by 1024, be aware that the AAC encoder add silence to the input signal
  "Number of samples does not divide by 1024, be aware that "
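The second warning is consistent with the shapes above: 13340160 % 1024 is 512, so the sample count is not a multiple of 1024. As a sketch only, the padding can be made explicit before writing. `pad_to_frame` is a hypothetical helper, it assumes a (stems, samples, channels) layout, and it does not remove the added silence; it just puts it under the caller's control.

```python
import numpy as np

FRAME = 1024  # block size named in the warning


def pad_to_frame(stems, frame=FRAME):
    """Zero-pad the samples axis (axis 1) up to the next multiple of `frame`."""
    stems = np.asarray(stems)
    remainder = stems.shape[1] % frame
    if remainder == 0:
        return stems
    pad = frame - remainder
    # Pad only at the end of the samples axis; np.pad fills with zeros by default.
    return np.pad(stems, ((0, 0), (0, pad), (0, 0)))


S = np.zeros((5, 2500, 2))  # small stand-in array
padded = pad_to_frame(S)
print(padded.shape)  # (5, 3072, 2)
```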

shoegazerstella commented 5 years ago

[UPDATE] Correct me if I am wrong, but could it be related to https://github.com/bastibe/SoundFile/issues/203 ? If so, stems x samples x channels should be the right format.

faroit commented 5 years ago

Has this issue been addressed?

shoegazerstella commented 5 years ago

Sorry, I solved it by using the stems x samples x channels ordering. Closing now.
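For anyone landing here with an array in stems x channels x samples order, the reordering is a single transpose. A minimal sketch, where `to_samples_channels` is a hypothetical helper name:

```python
import numpy as np


def to_samples_channels(stems):
    """Reorder (stems, channels, samples) into (stems, samples, channels)."""
    stems = np.asarray(stems)
    if stems.ndim != 3:
        raise ValueError(f"expected a 3-D array, got shape {stems.shape}")
    return np.transpose(stems, (0, 2, 1))


S = np.zeros((5, 2, 4096))  # short stand-in for the (5, 2, 13340160) array above
S_fixed = to_samples_channels(S)
print(S_fixed.shape)  # (5, 4096, 2)
```

The reordered array can then be passed to the same call used earlier in the thread, `stempeg.write_stems(S_fixed, output_mp4, rate=sr_mix)`.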