IoSR-Surrey / untwist

RPCA should return Spectrograms #2

Closed ghost closed 7 years ago

ghost commented 7 years ago

Greetings,

I am trying to follow your rpca_example.py and at M = RatioMask(np.abs(S),np.abs(L)) or M = RatioMask(S,L)

I get:

    421     def __new__(cls, target, background, p = 1):
    422         tm = target.magnitude() + eps
--> 423         bm = background.magnitude() + eps
    424         mask = (tm**p / (tm + bm)**p).astype(types.float)
    425         instance = TFMask.__new__(cls, mask)

AttributeError: 'numpy.ndarray' object has no attribute 'magnitude'

I installed untwist via conda on macOS Sierra 10.12.4. I do not understand the error above, since the previous line, (L,S) = rpca.process(X.magnitude()), ran fine.

Any help is greatly appreciated

deeuu commented 7 years ago

Hello,

I've just run the same example (rpca_example.py) with an appropriate wav file and have no problems.

From the error:

AttributeError: 'numpy.ndarray' object has no attribute 'magnitude'

My guess is that S and/or L are not of type Spectrogram (from untwist.data.audio), which they should be for input to a RatioMask.
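
One quick way to test this guess is to check whether each object exposes a magnitude() method before building the mask. The following is a minimal sketch in plain Python; the helpers has_magnitude and describe and the FakeSpectrogram stand-in are hypothetical names for illustration and are not part of untwist:

```python
def has_magnitude(obj):
    """Return True if obj exposes a callable magnitude() method."""
    return callable(getattr(obj, "magnitude", None))


def describe(name, obj):
    """Summarise an object's type and whether it offers magnitude()."""
    status = "ok" if has_magnitude(obj) else "missing magnitude()"
    return "{}: {} ({})".format(name, type(obj).__name__, status)


# Hypothetical stand-in that behaves like a spectrogram-style object.
class FakeSpectrogram(object):
    def __init__(self, values):
        self.values = values

    def magnitude(self):
        return [abs(v) for v in self.values]


S = FakeSpectrogram([1 + 1j, -2.0])
L = [0.5, 0.25]  # plain container, no magnitude() method

print(describe("S", S))
print(describe("L", L))
```

In the real script, calling describe("S", S) and describe("L", L) right after rpca.process(...) should show whether either result is a plain numpy.ndarray rather than a Spectrogram.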

Can you post the exact code you are having problems with?

Cheers

Dingdong8187 commented 7 years ago

I'm facing the same issue. I didn't change the code; I just passed in a wav file containing a mixture of repetitive music and a person talking, as specified in the example.

Error:

Traceback (most recent call last):
  File "/home/usama/.config/spyder/temp.py", line 18, in <module>
    M = RatioMask(S, L)
  File "/home/usama/anaconda2/lib/python2.7/site-packages/untwist/data/audio.py", line 423, in __new__
    bm = background.magnitude() + eps
AttributeError: 'numpy.ndarray' object has no attribute 'magnitude'

deeuu commented 7 years ago

Hi,

Please share your code (even if it is identical to the example) and I'll take another look.

Dingdong8187 commented 7 years ago

This is the code:

import numpy as np
import matplotlib.pyplot as plt
from untwist.data import Wave, RatioMask
from untwist.transforms import STFT, ISTFT
from untwist.factorizations import RPCA

stft = STFT()
istft = ISTFT()
rpca = RPCA(iterations = 100)

# Try with vocals over repetitive music background
x = Wave.read("/home/usama/mix3.wav")
X = stft.process(x[:,0])

# this will take some time
(L,S) = rpca.process(X.magnitude())

M = RatioMask(S, L)
v = istft.process(X * M)
v.write("vocal_estimate.wav")

plt.subplot(4,1,1)
X.plot(label_x = False, title = "mixture")
plt.subplot(4,1,2)
L.plot(label_x = False, title = "L")
plt.subplot(4,1,3)
S.plot(label_x = False, title = "S")
plt.subplot(4,1,4)
M.plot(title="estimated mask")
plt.show()

Dingdong8187 commented 7 years ago

This is the wav file I'm using:

mix3.wav

deeuu commented 7 years ago

Sorry for the delay.

I've updated the master branch and rpca_example.py should now run successfully.

@stefanokalonaris Turns out I was testing on the development branch.

Dingdong8187 commented 7 years ago

Hi, sorry, but I get the same error as before. It was able to calculate target.magnitude() but not background.magnitude(), so I thought the error occurred because rpca.process() returned an empty 'L', but that is not the case.

Tried to check with a different audio file (mix of a person talking and a sine wave): mix5.wav.zip

But no luck.

Dexter123193 commented 7 years ago

Hi. I have some questions regarding RPCA. Why does the RPCA script give only one sound as output, even when a wav file containing more than one sound is passed? Also, whenever I pass a .wav file, the vocal estimate sounds the same as the input. Any help would be appreciated. Regards.

deeuu commented 7 years ago

@Dingdong8187 Hi, have you reinstalled the package from master? I can successfully run the following snippet on a fresh install of Python 3.5 using the first 3 seconds of your mix3.wav (input to RPCA must be mono):

import numpy as np
import matplotlib.pyplot as plt
from untwist.data import Wave, RatioMask
from untwist.transforms import STFT, ISTFT
from untwist.factorizations import RPCA

stft = STFT()
istft = ISTFT()
rpca = RPCA(iterations = 100)

# Try with vocals over repetitive music background
x = Wave.read("mix3.wav")
X = stft.process(x[:3*44100, 0])

# this will take some time
(L,S) = rpca.process(X.magnitude())

M = RatioMask(S, L)

deeuu commented 7 years ago

@Dexter123193 Hi, If you run the example above, the output is not perceptually identical to the input.

Regarding the number of estimates, you have the background music and the vocals (see here); our example just writes the latter out to a wav file.
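
For context, the mask code quoted in the traceback earlier in this thread appears to compute tm**p / (tm + bm)**p. Under that reading, and with p = 1, swapping the two arguments gives the complementary mask for the background, so the two masks sum to one in every bin. A rough sketch in plain Python on flat lists of magnitudes; ratio_mask here is a hypothetical stand-in rather than the untwist class, and the EPS value is an assumption:

```python
EPS = 1e-12  # assumed small constant; the library's actual eps may differ


def ratio_mask(target, background, p=1):
    """Element-wise tm**p / (tm + bm)**p for two equal-length magnitude lists."""
    if len(target) != len(background):
        raise ValueError("target and background must have the same length")
    mask = []
    for t, b in zip(target, background):
        tm = abs(t) + EPS
        bm = abs(b) + EPS
        mask.append(tm ** p / (tm + bm) ** p)
    return mask


S = [3.0, 0.0, 1.0]  # toy "sparse" (vocal) magnitudes
L = [1.0, 2.0, 1.0]  # toy "low-rank" (background) magnitudes

vocal_mask = ratio_mask(S, L)
background_mask = ratio_mask(L, S)

# With p = 1 the two masks are complementary in every bin.
for v, b in zip(vocal_mask, background_mask):
    assert abs(v + b - 1.0) < 1e-9
```

Applied to the mixture in the same way as X * M in the example, the second mask would in principle yield the background estimate instead of the vocals.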

Either way, we could do with a good working example for this processor using an appropriate audio file (@g-roma).

@Dexter123193 Please create a new issue if you have further questions as I'm about to close this one.

Cheers

Dingdong8187 commented 7 years ago

Thank you for the help, the example is working now.