LCAV / pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
https://pyroomacoustics.readthedocs.io
MIT License
1.43k stars, 427 forks

AttributeError: 'numpy.ndarray' object has no attribute 'images' #123

Closed: xiaojian10 closed this issue 2 years ago

xiaojian10 commented 5 years ago
  1. I found that in your example code, the beamformer weights are not consistent with the definition. For example: [image] But in the definition: [image]

  2. I then tried to use the method you defined to compute the weights, but I got some errors. I tried many ways to solve them, and all failed. [image] [image] [image] If I define some of the weight parameters the way the sample code does, the code no longer reports an error, but the beamformed output has no sound. I don't know the reason for this. Could you explain it to me when you have time?

fakufaku commented 5 years ago

Dear @xiaojian10, before I try to answer the questions, just a few points. To make it easier to answer your question, it is best to

  1. Type a full example of the code that causes the issue. This is preferably a minimum example needed to reproduce the issue, but you can also just put all your code.
  2. Do not use screenshots. If you do and I want to run the problematic code, I need to re-type it myself, which is inconvenient.

Now, without seeing the code, I am only guessing, but the first error might happen if you don't run the room simulation. In that case the list of images is never created, which could cause the error you reported.

The second issue, the output having no sound, could have many different causes, but a common one is simply that the amplitude of the signal is very small. Usually, before saving to wav, you want to normalize the signal to the range [-1, 1] if saving as float, or [-2**15, 2**15 - 1] if saving as int16.
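
As a minimal sketch of that normalization step, assuming NumPy is available: the helper name `normalize_for_wav` is hypothetical (it is not part of pyroomacoustics), and it scales by the peak absolute value so that the loudest sample uses the full range of the target format.

```python
import numpy as np


def normalize_for_wav(signal, dtype="int16"):
    """Scale a signal so its peak fills the range of the target WAV sample type.

    Hypothetical helper for illustration; not a pyroomacoustics function.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Peak absolute amplitude; treat an empty signal as silent.
    peak = np.max(np.abs(signal)) if signal.size else 0.0
    # Avoid dividing by zero when the signal is all zeros.
    scaled = signal / peak if peak > 0 else signal

    if dtype == "float32":
        # Float WAV files expect samples in [-1, 1].
        return scaled.astype(np.float32)
    if dtype == "int16":
        # int16 spans [-2**15, 2**15 - 1]; use the symmetric part of it.
        return np.round(scaled * (2**15 - 1)).astype(np.int16)
    raise ValueError("dtype must be 'float32' or 'int16'")


# A very quiet signal, similar to what a beamformer output might look like.
quiet = np.array([0.0, 1e-4, -2e-4])
as_int16 = normalize_for_wav(quiet, dtype="int16")
as_float = normalize_for_wav(quiet, dtype="float32")
```

The result can then be passed to `scipy.io.wavfile.write(filename, fs, data)`. If the file is still silent after this step, the cause is probably elsewhere.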

xiaojian10 commented 5 years ago

Thank you very much for your suggestion. My code is as follows. Operating environment: Jupyter notebook, Python 3.5.

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import fftconvolve
import IPython
import pyroomacoustics as pra
# specify signal and noise source
fs, signal = wavfile.read("arctic_a0010.wav")
fs, noise = wavfile.read("exercise_bike.wav")  # may spit out a warning when reading but it's alright!
# Create 6x6 shoebox room with source and interferer and simulate
Lg_t = 0.100                # filter size in seconds
Lg = np.ceil(Lg_t*fs)       # in samples
room_bf = pra.ShoeBox([6,6], fs=fs, max_order=12)
source = np.array([3, 1])
interferer = np.array([3, 5])
room_bf.add_source(source, delay=0., signal=signal)
room_bf.add_source(interferer, delay=0., signal=noise[:len(signal)])
# Create geometry equivalent to Amazon Echo
center = [3, 3]; radius =0.027
fft_len = 512; M = 4;
echo = pra.circular_2D_array(center=center, M=4, phi0=0, radius=radius)
echo = np.concatenate((echo, np.array(center, ndmin=2).T), axis=1)
#pra.beamforming.circular_2D_array(center, M, phi0=0, radius=radius)
mics = pra.Beamformer(echo, room_bf.fs, N=fft_len, Lg=Lg)
room_bf.add_microphone_array(mics)
R_n = (M * Lg)*(M * Lg)
# Compute  weights
#mics.rake_delay_and_sum_weights(room_bf.sources[0][:1])
#mics.rake_delay_and_sum_weights(source,interferer, R_n=None, attn=True, ff=False)
mics.rake_max_sinr_weights(room_bf.sources[0][:1])
#mics.rake_max_sinr_weights(source, interferer, R_n, rcond=0.0, ff=False, attn=True)
room_bf.compute_rir()
room_bf.simulate()
# plot the room and resulting beamformer before simulation
fig, ax = room_bf.plot(freq=[500, 1000, 2000, 4000], img_order=0)
ax.legend(['500', '1000', '2000', '4000'])
fig.set_size_inches(20, 8)

The above code is your sample code. The beamformer you are using is DAS, but I want to use other types of beamformers. My understanding is that to switch to a different beamformer I only need to change how the weights are computed, but maybe I am wrong. If you run the above code, the same error as before occurs. Could you help clear up my confusion when you have time?

fakufaku commented 5 years ago

One more tip 😄 : you can use code block formatting to enhance readability. I have updated your reply above in this way.

Now, I have tried to run the above code sample and it doesn't produce the same error for me. I see above that you are using a notebook. Notebooks are well known for their hidden state (because the code is split into cells that can be run in arbitrary order), so maybe that was the cause? You could try to paste all this into a script and run it from the command line. Here is the slightly modified code that produces the expected output from the example.

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import fftconvolve
import IPython
import pyroomacoustics as pra

# specify signal and noise source
# >> Modified to use samples available in pyroomacoustics examples
fs, signal = wavfile.read("examples/input_samples/cmu_arctic_us_aew_a0001.wav")
fs, noise = wavfile.read("examples/input_samples/doing_the_dishes.wav")

# Create 6x6 shoebox room with source and interferer and simulate
Lg_t = 0.100                # filter size in seconds
Lg = np.ceil(Lg_t*fs)       # in samples
room_bf = pra.ShoeBox([6,6], fs=fs, max_order=12)
source = np.array([3, 1])
interferer = np.array([3, 5])
room_bf.add_source(source, delay=0., signal=signal)
room_bf.add_source(interferer, delay=0., signal=noise[:len(signal)])

# Create geometry equivalent to Amazon Echo
center = [3, 3]; radius =0.027
fft_len = 512; M = 4;
echo = pra.circular_2D_array(center=center, M=4, phi0=0, radius=radius)
echo = np.concatenate((echo, np.array(center, ndmin=2).T), axis=1)
mics = pra.Beamformer(echo, room_bf.fs, N=fft_len, Lg=Lg)
room_bf.add_microphone_array(mics)
R_n = (M * Lg)*(M * Lg)

# Compute  weights
# >> This function also requires the R_n parameter when there is only one source
mics.rake_max_sinr_weights(room_bf.sources[0][:1], R_n=np.eye(mics.M))
room_bf.compute_rir()
room_bf.simulate()

# plot the room and resulting beamformer before simulation
fig, ax = room_bf.plot(freq=[500, 1000, 2000, 4000], img_order=0)
ax.legend(['500', '1000', '2000', '4000'])
# >> I was getting an empty figure and had to get rid of the line below
# fig.set_size_inches(20, 8)  
plt.show()
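
For intuition about the `R_n=np.eye(mics.M)` argument in the code above, here is a rough sketch of the textbook narrowband max-SINR solution for a single source, where the weights are proportional to the inverse noise covariance applied to the steering vector. This is not necessarily how pyroomacoustics computes its weights internally. The helpers `max_sinr_weights` and `sinr`, the random steering vectors, and the interferer strength are all assumptions made for illustration.

```python
import numpy as np


def max_sinr_weights(a, R_n):
    """Textbook max-SINR weights for one source, scaled so that w^H a = 1.

    Hypothetical helper for illustration; not the pyroomacoustics implementation.
    """
    a = np.asarray(a, dtype=complex)
    Rinv_a = np.linalg.solve(R_n, a)      # R_n^{-1} a without forming the inverse
    return Rinv_a / (a.conj() @ Rinv_a)   # distortionless normalization


def sinr(w, a, R_n):
    """Output SINR for unit-power source with steering vector a and noise covariance R_n."""
    signal_power = np.abs(w.conj() @ a) ** 2
    noise_power = np.real(w.conj() @ R_n @ w)
    return signal_power / noise_power


M = 5                                     # number of microphones
rng = np.random.default_rng(0)
# Illustrative unit-modulus steering vectors (random phases, not a real geometry).
a = np.exp(-1j * rng.uniform(0, 2 * np.pi, M))
b = np.exp(-1j * rng.uniform(0, 2 * np.pi, M))

# White noise (identity covariance): the weights reduce to a / M, i.e. delay-and-sum.
R_white = np.eye(M)
w_white = max_sinr_weights(a, R_white)

# Adding a strong interferer to the covariance changes the optimal weights.
R_colored = np.eye(M) + 10.0 * np.outer(b, b.conj())
w_colored = max_sinr_weights(a, R_colored)
```

Under these assumptions, an identity `R_n` yields delay-and-sum weights, so a different beamformer only emerges once the covariance describes the interference or noise actually present.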