JusperLee / LibriSpace

http://cslikai.cn/LibriSpace/

Is it possible to generate multi-channel RIRs? #2

Open xiaoyaoxiaoxian opened 1 month ago

JusperLee commented 1 month ago

Yes, it can be modified to produce multi-channel output. Please adjust the following line:

https://github.com/JusperLee/LibriSpace/blob/c4f2729812b1e8b8e0e084fa19eeffe5a71cd8c5/data-script/generate_mp3d_librispeech_data.py#L159

The channel layout describes how the audio output will be experienced by the listener. Let's look at channel layout types that are currently supported.

| Enum | Usage |
| --- | --- |
| Unknown | Unknown channel layout type |
| Mono | Monaural channel layout that does not have any spatial information. This layout usually has 1 channel |
| Stereo | Channel layout with 2 channels (e.g. speakers) that does not use any HRTF |
| Binaural | Channel layout with 2 channels that spatializes audio using an HRTF |
| Quad | Channel layout with 4 channels (speakers) arranged at ±30 and ±95 degrees in the horizontal plane |
| Surround_5_1 | Channel layout with 6 channels (speakers) arranged at 0, ±30, and ±110 degrees in the horizontal plane, with unpositioned low frequency channel |
| Surround_7_1 | Channel layout with 8 channels (speakers) arranged at 0, ±30, ±90, and ±135 degrees in the horizontal plane, with unpositioned low frequency channel |
| Ambisonics | Channel layout that encodes fully spherical spatial audio as a set of spherical harmonic basis function coefficients |

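
As a rough illustration of the table above, the nominal channel count per layout can be written as a lookup. This is only a sketch: `NOMINAL_CHANNELS`, `ambisonics_channels`, and `layouts_with` are hypothetical names for this example and are not part of the simulator's API. The Ambisonics count is not stated in the table; the sketch assumes the usual full-sphere relation of (order + 1)^2 channels.

```python
# Nominal channel counts copied from the layout table (illustrative only,
# not the simulator's own enum).
NOMINAL_CHANNELS = {
    "Mono": 1,
    "Stereo": 2,
    "Binaural": 2,
    "Quad": 4,
    "Surround_5_1": 6,
    "Surround_7_1": 8,
}


def ambisonics_channels(order):
    """Channel count of full-sphere Ambisonics of the given order (assumption)."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def layouts_with(num_channels):
    """Names of the fixed layouts that have exactly num_channels channels."""
    return sorted(name for name, n in NOMINAL_CHANNELS.items() if n == num_channels)


print(layouts_with(2))  # ['Binaural', 'Stereo']
```

Every entry describes a playback or listener format with a fixed channel count, which is why the layout alone does not describe an arbitrary array geometry.
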
xiaoyaoxiaoxian commented 1 month ago

Thanks for your reply. This parameter seems to support multiple speakers for playback, but what I actually want is RIRs for an arbitrary microphone array (such as 3-channel RIRs for a 3-mic array).

JusperLee commented 1 month ago

That parameter sets the microphone type. If you want a custom linear or circular array, you need to write the RIR stacking yourself using a single microphone: move it to each array position and collect one RIR per position.

The following is the sample code:

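import numpy as np

# Microphone offsets (x, y, z) in meters, relative to the agent position.
linear_mic_array = [
    [0, 0, 0],
    [0, 0, 0.04],
    [0, 0, 0.12],
    [0, 0, 0.16],
]

circular_mic_array = [
    [0, 0, -0.035],
    [0.035, 0, 0],
    [0, 0, 0.035],
    [-0.035, 0, 0],
]

mic_array = linear_mic_array  # or circular_mic_array
multi_channels = []  # one processed RIR per microphone

# agent, agent_state, init_state, audio_sensor, sim, point, and process_ir
# are assumed to be defined by the surrounding data-generation script.
for mic_idx, mic in enumerate(mic_array):
    # Move the single microphone to this array position.
    agent_state.position = init_state.position + np.array(mic)
    agent.set_state(agent_state, True)
    audio_sensor.setAudioSourceTransform(point)
    ir = np.array(sim.get_sensor_observations()["audio_sensor"])
    multi_channels.append(process_ir(ir))
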
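
The per-microphone RIRs collected in the loop above still have to be combined into one multi-channel array. A minimal sketch of that step follows, assuming each processed RIR is a mono 1-D signal whose length may differ between positions. `circular_array` and `stack_rirs` are hypothetical helpers written for this example, not functions from the repository; the circle is placed in the x-z plane with y = 0, matching the offsets listed above.

```python
import numpy as np


def circular_array(num_mics, radius):
    """Offsets (x, y, z) of mics evenly spaced on a circle in the x-z plane."""
    angles = 2.0 * np.pi * np.arange(num_mics) / num_mics
    x = radius * np.sin(angles)
    z = -radius * np.cos(angles)
    return np.stack([x, np.zeros(num_mics), z], axis=1)


def stack_rirs(rirs):
    """Zero-pad mono RIRs to a common length; returns (num_mics, num_samples)."""
    mono = [np.atleast_1d(np.squeeze(np.asarray(r, dtype=np.float32))) for r in rirs]
    if any(r.ndim != 1 for r in mono):
        raise ValueError("each RIR must be a single-channel signal")
    length = max(r.shape[0] for r in mono)
    out = np.zeros((len(mono), length), dtype=np.float32)
    for ch, r in enumerate(mono):
        out[ch, : r.shape[0]] = r  # shorter RIRs keep trailing zeros
    return out


# Placeholder RIRs of different lengths stand in for the simulator output.
offsets = circular_array(4, 0.035)
fake_rirs = [np.ones(5), np.ones(3), np.ones(4), np.ones(5)]
multi = stack_rirs(fake_rirs)
print(multi.shape)  # (4, 5)
```

Zero-padding at the end keeps the direct-path delays of all channels aligned to the same time origin, which is what preserves the inter-microphone time differences.
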
xiaoyaoxiaoxian commented 3 weeks ago

Thank you. I will give it a try.