facebookresearch / sound-spaces

A first-of-its-kind acoustic simulation platform for audio-visual embodied AI research. It supports training and evaluating multiple tasks and applications.
https://soundspaces.org
Creative Commons Attribution 4.0 International

Inquiry on Using Custom Mesh in SS 2.0 and Related File Formats #107

Open iyeon915 opened 1 year ago

iyeon915 commented 1 year ago

Hi @ChanganVR, I would like to seek further clarification regarding the use of my custom mesh in SS 2.0. I have created a fairly simple mesh using the code below and generated a JSON file containing semantic labels for the mesh, such as 'floor' and 'ceiling'.

Nevertheless, the semantic labels do not appear to be functioning as expected. Although SS can simulate RIRs, the absorption properties of the acoustic materials do not seem to take effect. To test this, I set the absorption coefficients of the acoustic materials corresponding to the semantic labels 'floor', 'ceiling', and 'wall' to 0.99, but without success.
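As a quick sanity check, the snippet below only confirms that the labels I use appear somewhere in the material config; it is a plain string search and assumes nothing about the config's schema (the path matches the one passed to setAudioMaterialsJSON in the simulation code further down):

import json

# Minimal sanity check: do the semantic labels used in my mesh appear
# anywhere in the material config? This is a plain string search and does
# not assume anything about the config's exact schema.
with open("utils/mp3d_material_config.json") as f:
    config_text = json.dumps(json.load(f))

for label in ("floor", "ceiling", "wall"):
    print(label, "found in material config:", label in config_text)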

I would be grateful if you could provide guidance on the following questions:

1) How can I incorporate acoustic properties into my custom mesh?

2) Could you kindly explain the purpose of the files that accompany the mp3d data (17DRP5sb8fy)? The scene folder includes .house, _semantic.ply, and .navmesh files, and to my knowledge the PLY format does not support semantic information. I am interested in the roles of the .house and _semantic.ply files and in the procedure for assigning semantic label information to my custom GLB mesh file (see the inspection sketch after these questions).

3) Is there an example JSON format for assigning semantic information to a simple custom GLB mesh like mine?
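To make question 2 concrete, here is how one could inspect what a _semantic.ply actually stores; a small sketch assuming the third-party plyfile package, with an illustrative file name:

from plyfile import PlyData

# List which elements and per-element properties the semantic PLY carries.
# PLY is a container format, so semantic IDs can be stored as extra
# per-face or per-vertex properties.
ply = PlyData.read("17DRP5sb8fy_semantic.ply")
for element in ply.elements:
    print(element.name, [prop.name for prop in element.properties])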

Thank you for your time and assistance. I look forward to your insights on this matter.

P.S. I kindly request your understanding, as my primary expertise is in acoustics and audio and I am fairly new to mesh-related concepts and formats. Any guidance or resources you could provide would be greatly appreciated.

## code for building the mesh ##
import json

import numpy as np
import trimesh

# Eight corners of a 6 m cube centered at the origin.
all_corners3d = np.array([
    [-3, -3, -3],
    [3, -3, -3],
    [3, 3, -3],
    [-3, 3, -3],
    [-3, -3, 3],
    [3, -3, 3],
    [3, 3, 3],
    [-3, 3, 3]
])

# Quad faces (vertex indices into all_corners3d); note that trimesh
# triangulates (n, 4) faces automatically when a Trimesh is constructed.
floor_faces = np.array([
    [0, 1, 2, 3],
])
ceiling_faces = np.array([
    [4, 5, 6, 7],
])
side_faces = np.array([
    [0, 1, 5, 4],
    [1, 2, 6, 5],
    [2, 3, 7, 6],
    [3, 0, 4, 7]
])

floor_mesh = trimesh.Trimesh(vertices=all_corners3d, faces=floor_faces)
ceiling_mesh = trimesh.Trimesh(vertices=all_corners3d, faces=ceiling_faces)
side_mesh = trimesh.Trimesh(vertices=all_corners3d, faces=side_faces)

# Concatenate the three parts into a single mesh.
mesh = floor_mesh + ceiling_mesh + side_mesh

# Per-part face ids (quad indices, before triangulation) and semantic labels.
metadata = {
    "floor": {
        "id": list(range(len(floor_faces))),
        "label": "floor",
    },
    "ceiling": {
        "id": list(range(len(floor_faces), len(floor_faces) + len(ceiling_faces))),
        "label": "ceiling",
    },
    "wall": {
        "id": list(range(len(floor_faces) + len(ceiling_faces), len(floor_faces) + len(ceiling_faces) + len(side_faces))),
        "label": "wall",
    }
}

mesh.metadata = metadata

mesh.export("test_mesh.glb")
with open("test_mesh_metadata.json", "w") as json_file:
    json.dump(mesh.metadata, json_file)
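One caveat I noticed while preparing this: since trimesh triangulates the quad faces at construction time, the exported GLB contains triangles, and the quad indices stored in the JSON no longer map one-to-one onto the faces the simulator sees. A quick round-trip check:

import trimesh

# Round-trip check: the 6 quads defined above become 12 triangles, so the
# quad indices stored in test_mesh_metadata.json do not map one-to-one onto
# the faces of the exported mesh.
reloaded = trimesh.load("test_mesh.glb", force="mesh")
print("faces in exported mesh:", len(reloaded.faces))  # expected: 12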
## code for RIR simulation ##
import habitat_sim.sim
import matplotlib.pyplot as plt
import numpy as np

backend_cfg = habitat_sim.SimulatorConfiguration()
backend_cfg.scene_id = 'test_mesh.glb'
backend_cfg.scene_dataset_config_file = 'test_mesh_metadata.json'
backend_cfg.load_semantic_mesh = True
backend_cfg.enable_physics = False

agent_cfg = habitat_sim.agent.AgentConfiguration()
cfg = habitat_sim.Configuration(backend_cfg, [agent_cfg])
sim = habitat_sim.Simulator(cfg)

audio_sensor_spec = habitat_sim.AudioSensorSpec()
audio_sensor_spec.uuid = "audio_sensor"
audio_sensor_spec.enableMaterials = False  # semantic materials disabled here; when True, make sure the _semantic.ply file is in the scene folder
audio_sensor_spec.channelLayout.type = habitat_sim.sensor.RLRAudioPropagationChannelLayoutType.Mono
audio_sensor_spec.channelLayout.channelCount = 1
audio_sensor_spec.acousticsConfig.sampleRate = 8000
audio_sensor_spec.acousticsConfig.direct = False  # in our source/mic configuration, we do not need the direct sound
audio_sensor_spec.acousticsConfig.indirect = True
audio_sensor_spec.acousticsConfig.indirectRayCount = 5000
audio_sensor_spec.acousticsConfig.diffraction = True
# audio_sensor_spec.acousticsConfig.transmission = False
audio_sensor_spec.acousticsConfig.meshSimplification = True
audio_sensor_spec.acousticsConfig.globalVolume = 1.
audio_sensor_spec.acousticsConfig.threadCount = 8
audio_sensor_spec.position = [0., 0., 0.]  # no sensor offset on the agent is needed for this setup

sim.add_sensor(audio_sensor_spec)
audio_sensor = sim.get_agent(0)._sensors["audio_sensor"]
audio_sensor.setAudioSourceTransform([0., 0., 0.])  # source positioned at the origin
audio_sensor.setAudioMaterialsJSON("utils/mp3d_material_config.json")  # absorption coefficients of the materials labeled floor, ceiling, and wall are set to 0.99

# Circular microphone array: 6 mics on a ring of 10 cm radius around the source.
n_mics, radius = 6, 0.1
angles = 2.0 * np.pi * np.arange(1, n_mics + 1) / n_mics
mic_pos = np.stack(
    [radius * np.cos(angles), radius * np.sin(angles), np.zeros(n_mics)],
    axis=1,
)
rir_len = 1024  # number of RIR samples to keep
obs = np.zeros((mic_pos.shape[0], rir_len))  # [6 mics, rir_len]

agent = sim.get_agent(0)
agent_state = agent.get_state()

# Render one RIR per microphone position by moving the agent (the listener).
for now_mic in range(mic_pos.shape[0]):
    agent_state.position = mic_pos[now_mic, :]
    agent.set_state(agent_state)
    obs[now_mic] = np.array(sim.get_sensor_observations()["audio_sensor"]).squeeze()[:rir_len]

for i in range(mic_pos.shape[0]):
    plt.subplot(mic_pos.shape[0], 1, i+1)
    plt.plot(obs[i, :])
plt.savefig('test_rir.png', dpi=300)
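To quantify whether the 0.99 absorption actually shortens the decay, I also estimate the reverberation time from each simulated RIR via Schroeder backward integration (a minimal sketch using only numpy; estimate_rt60 is my own helper, not part of habitat-sim):

def estimate_rt60(rir, fs=8000):
    # Schroeder backward integration of the squared impulse response,
    # then a linear fit of the -5 dB to -25 dB decay segment extrapolated
    # to -60 dB (T20 method).
    energy = np.cumsum(rir[::-1] ** 2)[::-1]
    edc_db = 10.0 * np.log10(energy / (energy[0] + 1e-12) + 1e-12)
    t = np.arange(len(rir)) / fs
    i0 = int(np.argmax(edc_db <= -5.0))
    i1 = int(np.argmax(edc_db <= -25.0))
    if i1 - i0 < 2:  # decay range not reached within this RIR length
        return float("nan")
    slope, _ = np.polyfit(t[i0:i1], edc_db[i0:i1], 1)
    return -60.0 / slope

for i in range(obs.shape[0]):
    print(f"mic {i}: RT60 ~ {estimate_rt60(obs[i]):.3f} s")

With absorption near 0.99, the estimate should drop sharply compared to the default materials.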
ChanganVR commented 1 year ago

Hi @iyeon915, sorry for the late reply. To debug this, I think the first step is to make sure that the installation is correct and that you can modify the acoustic attributes for the existing supported datasets. As mentioned in the other issue (we're still investigating why the materials do not work for HM3D), material customization should work for MP3D and Gibson; if you change the coefficients for those environments, you should see the behavior change accordingly.
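For instance, a crude way to see the effect (a sketch; tail_energy_db is a hypothetical helper, and rir stands for the array returned by the audio sensor) is to compare the late-tail energy of one RIR rendered with the stock material config against one rendered with an edited copy passed to setAudioMaterialsJSON:

import numpy as np

def tail_energy_db(rir, fs, start_s=0.05):
    # Hypothetical helper: energy of the RIR after start_s seconds.
    # Raising the absorption coefficients should make this value drop
    # noticeably relative to the default material config.
    tail = np.asarray(rir)[int(start_s * fs):]
    return 10.0 * np.log10(np.sum(tail ** 2) + 1e-12)

A clearly lower tail energy under higher absorption indicates the materials are being applied.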

Once that works, you can modify your mesh and semantic file to closely match that format so that they can be loaded and used by SS 2.0.
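Concretely, the MP3D scenes keep those companion files next to the scene's GLB, so a first check for a custom scene is whether the same layout is in place (a sketch; the path below is illustrative of the habitat-style data layout):

from pathlib import Path

# Illustrative habitat-style path; adjust to wherever the MP3D data lives.
scene = Path("data/scene_datasets/mp3d/17DRP5sb8fy/17DRP5sb8fy.glb")
for suffix in ("_semantic.ply", ".house", ".navmesh"):
    companion = scene.with_name(scene.stem + suffix)
    print(companion.name, "exists:", companion.exists())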