facebookresearch / sound-spaces

A first-of-its-kind acoustic simulation platform for audio-visual embodied AI research. It supports training and evaluating multiple tasks and applications.
https://soundspaces.org
Creative Commons Attribution 4.0 International

Missing Binaural RIRs #104

Closed sreeharshaparuchur1 closed 1 year ago

sreeharshaparuchur1 commented 1 year ago

Hi @ChanganVR,

Is the binaural RIR data at this link only a subset of the complete dataset (ref. issue #70), or were you referring to something else in that issue?

While the number of idx1_idx2.wav files, $M$, is the same for all orientations in each scene, I have observed that $N^{2} \neq M$. In fact, in most cases $N^{2} > M$. I have found these values to be equal for only 8 scenes.

To investigate further, I plotted the 'regions' in which the number of RIR files idx_*.wav for each index idx is equal, and colour-coded them. You can see this segregation in the images below:

image

As visible, the above scene has 4 'regions'. The scene below has 10 such regions:

image

An instance of a scene where $N^{2} == M$ is:

image

To uncover the number of scenes with such 'missing data', I plotted the number of scenes with x regions versus x regions:

image

Clearly, there are a lot of scenes with missing data. I therefore request that you either upload the entire dataset or let me know why this design choice was made, and how AV_WAN, AV_NAV and SAVI could be evaluated with this incomplete dataset. (I was able to run these baseline models and achieve accuracies similar to those reported in the respective papers.)

Thank you for your help.

sreeharshaparuchur1 commented 1 year ago

@ChanganVR, kindly reply to this query as soon as possible.

TIA

ChanganVR commented 1 year ago

Hi @sreeharshaparuchur1, sorry for missing this issue.

Yes, the binaural data at that link is partial. A lot of the binaural data is not usable, for example at points that fall on furniture or inside a wall. I therefore pruned the graph to small subgraphs in which all points are connected and navigable, and preserved all binaural files for the navigable points inside each subgraph. Without pruning, the data size is about 7-8 TB. With pruning, it drops to hundreds of GBs, which is more reasonable for people to use.

^ This also answers your second question. When we uniformly sample points on a grid, some of them will inevitably fall in non-navigable regions.
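The pruning idea described above can be illustrated with a minimal sketch. This is not the actual SoundSpaces preprocessing code. The toy grid, the `'.'`/`'#'` encoding, and the function name `connected_subgraphs` are all hypothetical, and 4-connectivity between neighbouring grid points is an assumption.

```python
from collections import deque


def connected_subgraphs(navigable):
    """Group navigable grid cells into 4-connected components.

    `navigable` is a list of equal-length strings: '.' marks a navigable
    cell and '#' a blocked one (wall, furniture). Toy stand-in only.
    """
    rows, cols = len(navigable), len(navigable[0])
    seen = set()
    components = []
    for r in range(rows):
        for c in range(cols):
            if navigable[r][c] != "." or (r, c) in seen:
                continue
            # Breadth-first flood fill from this unvisited navigable cell.
            queue = deque([(r, c)])
            seen.add((r, c))
            component = []
            while queue:
                cr, cc = queue.popleft()
                component.append((cr, cc))
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and navigable[nr][nc] == "."
                            and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            components.append(component)
    return components


# Hypothetical 3x5 scene: a wall column splits the floor into two rooms.
toy_scene = [
    "..#..",
    "..#..",
    "..#.#",
]
subgraphs = connected_subgraphs(toy_scene)
```

In this toy scene the uniformly sampled grid has 15 points, but only the 11 navigable ones survive, split across two subgraphs that cannot reach each other.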

Let me know if you have more questions.

sreeharshaparuchur1 commented 1 year ago

@ChanganVR,

You previously told me (in issue #11) that the 'uniformly sampled' points are in points.txt and that, after pruning, you are left with the points in graph.pkl, all of which should be navigable. The figures above plot the nodes in graph.pkl, yet there are points in non-navigable regions. I am confused as to why this is the case.

Additionally, the point pair that I am querying is 'valid' according to graph.pkl.

The scene in this example is pa4otMbVnkk

The Binaural RIR being queried is /270/277_220.wav

Here the two indices, each marked by a blue cross (X), are overlaid on the grid of points given in graph.pkl (the black dots); index 2 is the lower of the two points. As they lie in the grey region, I would assume that they are navigable points. It is therefore surprising that the RIR file for this pair (and for similar pairs) is missing.

image

Furthermore, if either of these points were non-navigable, then searching for RIRs with a naming convention along the lines of 'idx_*.wav' should return an empty set, but it does not.
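The 'idx_*.wav' lookup mentioned here could be sketched as follows. The helper name `partners_of` is hypothetical, and the one-folder-per-orientation layout is only inferred from the path /270/277_220.wav quoted above. Which of the two indices is the source and which the receiver is not stated in this thread, so the sketch simply returns the second index of each match.

```python
from pathlib import Path


def partners_of(rir_dir, idx):
    """Return the sorted second indices of all `<idx>_<other>.wav` files in rir_dir."""
    partners = []
    for wav in Path(rir_dir).glob(f"{idx}_*.wav"):
        # Split "277_220" into "277" and "220"; skip anything unexpected.
        first, _, second = wav.stem.partition("_")
        if first == str(idx) and second.isdigit():
            partners.append(int(second))
    return sorted(partners)
```

An empty list would mean no RIR file exists for that index in the given orientation folder; a non-empty list that lacks a particular partner index means only that specific pair is missing.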

ChanganVR commented 1 year ago

Sorry, I might not have been very clear earlier. An RIR is preserved if there is a path between the two points, and you can check whether a path exists with nx.shortest_path(graph, source=index_a, target=index_b).

Further, you can print out all the subgraphs with list(nx.connected_components(graph)). Pairwise RIRs are preserved within each subgraph; pairs that span two subgraphs are pruned.
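A minimal sketch of these two checks on a toy graph, not on a real graph.pkl. The toy edges and the helper name `rir_expected` are hypothetical. It uses `nx.has_path` as a boolean check, because `nx.shortest_path` raises `NetworkXNoPath` when the two nodes are disconnected. The file count assumes one file per ordered pair within a subgraph, self-pairs included; that is an assumption, chosen because it would match the $N^{2} = M$ case reported earlier in this thread for scenes with a single subgraph.

```python
import networkx as nx

# Toy stand-in for a scene's graph.pkl: two disconnected subgraphs.
graph = nx.Graph()
graph.add_edges_from([(0, 1), (1, 2)])  # subgraph A: nodes 0, 1, 2
graph.add_edges_from([(3, 4)])          # subgraph B: nodes 3, 4


def rir_expected(graph, index_a, index_b):
    """True if a path connects the two nodes, i.e. the RIR should be preserved."""
    return nx.has_path(graph, index_a, index_b)


components = list(nx.connected_components(graph))

# Assumption: one file per ordered pair within a subgraph, self-pairs included.
expected_files = sum(len(c) ** 2 for c in components)
all_pairs = graph.number_of_nodes() ** 2
```

Here `expected_files` is 3² + 2² = 13 while `all_pairs` is 5² = 25, which is the kind of gap between $M$ and $N^{2}$ that appears whenever a scene has more than one subgraph.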

sreeharshaparuchur1 commented 1 year ago

Thanks for your answer. I spent some time running tests on the dataset to make sure I fully understand what you mean, and my observations are summarized below:

Kindly confirm the same when possible

ChanganVR commented 1 year ago

@sreeharshaparuchur1 yes! they are correct!

sreeharshaparuchur1 commented 1 year ago

Thanks a lot for your help and for patiently answering my queries :)