Missing Binaural RIRs - Githubissues

sreeharshaparuchur1 commented 1 year ago

Hi @ChanganVR,

Is the Binaural RIR data present at this link only a subset of the complete dataset (ref. Issue #70), or are you referring to something else in that issue?

The SoundSpaces 1.0 paper says that $N^{2}$ RIRs are computed for a grid of $N$ navigable points. In Issue #11, you said that the graph.pkl file contains all of the navigable points that the agent can reach, so I have taken len(graph.nodes()) to be $N$.

While the number of idx1_idx2.wav files, $M$, is the same for every orientation in each scene, I have observed that $N^{2} \neq M$. In fact, in most cases $N^{2} > M$. Only for 8 scenes have I found these values to be equal.
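The comparison between $N^{2}$ and $M$ can be sketched roughly as follows. This is only a minimal illustration: the helper names `list_rir_pairs` and `missing_pair_count` are hypothetical, and the file naming `<first_index>_<second_index>.wav` is assumed from names such as 277_220.wav.

```python
import re
from pathlib import Path

# Assumed naming: "<first_index>_<second_index>.wav", e.g. "277_220.wav".
PAIR_PATTERN = re.compile(r"^(\d+)_(\d+)\.wav$")


def list_rir_pairs(orientation_dir):
    """Return the set of (first, second) index pairs that have a .wav file."""
    pairs = set()
    for path in Path(orientation_dir).iterdir():
        match = PAIR_PATTERN.match(path.name)
        if match:  # skip anything that is not an index-pair .wav file
            pairs.add((int(match.group(1)), int(match.group(2))))
    return pairs


def missing_pair_count(num_nodes, pairs):
    """Number of the num_nodes**2 possible pairs that have no file on disk."""
    return num_nodes ** 2 - len(pairs)
```

Here `num_nodes` would be len(graph.nodes()) for the scene and `orientation_dir` one orientation folder (for example the 270 folder of a scene). A result of 0 corresponds to $N^{2} = M$.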

To investigate further, for each index idx I counted the RIR files matching idx_*.wav, grouped the indices with equal counts into 'regions', and colour-coded them. You can see this segregation in the images below:
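A rough sketch of this grouping, assuming the index pairs have already been parsed from the file names. The function name `group_by_rir_count` and the toy pairs are hypothetical and not taken from the dataset.

```python
from collections import Counter, defaultdict


def group_by_rir_count(pairs):
    """Group first indices by how many RIR files start with that index."""
    files_per_index = Counter(first for first, _ in pairs)
    regions = defaultdict(list)
    for index, count in files_per_index.items():
        regions[count].append(index)
    return {count: sorted(indices) for count, indices in regions.items()}


# Toy data: indices 0 and 1 each have two files, index 2 has one.
toy_pairs = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
toy_regions = group_by_rir_count(toy_pairs)
```

One caveat of grouping by count alone: two separate regions that happen to contain the same number of points would be merged into one group.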

As visible, the above scene has 4 'regions'. The scene below has 10 such regions:

An instance of a scene where $N^{2} == M$ is:

To find out how many scenes have such 'missing data', I plotted the number of scenes against the number of regions per scene:

Clearly, there are a lot of scenes with missing data. I therefore request that you upload the entire dataset, or let me know why this design choice was made and how AV_WAN, AV_NAV and SAVI could be evaluated with this incomplete dataset. (I was able to run these baseline models and achieve accuracies similar to those reported in the respective papers.)

Secondly, in the above diagrams, where I have overlaid the points from graph.pkl onto the navmesh, some points lie in non-navigable regions. Why is this the case? I have tried varying the height at which I slice the navmesh, but the results are inconsistent.

Thank you for your help.

sreeharshaparuchur1 commented 1 year ago

@ChanganVR , kindly reply to this query as soon as possible.

TIA

ChanganVR commented 1 year ago

Hi @sreeharshaparuchur1 sorry for missing this issue.

Yes, the binaural data at the link is partial. This is because a lot of the binaural data is not usable, for example points on furniture or inside walls. I therefore pruned the graph to small subgraphs in which all points are connected and navigable, and preserved all binaural files for the navigable points inside each subgraph. Without the pruning, the data size is about 7-8 TB. With pruning, it drops to hundreds of GBs, which is more reasonable for people to use.

^ also answers your second question. When we uniformly sample the points on a grid, some points will inevitably fall in non-navigable regions.

Let me know if you have more questions.

sreeharshaparuchur1 commented 1 year ago

@ChanganVR ,

You previously told me (in Issue #11) that the 'uniformly sampled' points are in points.txt and that, after pruning, you are left with the points in graph.pkl, all of which are supposed to be navigable. The figures above plot the nodes in graph.pkl, yet there are points in non-navigable regions. I am confused as to why this is the case.

Additionally, the point pair that I am querying consists of 'valid' points, as indicated by graph.pkl.

The scene in this example is pa4otMbVnkk

The Binaural RIR being queried is /270/277_220.wav

Here the two indices, each marked by a blue cross (X), are overlaid on the grid of points given in graph.pkl (the black dots); index 2 is the lower of the two points. As they lie in the grey region, I would assume that they are navigable points. Thus, it is surprising that the RIR file for this pair (and similar pairs) is missing.

Furthermore, if either of these points were non-navigable, then searching for RIRs with a naming convention along the lines of 'idx_*.wav' should return an empty set, but it does not.

ChanganVR commented 1 year ago

Sorry, I might not have been very clear earlier. An RIR is preserved if there is a path between the two points, and you can check whether there is a path with nx.shortest_path(graph, source=index_a, target=index_b).

Further, you can print out all the subgraphs with list(nx.connected_components(graph)). Within each subgraph, pairwise RIRs are preserved, otherwise, they are pruned.
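A minimal sketch of that check on a toy graph, assuming graph.pkl unpickles to an undirected networkx graph (which is what nx.connected_components requires). The helper name `rir_should_exist` and the toy edges are hypothetical. nx.has_path is used instead of nx.shortest_path because it returns a boolean, whereas nx.shortest_path raises nx.NetworkXNoPath when the points are disconnected.

```python
import networkx as nx


def rir_should_exist(graph, index_a, index_b):
    """True if both indices are nodes of the graph and a path joins them."""
    if index_a not in graph or index_b not in graph:
        return False  # unknown index: not part of the pruned graph
    return nx.has_path(graph, index_a, index_b)


# Toy stand-in for a scene graph; a real one would come from something like
# pickle.load(open("graph.pkl", "rb")) (path and format assumed here).
toy_graph = nx.Graph()
toy_graph.add_edges_from([(0, 1), (1, 2), (3, 4)])

# Each connected component is one subgraph whose pairwise RIRs are kept.
toy_components = sorted(sorted(c) for c in nx.connected_components(toy_graph))
```

In the toy graph, the pair (0, 2) lies in one component, while (0, 3) spans two components and would therefore be pruned.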

sreeharshaparuchur1 commented 1 year ago

Thanks for your answer. I spent some time running tests on the dataset to make sure I fully understand what you mean. My observations are summarized below:

There are 'connected components' within a single floor level. If such a component is a set of $N$ points, $N^{2}$ RIRs are preserved for the points in this set.
A 'connected component' set of $N$ points is fully connected, i.e. any point in the set is reachable from any other point in the set, and from no point outside it.
The reason for dividing a scene into these 'regions' is that not every point in a scene is reachable from every other point. For example, a doorway in MP3D may be 80 cm wide while the RIRs are sampled 1 meter apart, so an agent may be confined to the room it was spawned in. Preserving RIRs between two disjoint 'connected components' would therefore be redundant.
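As a rough worked check of these observations, suppose the nodes of graph.pkl split into $K$ connected components of sizes $N_1, \dots, N_K$ (symbols introduced here for illustration), each preserving all of its $N_k^{2}$ pairs including self-pairs. The file count per orientation would then be:

$$M = \sum_{k=1}^{K} N_k^{2} \;\le\; \Big(\sum_{k=1}^{K} N_k\Big)^{2} = N^{2}$$

with equality only when $K = 1$. For instance, two components of sizes 3 and 2 give $N = 5$ and $N^{2} = 25$, but $M = 3^{2} + 2^{2} = 13$. This would match the earlier finding that $N^{2} > M$ in most scenes.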

Kindly confirm the same when possible

ChanganVR commented 1 year ago

@sreeharshaparuchur1 yes! they are correct!

sreeharshaparuchur1 commented 1 year ago

Thanks a lot for your help and patiently answering my queries :)

facebookresearch / sound-spaces

Missing Binaural RIRs #104