Status: Closed (sreeharshaparuchur1 closed this issue 1 year ago)
@ChanganVR , kindly reply to this query as soon as possible.
TIA
Hi @sreeharshaparuchur1 sorry for missing this issue.
Yes, the binaural data at the link is partial. A lot of the binaural data is not usable: for example, points that fall on furniture or inside walls. Therefore, I pruned the graph to small subgraphs in which all points are connected and navigable, and preserved all binaural files for the navigable points inside each subgraph. Without pruning, the data size is about 7-8 TB; with pruning, it drops to hundreds of GBs, which is more reasonable for people to use.
^ This also answers your second question. When we uniformly sample points on a grid, some points inevitably fall in non-navigable regions.
Let me know if you have more questions.
@ChanganVR ,
You previously told me (in issue #11) that the 'uniformly sampled' points are in points.txt and that, after pruning, you are left with the points present in graph.pkl, all of which should be navigable. The figures above plot the nodes in graph.pkl, yet there are points in non-navigable regions. I am confused as to why this is the case.
Additionally, the point pair that I am querying for are 'valid' points as indicated by graph.pkl.
The scene in this example is pa4otMbVnkk
The Binaural RIR being queried is /270/277_220.wav
Here, the two indices (index 2 is the lower of the two points), each marked by a blue cross (X), are overlaid on the grid of points given in graph.pkl (the black dots). As they lie in the grey region, I would assume that they are navigable points. Thus, it is surprising that the RIR file for this pair (and similar pairs) is missing.
Furthermore, if either of these points were non-navigable, then searching for the RIRs with a naming convention along the lines of 'idx_*.wav' should return an empty set, but it does not.
Sorry, I might not have been very clear earlier. An RIR is preserved if there is a path between the two points, and you can check whether there is a path with `nx.shortest_path(graph, source=index_a, target=index_b)`.
Further, you can print out all the subgraphs with `list(nx.connected_components(graph))`. Within each subgraph, pairwise RIRs are preserved; otherwise, they are pruned.
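The two checks mentioned above can be sketched on a toy graph. This is only an illustration: the node indices are made up, and it assumes the object stored in graph.pkl is an undirected networkx graph. Note that `nx.shortest_path` raises `nx.NetworkXNoPath` when the two nodes are not connected, so the sketch uses `nx.has_path`, which returns a boolean instead.

```python
import networkx as nx

# Toy stand-in for a scene graph loaded from graph.pkl.
# The node indices here are hypothetical, not taken from any real scene.
graph = nx.Graph()
graph.add_edges_from([(0, 1), (1, 2), (10, 11)])


def rir_expected(graph, index_a, index_b):
    """Return True if both nodes lie in the same connected component."""
    return nx.has_path(graph, index_a, index_b)


# Each connected component is one pruned subgraph.
components = sorted(sorted(c) for c in nx.connected_components(graph))

print(components)
print(rir_expected(graph, 0, 2))   # same component
print(rir_expected(graph, 0, 10))  # different components
```

Under this reading, a missing `a_b.wav` file is expected whenever `rir_expected(graph, a, b)` is False, even though both `a` and `b` are valid nodes of the graph.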
Thanks for your answer. I spent some time running tests on the dataset to make sure I fully understand what you mean and below are my observations summarized:
There are 'connected components' within a single floor level. Say a component is a set of $N$ points; there will be $N^{2}$ RIRs preserved for the points in this set.
The $N$ points of a 'connected component' are fully connected, i.e. any point in this set is reachable from any other point in this set, and from this set only.
The reason for dividing a scene into these 'regions' is that not every point is navigable from every other point in a scene. Say, in MP3D, the width of a doorway is 80 cm but the RIRs are sampled 1 meter apart; an agent may then be confined to the room it was spawned in, making it redundant to preserve RIRs between two disjoint 'connected components'.
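As a rough sanity check on the observations above, the expected number of files per orientation can be computed from the component sizes. This is a minimal sketch: `expected_rir_count` is a hypothetical helper, the components are made up, and it assumes every ordered pair within a component (including a point paired with itself) has a file.

```python
def expected_rir_count(components):
    """Sum N_i^2 over components, assuming all ordered pairs within a
    component (including a point with itself) have an RIR file."""
    return sum(len(c) ** 2 for c in components)


# Hypothetical components, e.g. from list(nx.connected_components(graph)).
components = [{0, 1, 2}, {10, 11}]
total_nodes = sum(len(c) for c in components)

print(expected_rir_count(components))  # 3**2 + 2**2 = 13
print(total_nodes ** 2)                # 5**2 = 25
```

With more than one component, the sum of per-component squares is always smaller than the square of the total node count, and the two agree only when the scene has a single component.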
Kindly confirm the same when possible
@sreeharshaparuchur1 yes! they are correct!
Thanks a lot for your help and patiently answering my queries :)
Hi @ChanganVR,
Is the Binaural RIR data present at this link only a subset of the complete dataset (ref. issue #70), or are you referring to something else in that issue?
I take `len(graph.nodes())` to be $N$. While the number of `idx1_idx2.wav` files, $M$, is equal for all orientations in each scene, I have observed that $N^{2} \neq M$. In fact, in most cases, $N^{2} > M$. Only for 8 scenes have I found these values to be equal.
To investigate further, I have plotted the 'regions' where the number of RIR files `idx_*.wav` for index `idx` is equal, and have colour coded them. You can see this segregation in the images below:
As visible, the above scene has 4 'regions'. The scene below has 10 such regions:
An instance of a scene where $N^{2} == M$ is:
To uncover the number of scenes with such 'missing data', I plotted the `number of scenes with x regions` versus `x regions`:
Clearly, there are a lot of scenes with missing data. Thus, I request that you upload the entire dataset, or let me know why this design choice was made and how evaluation of AV_WAN, AV_NAV and SAVI was possible with this incomplete dataset (I was able to run these baseline models and achieve accuracies similar to those mentioned in the respective papers).
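For reference, the per-index file counting used to colour the 'regions' can be sketched roughly as follows. The file listing is made up, `group_by_file_count` is a hypothetical helper, and the sketch assumes file names of the form `<idx1>_<idx2>.wav` inside one orientation directory.

```python
from collections import Counter, defaultdict


def group_by_file_count(filenames):
    """Group first indices by how many '<idx1>_<idx2>.wav' files they have."""
    counts = Counter(name.split("_")[0] for name in filenames)
    groups = defaultdict(list)
    for idx, n_files in counts.items():
        groups[n_files].append(int(idx))
    return {n_files: sorted(ids) for n_files, ids in groups.items()}


# Hypothetical listing of one orientation directory.
filenames = ["0_0.wav", "0_1.wav", "1_0.wav", "1_1.wav", "5_5.wav"]
print(group_by_file_count(filenames))
```

One caveat with this grouping: two separate regions that happen to contain the same number of points share the same file count, so they would be merged into one group and the number of regions could be undercounted.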
Thank you for your help.