The `transitive_closure.py` script creates CSV files for both the noun and mammal subsets of WordNet. These are used to train the hyperbolic embeddings in `embed.py`.
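For reference, the closure files are plain edge lists with one hypernym pair per row, assuming the `id1,id2,weight` header that `load_edge_list` reads (the specific synsets below are only illustrative):

```
id1,id2,weight
kangaroo.n.01,marsupial.n.01,1
marsupial.n.01,mammal.n.01,1
kangaroo.n.01,mammal.n.01,1
```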
If we run the `reconstruction.py` script on a pretrained model, it fails, as the script currently only accepts HDF5 input:

https://github.com/facebookresearch/poincare-embeddings/blob/61406b1bb180234cd34d9972d8853de2fb1a14f8/reconstruction.py#L41
https://github.com/facebookresearch/poincare-embeddings/blob/61406b1bb180234cd34d9972d8853de2fb1a14f8/reconstruction.py#L42

```python
format = 'hdf5' if dset.endswith('.h5') else 'csv'
dset = load_adjacency_matrix(dset, 'hdf5')
```
The format `'hdf5'` is hard-coded on line 42, so the `format` detected on line 41 is never used and a CSV dataset is always parsed as HDF5.
In addition, the function being called, `load_adjacency_matrix`, expects an adjacency matrix as input, whereas `[noun,mammal]_closure.csv` are both edge lists.
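To make the mismatch concrete, here is the same toy three-node graph in both representations (illustrative only, not the repo's actual data):

```python
import numpy as np

# Adjacency matrix: an n x n array where entry [i, j] is nonzero iff
# there is an edge from i to j -- the shape load_adjacency_matrix expects.
adj_matrix = np.array([[0, 1, 1],
                       [0, 0, 1],
                       [0, 0, 0]])

# Edge list: one (id1, id2) pair per row -- the shape the closure CSVs have.
edge_list = [(0, 1), (0, 2), (1, 2)]
```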
To fix this, we borrowed code from `embed.py`:
```python
if dset.endswith('.h5'):
    ...  # existing code (mostly)
elif dset.endswith('.csv'):
    # Build an adjacency dict {node: set(neighbors)} from the edge list,
    # mirroring embed.py (needs load_edge_list imported from hype.graph).
    adj_temp = {}
    idx, _, _ = load_edge_list(dset, sym)
    for row in idx:
        x = row[0].item()
        y = row[1].item()
        if x in adj_temp:
            adj_temp[x].add(y)
        else:
            adj_temp[x] = {y}
    # Optionally evaluate on a random subsample of the nodes.
    sample_size = args.sample or len(adj_temp)
    sample = np.random.choice(list(adj_temp.keys()),
                              size=sample_size, replace=False)
    adj = {i: adj_temp[i] for i in sample}
else:
    ...  # do something
```
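With the patch in place, the evaluation can be run directly on a checkpoint trained from one of the CSV closures, along the lines of (the `-sample` flag is inferred from `args.sample` above; the checkpoint name is illustrative):

```
python reconstruction.py mammals.pth -sample 1000
```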
With this change, we get the same mean rank and MAP as calculated and printed in the training loop.
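For anyone comparing numbers: mean rank and MAP here are reconstruction metrics over the adjacency dict built above. A minimal sketch of how they are typically computed follows; it is not the repo's exact evaluation code (which, among other things, batches the distance computations):

```python
import numpy as np

def mean_rank_and_map(adj, embeddings, dist_fn):
    """Reconstruction metrics over an adjacency dict {node: set(neighbors)}.

    embeddings maps node id -> vector; dist_fn(u_vec, v_vec) -> float.
    A minimal sketch under those assumptions, not the repo's implementation.
    """
    ranks, ap_scores = [], []
    nodes = list(embeddings.keys())
    for u, neighbors in adj.items():
        candidates = [v for v in nodes if v != u]
        dists = np.array([dist_fn(embeddings[u], embeddings[v])
                          for v in candidates])
        order = np.argsort(dists)          # closest candidates first
        rank_of = {candidates[j]: pos + 1  # 1-based rank of each candidate
                   for pos, j in enumerate(order)}
        nbr_ranks = sorted(rank_of[v] for v in neighbors)
        ranks.extend(nbr_ranks)
        # Average precision: precision evaluated at each true neighbor's rank.
        ap_scores.append(np.mean([(i + 1) / r
                                  for i, r in enumerate(nbr_ranks)]))
    return np.mean(ranks), np.mean(ap_scores)
```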