INK-USC / RE-Net

Recurrent Event Network: Autoregressive Structure Inference over Temporal Knowledge Graphs (EMNLP 2020)
http://inklab.usc.edu/renet/

KeyError in make_subgraph in utils.py #33

Closed nareto closed 3 years ago

nareto commented 4 years ago

I'm trying to run the code on my own data, which I generated from a Neo4j graph. During training, when model.evaluate_filter is called in train.py, I get this error:

2020-09-29T13:06:22.156137293Z Epoch 0010 | Loss 16.3989 | time 24.4101
2020-09-29T13:06:22.160739045Z Traceback (most recent call last):
2020-09-29T13:06:22.160764766Z   File "train.py", line 257, in <module>
2020-09-29T13:06:22.160772024Z     train(args)
2020-09-29T13:06:22.16077712Z   File "train.py", line 185, in train
2020-09-29T13:06:22.160782517Z     ranks, loss = model.evaluate_filter(batch_data, (s_hist, s_hist_t), (o_hist, o_hist_t), global_model, total_data)
2020-09-29T13:06:22.160788149Z   File "/output/re-net/model.py", line 387, in evaluate_filter
2020-09-29T13:06:22.160793458Z     loss, sub_pred, ob_pred = self.predict(triplet, s_hist, o_hist, global_model)
2020-09-29T13:06:22.16079868Z   File "/output/re-net/model.py", line 337, in predict
2020-09-29T13:06:22.160803723Z     inp, _ = self.aggregator.predict((s_history, s_history_t), s, r, self.ent_embeds, self.rel_embeds[:self.num_rels], self.graph_dict, self.global_emb, reverse=False)
2020-09-29T13:06:22.160809008Z   File "/output/re-net/Aggregator.py", line 223, in predict
2020-09-29T13:06:22.160814046Z     graph_dict, global_emb)
2020-09-29T13:06:22.160818952Z   File "/output/re-net/utils.py", line 277, in get_s_r_embed_rgcn
2020-09-29T13:06:22.160824419Z     g_list, g_id_dict = get_g_list_id(neighs_t, graph_dict)
2020-09-29T13:06:22.16082952Z   File "/output/re-net/utils.py", line 169, in get_g_list_id
2020-09-29T13:06:22.160834744Z     g_list.append(make_subgraph(graph_dict[tim], neighs_t[tim]))
2020-09-29T13:06:22.160839784Z   File "/output/re-net/utils.py", line 121, in make_subgraph
2020-09-29T13:06:22.16084499Z     relabeled_nodes.append(g.ids[node])
2020-09-29T13:06:22.16084988Z KeyError: 3685
2020-09-29T13:06:26.799168428Z 

What could be the cause of this? Is there any special assumption on how the input data should be formed?

I've put the train, test, valid and stat files I used in this gist.

woojeongjin commented 4 years ago

Hi, this might be because the train, valid, and test sets must not overlap in time, i.e., times(train) < times(valid) < times(test). If the error still happens, please leave a comment!
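The ordering condition above can be checked before training. The following is a minimal sketch, not part of the RE-Net code: it assumes each split file holds one whitespace-separated quadruple per line with an integer timestamp in the fourth column, and the function names `read_times` and `check_split_order` and the file names are hypothetical.

```python
from pathlib import Path


def read_times(path):
    """Return the set of timestamps found in a quadruple file.

    Assumption: one whitespace-separated quadruple per line,
    with the integer timestamp in the fourth column.
    """
    times = set()
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if len(fields) >= 4:
            times.add(int(fields[3]))
    return times


def check_split_order(train_times, valid_times, test_times):
    """Return a list of problems; an empty list means train < valid < test."""
    problems = []
    if train_times and valid_times and max(train_times) >= min(valid_times):
        problems.append("train is not strictly earlier than valid")
    if valid_times and test_times and max(valid_times) >= min(test_times):
        problems.append("valid is not strictly earlier than test")
    if train_times and test_times and max(train_times) >= min(test_times):
        problems.append("train is not strictly earlier than test")
    return problems


if __name__ == "__main__":
    # Hypothetical file names; adjust to your dataset directory.
    splits = [Path(name) for name in ("train.txt", "valid.txt", "test.txt")]
    if all(p.exists() for p in splits):
        issues = check_split_order(*(read_times(p) for p in splits))
        print("\n".join(issues) if issues else "ok: times(train) < times(valid) < times(test)")
```

An empty result only confirms the time ordering; it does not rule out other causes of the KeyError.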

iLampard commented 2 years ago

Hi nareto, have you fixed the problem? I ran a test on a dataset with no overlapping times, but I still get the same KeyError in make_subgraph.

hummingg commented 8 months ago

I ran a test on ICEWS14 and got the same KeyError.

dnanad commented 8 months ago

@iLampard We solved this issue, as suggested by @woojeongjin, by making sure there are no overlapping timestamps.
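One way to get splits without overlapping timestamps is to divide the data by distinct timestamp rather than by row. This is a hedged sketch of that idea, not the procedure used by anyone in this thread: `split_by_time` and its fraction parameters are hypothetical, and quadruples are assumed to be `(subject, relation, object, time)` tuples.

```python
def split_by_time(quads, train_frac=0.8, valid_frac=0.1):
    """Split quadruples so that times(train) < times(valid) < times(test).

    The fractions apply to the number of distinct timestamps, not to the
    number of quadruples, so every timestamp lands in exactly one split.
    """
    times = sorted({t for _, _, _, t in quads})
    n_train = int(len(times) * train_frac)
    n_valid = int(len(times) * valid_frac)

    train_t = set(times[:n_train])
    valid_t = set(times[n_train:n_train + n_valid])

    train = [q for q in quads if q[3] in train_t]
    valid = [q for q in quads if q[3] in valid_t]
    # Everything later than the validation window goes to test.
    test = [q for q in quads if q[3] not in train_t and q[3] not in valid_t]
    return train, valid, test
```

Because the cut points fall between timestamps, all events sharing a timestamp stay in the same split.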

dnanad commented 8 months ago

@hummingg That's interesting! I had no trouble with ICEWS14, but it's been a while since I last checked. I will try to rerun it and see if I get the same error.