Closed by guolingbing 4 years ago
Hi, I have a question about the function load_data in Runner.
Specifically, in lines 50-76:
sr2o appears to contain the test and validation data as well as the training data.
    self.data = ddict(list)
    sr2o = ddict(set)

    for split in ['train', 'test', 'valid']:
        for line in open('./data/{}/{}.txt'.format(self.p.dataset, split)):
            sub, rel, obj = map(str.lower, line.strip().split('\t'))
            sub, rel, obj = self.ent2id[sub], self.rel2id[rel], self.ent2id[obj]
            self.data[split].append((sub, rel, obj))

            if split == 'train':
                sr2o[(sub, rel)].add(obj)
                sr2o[(obj, rel+self.p.num_rel)].add(sub)

    self.data = dict(self.data)

    self.sr2o = {k: list(v) for k, v in sr2o.items()}
    for split in ['test', 'valid']:
        for sub, rel, obj in self.data[split]:
            sr2o[(sub, rel)].add(obj)
            sr2o[(obj, rel+self.p.num_rel)].add(sub)
Then, you generate the label based on sr2o.
    self.sr2o_all = {k: list(v) for k, v in sr2o.items()}
    self.triples = ddict(list)

    for (sub, rel), obj in self.sr2o.items():
        self.triples['train'].append({'triple':(sub, rel, -1), 'label': self.sr2o[(sub, rel)], 'sub_samp': 1})
You then use self.triples['train'] to build data_iter:
    self.data_iter = {
        'train': get_data_loader(TrainDataset, 'train', self.p.batch_size),
        'valid_head': get_data_loader(TestDataset, 'valid_head', self.p.batch_size),
        'valid_tail': get_data_loader(TestDataset, 'valid_tail', self.p.batch_size),
        'test_head': get_data_loader(TestDataset, 'test_head', self.p.batch_size),
        'test_tail': get_data_loader(TestDataset, 'test_tail', self.p.batch_size),
    }
and finally train the model.
    train_iter = iter(self.data_iter['train'])

    for step, batch in enumerate(train_iter):
        self.optimizer.zero_grad()
        sub, rel, obj, label = self.read_batch(batch, 'train')
Did I misunderstand something?
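Sorry, I overlooked this line. It copies sr2o into self.sr2o before the test and validation triples are added, so the training labels are built from the training data only:

    self.sr2o = {k: list(v) for k, v in sr2o.items()}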
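For anyone who trips over the same thing, here is a minimal standalone sketch of why the order matters. It is not the repository's code: the toy triples, `num_rel = 1`, and the name `sr2o_train` are made up for illustration, and the files are replaced by an in-memory dict. The dict comprehension copies every set into a new list, so later additions to `sr2o` do not show up in the earlier snapshot.

```python
from collections import defaultdict as ddict

# Hypothetical toy triples (sub, rel, obj) standing in for the dataset files.
data = {
    'train': [(0, 0, 1), (0, 0, 2)],
    'valid': [(0, 0, 3)],
    'test':  [(4, 0, 1)],
}
num_rel = 1  # assumed relation count; inverse relations use rel + num_rel

sr2o = ddict(set)
for sub, rel, obj in data['train']:
    sr2o[(sub, rel)].add(obj)
    sr2o[(obj, rel + num_rel)].add(sub)

# Snapshot taken here: each set is copied into a new list,
# so this dict only ever sees training triples.
sr2o_train = {k: list(v) for k, v in sr2o.items()}

# Test and validation triples are added to sr2o afterwards.
for split in ['test', 'valid']:
    for sub, rel, obj in data[split]:
        sr2o[(sub, rel)].add(obj)
        sr2o[(obj, rel + num_rel)].add(sub)

# Second snapshot: now includes all three splits.
sr2o_all = {k: list(v) for k, v in sr2o.items()}

print(sorted(sr2o_train[(0, 0)]))  # [1, 2]
print(sorted(sr2o_all[(0, 0)]))    # [1, 2, 3]
print((4, 0) in sr2o_train)        # False
```

In this sketch, `sr2o_train` plays the role of self.sr2o (used for the training labels) and `sr2o_all` plays the role of self.sr2o_all.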