thunlp / OpenKE

An Open-Source Package for Knowledge Embedding (KE)

sysmalloc Assertion failed #263

Closed BarryRun closed 4 years ago

BarryRun commented 4 years ago

Dear Xu: Sorry to bother you! When using OpenKE to train TransE, I got the following output:

Loading data...
Input Files Path : ./benchmarks/TRANSE/
The toolkit is importing datasets.
The total of relations is 1.
The total of entities is 5562.
The total of train triples is 5561.
python: malloc.c:2401: sysmalloc: Assertion (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0) failed.
Aborted (core dumped)

My code is shown below. It is a copy of the example with only the data path changed:

import openke
from openke.config import Trainer, Tester
from openke.module.model import TransE
from openke.module.loss import MarginLoss
from openke.module.strategy import NegativeSampling
from openke.data import TrainDataLoader, TestDataLoader

# dataloader for training
print('Loading data...')
train_dataloader = TrainDataLoader(
    in_path = "./benchmarks/TRANSE/", 
    nbatches = 100,
    threads = 8,
    sampling_mode = "normal",
    bern_flag = 1,
    filter_flag = 1,
    neg_ent = 25,
    neg_rel = 0)

# dataloader for test
# test_dataloader = TestDataLoader("./benchmarks/FB15K237/", "link")

print('Defining the model...')
# define the model
transe = TransE(
    ent_tot = train_dataloader.get_ent_tot(),
    rel_tot = train_dataloader.get_rel_tot(),
    dim = 200, 
    p_norm = 1, 
    norm_flag = True)

print('Defining the loss...')
# define the loss function
model = NegativeSampling(
    model = transe, 
    loss = MarginLoss(margin = 5.0),
    batch_size = train_dataloader.get_batch_size()
)

print('Start train!')
# train the model
trainer = Trainer(model = model, data_loader = train_dataloader, train_times = 1000, alpha = 1.0, use_gpu = True)
trainer.run()
transe.save_checkpoint('./transe.ckpt')

# test the model
# transe.load_checkpoint('./checkpoint/transe.ckpt')
# tester = Tester(model = transe, data_loader = test_dataloader, use_gpu = True)
# tester.run_link_prediction(type_constrain = False)

This error occurs in the function TrainDataLoader().

I am sure that my data is in the right format according to the tutorial. Notably, I have only one kind of relation here, which may be what causes this error. My train2id.txt, entity2id.txt, and relation2id.txt are attached so you can check them.

Hope to get your answer soon! @THUCSTHanxu13 Thanks a lot.

Yours, Rhys

BarryRun commented 4 years ago

I have resolved this issue by making all my entity IDs consecutive. (I found that several IDs between 0 and the max ID were missing.) I will close this issue, thanks again!
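
For anyone who hits the same assertion, here is a minimal sketch of the check and fix described above: find the IDs missing between 0 and the max ID, then renumber everything so the IDs are consecutive. The helper names (find_missing_ids, build_consecutive_map, remap_triples) are hypothetical and not part of OpenKE, and the toy data stands in for the IDs you would read from entity2id.txt and train2id.txt. It assumes triples are stored as (head, tail, relation).

```python
from typing import Dict, List, Tuple

Triple = Tuple[int, int, int]  # assumed order: (head, tail, relation)


def find_missing_ids(ids: List[int]) -> List[int]:
    """Return the IDs in 0..max(ids) that never appear in `ids`."""
    if not ids:
        return []
    present = set(ids)
    return [i for i in range(max(ids) + 1) if i not in present]


def build_consecutive_map(ids: List[int]) -> Dict[int, int]:
    """Map each distinct old ID to a new ID in 0..n-1, keeping the old order."""
    return {old: new for new, old in enumerate(sorted(set(ids)))}


def remap_triples(triples: List[Triple], id_map: Dict[int, int]) -> List[Triple]:
    """Rewrite head and tail entity IDs; relation IDs are left unchanged."""
    return [(id_map[h], id_map[t], r) for h, t, r in triples]


# Toy data: entity ID 2 is missing, so the IDs are not consecutive.
entity_ids = [0, 1, 3, 4]
triples = [(0, 1, 0), (3, 4, 0)]

missing = find_missing_ids(entity_ids)
id_map = build_consecutive_map(entity_ids)
fixed_triples = remap_triples(triples, id_map)

print("missing IDs:", missing)
print("remapped triples:", fixed_triples)
```

If the list of missing IDs is empty, the entity IDs are already consecutive and the problem lies elsewhere. Otherwise, the same id_map has to be applied to entity2id.txt and to every triple file so they stay consistent with each other.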