Btw: my assumption is that there is already a trained and stored StAR model from a previous run, so I basically use `link_prediction.train` as the place to set things up. Now, I see that there is no `--init` flag, so I guess that
```python
# Resolve the config from the checkpoint directory (or an explicit config name).
config = config_class.from_pretrained(
    args.config_name if args.config_name else args.model_name_or_path,
)
# StAR-specific hyperparameters attached to the config.
config.distance_metric = args.distance_metric
config.hinge_loss_margin = args.hinge_loss_margin
config.pos_weight = args.pos_weight
config.loss_weight = args.loss_weight
config.cls_loss_weight = args.cls_loss_weight
tokenizer = tokenizer_class.from_pretrained(
    args.tokenizer_name if args.tokenizer_name else args.model_name_or_path,
    do_lower_case=args.do_lower_case,
)
# Load the model weights from args.model_name_or_path.
model = model_class.from_pretrained(args.model_name_or_path, config=config)
```
loads a stored model. Right?
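For reference, my understanding of the `from_pretrained` convention in HuggingFace `transformers` is that it accepts either a hub model ID or a local directory, so pointing `args.model_name_or_path` at the stored StAR output directory should pick up the saved checkpoint. A minimal sketch, with a hypothetical directory name:

```python
from transformers import RobertaConfig, RobertaModel, RobertaTokenizer

# Hypothetical directory written by a previous StAR run; from_pretrained
# reads config.json and the model weights stored there.
checkpoint_dir = "./result/WN18RR_roberta-large"

config = RobertaConfig.from_pretrained(checkpoint_dir)
tokenizer = RobertaTokenizer.from_pretrained(checkpoint_dir)
model = RobertaModel.from_pretrained(checkpoint_dir, config=config)
model.eval()  # inference mode: disables dropout
```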
best, MS
Thanks for your attention and sorry for the confusing code.
The way you load the model is right. As for the other problems, I'm confused as well, because the original code for this work is on a server that I can't connect to at the moment. I will check the code when I'm free; please correct me if you find the reason.
Hi, thanks for such a quick response. OK, but I can only go through the code in this repository anyway.
Can you please point me to the command (from the README) that generates those `test_head_full_scores.list` files for StAR models (even knowing which method is responsible would help)? I might be able to debug it, but as I wrote earlier, I could not find which part of the code generates these files.
best, Martin
Hi, I've been working with your codebase for a while, but one issue came up along the way that I haven't been able to overcome so far. For a start, my usage of StAR (ensemble and others) is a little different: I don't work with batches; instead, I need one single object (say, `EnsembleModel` or `StarModel`) that I query with a triple and that returns the probability of that particular triple, roughly as in the sketch below.
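To make that concrete, this is the (hypothetical, not-in-the-repo) interface I am after:

```python
# Hypothetical wrapper (not part of this repo): one object that scores a
# single triple instead of iterating over evaluation batches.
class StarModel:
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer

    def score(self, head: str, relation: str, tail: str) -> float:
        """Return the probability that the triple (head, relation, tail) holds."""
        raise NotImplementedError  # my wrapper around the StAR forward pass

# Intended usage, with the first WN18RR test triple:
# p = star.score("06845599", "_member_of_domain_usage", "03754979")
```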
So far I have managed to dig through the codebase to do the things I need, but when I started verifying my implementation against your results, I failed to obtain the same numbers. So, please, how can I replicate the values stored with the StAR model, e.g. in `WN18RR_roberta-large/test_head_full_scores.list`?
I gathered that I should load the model with something like the `from_pretrained` snippet quoted in my comment above. Right?
I've been following the method `get_ensemble_data.py:get_scores`, which is buggy (namely because of `_id2ent_list = dataset.id2entlist`) and, secondly, is not invoked anywhere in the codebase (I haven't found such an invocation :( and hope I am mistaken). Anyhow, when I use this method to get the result for the first test query (`['06845599', '_member_of_domain_usage', '03754979']`) in the WN18RR dataset, I get something like 0.017418645322322845, which is not the value stored in `WN18RR_roberta-large/test_head_full_scores.list`: loading that file, e.g. `l = torch.load("test_head_full_scores.list")`, and then reading off the value, e.g. `l[0][1][l[0][0]]`, yields something like 0.9994868 (see the sketch below).

Secondly, when I tried to implement the same method on my own, I ran into nondeterminism: running the exact same script produced different scores for the first test query mentioned above. I checked that the embeddings are the same in both places (they are; I'm using your embedding loading), but the output of the model, e.g. `model.classifier(_rep_src, _rep_tgt)`, differs on every run. Have you seen anything like this? For example, is there some dropout or something similar in the (Ro)BERTa model that should be set up prior to evaluation besides `model.eval()`?
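For concreteness, this is how I read the stored scores; the layout of each entry (ground-truth index plus a score tensor over all candidate entities) is my assumption, inferred from the indexing above:

```python
import torch

# Load the per-query score list shipped with the StAR checkpoint.
scores = torch.load("WN18RR_roberta-large/test_head_full_scores.list")

# Assumed layout: scores[i] == (index_of_ground_truth_entity, scores_over_candidates)
idx, candidate_scores = scores[0]  # first test query
print(candidate_scores[idx])       # ground-truth entity's score, ~0.9994868 here
```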
So, you see, I actually want to produce a file with StAR scores on my own (e.g. `WN18RR_roberta-large/test_head_full_scores.list`), but I am unable to do so (and deterministically). Please, can you point me to the place in the code where this happens, or tell me where I went wrong? I would appreciate it a lot :) Thanks.
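For completeness, this is roughly the evaluation setup I already use; the classifier call and tensor names are from `get_scores`, while the seeding and `no_grad` are my additions to rule out remaining sources of randomness:

```python
import random
import numpy as np
import torch

# Fix all seeds to rule out sampling nondeterminism.
random.seed(42)
np.random.seed(42)
torch.manual_seed(42)

# `model`, `_rep_src`, `_rep_tgt` as in get_ensemble_data.py:get_scores.
model.eval()              # should disable dropout in the (Ro)BERTa encoder
with torch.no_grad():     # pure inference, no autograd bookkeeping
    out = model.classifier(_rep_src, _rep_tgt)
```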
best, Martin