microsoft / rat-sql

A relation-aware semantic parsing model from English to SQL
https://arxiv.org/abs/1911.04942
MIT License

Asking for pre-trained model #22

Open mellahysf opened 4 years ago

mellahysf commented 4 years ago

Hi,

Can someone share their best pre-trained model (the best model_checkpoint)?

Thank you all.

kalleknast commented 4 years ago

I have a bert-model trained on Spider that I can share.

mellahysf commented 4 years ago

Ok @kalleknast can you share it with me, please?

kalleknast commented 4 years ago

You can get it here. Please reply when you have downloaded it, so that I can remove it from Google Drive.

mellahysf commented 4 years ago

thank you @kalleknast very much

GhazalFallah commented 3 years ago

@mellahysf @kalleknast Hi! I am also looking for the pre-trained models, but I can't find them. It seems they have not been published, am I right? If you have a well-performing trained model, could you share it with me too? Especially the GloVe model on Spider. Thank you so much!

kalleknast commented 3 years ago

I don't have GloVe on Spider.

GhazalFallah commented 3 years ago

@kalleknast BERT is also OK if a CPU is enough for inference. Thank you.

jayetri commented 3 years ago

@kalleknast @mellahysf Can you please share the pre-trained model for BERT with me too?

kalleknast commented 3 years ago

See issue #32. Post a message when you've downloaded it (so that I can remove it, since it is taking up a lot of space on my Google Drive).

jayetri commented 3 years ago

I think that specific model was giving about 60% accuracy. Am I right, @kalleknast? Thank you so much for sharing it.

kalleknast commented 3 years ago

I think so. It's been some time since I checked the performance. Definitely not 65% or whatever the SOTA is.

PedroEstevesPT commented 3 years ago

Hi, I tried running the model but I am getting this error:

```
RuntimeError: Error(s) in loading state_dict for EncDecModel:
    size mismatch for decoder.rule_logits.2.weight: copying a param with shape torch.Size([94, 128]) from checkpoint, the shape in current model is torch.Size([97, 128]).
    size mismatch for decoder.rule_logits.2.bias: copying a param with shape torch.Size([94]) from checkpoint, the shape in current model is torch.Size([97]).
    size mismatch for decoder.rule_embedding.weight: copying a param with shape torch.Size([94, 128]) from checkpoint, the shape in current model is torch.Size([97, 128]).
```
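For reference, here is a minimal sketch for checking which shapes the checkpoint itself carries, assuming it is a torch.save()'d dict with the weights possibly nested under a "model" key (the path is illustrative):

```python
import torch

# Load only the checkpoint (no model construction needed) and list the
# decoder rule parameters, whose first dimension is the rule-vocabulary size.
ckpt = torch.load("logdir/model_checkpoint", map_location="cpu")
state = ckpt.get("model", ckpt)  # fall back to a bare state_dict
for name, tensor in state.items():
    if name.startswith("decoder.rule"):
        print(name, tuple(tensor.shape))
```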

Did you change the architecture, @kalleknast? Or do you have any idea how to solve it?

kalleknast commented 3 years ago

@Muradean The model was trained in October last year. I haven't trained and uploaded any other rat-sql model since then.

The error could be due to a mismatch of the decoder vocabulary. I'm guessing that the model was trained with a decoder vocabulary of size 94, but a vocab of size 97 is expected. It may be due to some change to the Spider dataset (i.e. the addition of three new tokens) that occurred after the model was trained.

I trained two models, one with and one without an expanded dataset; however, I think the model I uploaded was trained on the original (unexpanded) Spider dataset. If it weren't, more people would have reported the same issue as you. I think the only solution is to train a new model. However, I dropped rat-sql for another project, so I won't do it.

PedroEstevesPT commented 3 years ago

Thanks a lot for the reply and the clarification. I really appreciate it, even though there doesn't seem to be a quick fix.

If somebody could provide a new pre-trained model I would be very grateful.

PedroEstevesPT commented 3 years ago

Ok, after looking through the code I realized that the _fs parameter in rat-sql/configs/spider/nl2code-bert.jsonnet is responsible for picking the .asdl file in rat-sql/ratsql/grammars/, which can be one of:

- Spider.asdl
- Spider_f1.asdl
- Spider_f2.asdl

By default, when one clones the repo and runs the Spider-BERT model, the .asdl picked is Spider_f2.asdl, which (at the time of writing) has 97 rules; however, your model @kalleknast has 94 rules.

The number of rules generated from the .asdl (where I am having the mismatch) can be seen, when you run the Docker container, in /app/data/spider/nl2code,output_from=true,fs=2,emb=bert,cvlink/grammar_rules.json.
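A quick sketch for counting those rules, assuming the file stores them under an "all_rules" key (the key kalleknast references later in this thread):

```python
import json

# Count the grammar rules produced by preprocessing.
path = "/app/data/spider/nl2code,output_from=true,fs=2,emb=bert,cvlink/grammar_rules.json"
with open(path) as f:
    data = json.load(f)
print(len(data["all_rules"]))  # 97 here, vs. 94 expected by the shared checkpoint
```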

I tried the preprocessing step again, this time using Spider.asdl, but the resulting grammar_rules.json ends up having 103 rules (so it also gives me a mismatch error when performing inference).

Finally, I changed _fs to pick Spider_f1.asdl and repeated the preprocessing step, but the generated grammar had 0 rules. To work around that, I did a dirty quick fix: I renamed Spider_f1.asdl to Spider_f2.asdl and reset _fs to 2. However, the generated grammar then had 73 rules. None of these values (73, 97, 103) matches the 94 rules. Do you remember doing anything else when training on the unexpanded dataset?

Thanks

kalleknast commented 3 years ago

@Muradean The model was trained with local _fs = 2;:

```jsonnet
local _base = import 'nl2code-base.libsonnet';
local _output_from = true;
local _fs = 2;
```

However, I checked grammar_rules.json and it has 94 rules (len(data['all_rules'])). Spider_f2.asdl seems to be from July 11, 2020.

PedroEstevesPT commented 3 years ago

First @kalleknast,

Thanks a lot for the reply.

Then, there must be something happening in my preprocessing stage that causes grammar_rules.json to have 97 rules instead of 94.

Could you share your nl2code,output_from=true,fs=2,emb=bert,cvlink directory, please?

That will let me see which grammar rules differ, and whatever other problems might be going on.

Thanks again

kalleknast commented 3 years ago

@Muradean You can get the nl2code,output_from=true,fs=2,emb=bert,cvlink directory here.

I noticed that the actual model is not linked to in this thread. It is here in case it is still useful and someone wants it.

PedroEstevesPT commented 3 years ago

THANKS!

PedroEstevesPT commented 3 years ago

For anyone who encounters this problem in the future, these were the 3 extra rules I had: ['table_unit', 5], ['table_unit', 6], ['table_unit*', 7].
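A sketch of how such extras can be found by diffing two grammar_rules.json files (the filenames are illustrative, and the rules are assumed to be flat JSON lists like the ones above, so they can be turned into hashable tuples):

```python
import json

def rule_set(path):
    # Load a grammar_rules.json and turn each rule into a hashable tuple.
    with open(path) as f:
        return {tuple(rule) for rule in json.load(f)["all_rules"]}

local_rules = rule_set("grammar_rules_local.json")  # freshly preprocessed
ckpt_rules = rule_set("grammar_rules_ckpt.json")    # from the shared checkpoint
print("extra locally:", sorted(local_rules - ckpt_rules))
print("missing locally:", sorted(ckpt_rules - local_rules))
```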

Evaeva19 commented 2 years ago

> I have a bert-model trained on Spider that I can share.

Could you please share the BERT model trained on Spider again? The earlier link is out of date. Thanks!

kalleknast commented 2 years ago

It is here. See the post from Feb 3, 2021. Unless I'm lost and you're talking about some other model.