TheurgicDuke771 commented 3 years ago

Hi,

Thanks for open-sourcing the project. I was trying this on a non-GPU Windows 10 machine (conda environment, Python 3.7.9, PyTorch 1.5). I was able to run the dataset preprocessing step, but got the error below while running inference:

(envs) Lenovo-PC MINGW64 /d/NLP/NL_to_SQL/gap-text2sql/rat-sql-gap (main)
$ python run.py eval experiments/spider-configs/gap-run.jsonnet
WARNING <class 'seq2struct.models.enc_dec.EncDecModel.Preproc'>: superfluous {'name': 'EncDec'}
WARNING <class 'seq2struct.models.enc_dec.EncDecModel'>: superfluous {'decoder_preproc': {'grammar': {'clause_order': None, 'end_with_from': True, 'factorize_sketch': 2, 'include_literals': False, 'infer_from_conditions': True, 'name': 'spider', 'output_from': True, 'use_table_pointer': True}, 'save_path': 'data/spider-bart/nl2code-1115,output_from=true,fs=2,emb=bart,cvlink', 'use_seq_elem_rules': True}, 'encoder_preproc': {'bart_version': 'facebook/bart-large', 'compute_cv_link': True, 'compute_sc_link': True, 'db_path': 'data/spider-bart/database', 'fix_issue_16_primary_keys': True, 'include_table_name_in_column': False, 'save_path': 'data/spider-bart/nl2code-1115,output_from=true,fs=2,emb=bart,cvlink'}}
Parameter containing:
tensor([[-0.0370, 0.1117, 0.1829, ..., 0.2054, 0.0578, -0.0750],
        [ 0.0055, -0.0049, -0.0069, ..., -0.0030, 0.0038, 0.0087],
        [-0.0448, 0.4604, -0.0604, ..., 0.1073, 0.0310, 0.0477],
        ...,
        [-0.0138, 0.0278, -0.0467, ..., 0.0455, -0.0265, 0.0125],
        [-0.0043, 0.0153, -0.0567, ..., 0.0496, 0.0108, -0.0099],
        [ 0.0053, 0.0324, -0.0179, ..., -0.0085, 0.0223, -0.0020]], requires_grad=True)
Updated the model with ./pretrained_checkpoint\pytorch_model.bin
Parameter containing:
tensor([[-0.0383, 0.1205, 0.1776, ..., 0.1973, 0.0594, -0.0699],
        [ 0.0046, -0.0023, -0.0084, ..., -0.0036, 0.0047, 0.0084],
        [-0.0460, 0.4671, -0.0650, ..., 0.1027, 0.0256, 0.0475],
        ...,
        [ 0.0086, 0.0037, 0.0363, ..., -0.0296, -0.0097, -0.0068],
        [-0.0160, 0.0123, 0.0015, ..., 0.0040, 0.0185, 0.0038],
        [-0.0049, -0.0121, -0.0235, ..., 0.0200, 0.0148, -0.0020]], requires_grad=True)
Loading model from logdir/bart_run_1\bs=12,lr=1.0e-04,bert_lr=1.0e-05,end_lr=0e0,att=1\model_checkpoint-00041000
Traceback (most recent call last):
  File "run.py", line 104, in <module>
    main()
  File "run.py", line 83, in main
    infer.main(infer_config)
  File "D:\NLP\NL_to_SQL\gap-text2sql\rat-sql-gap\seq2struct\commands\infer.py", line 215, in main
    model = inferer.load_model(args.logdir, args.step)
  File "D:\NLP\NL_to_SQL\gap-text2sql\rat-sql-gap\seq2struct\commands\infer.py", line 48, in load_model
    last_step = saver.restore(logdir, step=step, map_location=self.device, item_keys=["model"])
  File "D:\NLP\NL_to_SQL\gap-text2sql\rat-sql-gap\seq2struct\utils\saver.py", line 122, in restore
    items2restore, model_dir, map_location, step)
  File "D:\NLP\NL_to_SQL\gap-text2sql\rat-sql-gap\seq2struct\utils\saver.py", line 40, in load_checkpoint
    item_dict[item_name].load_state_dict(checkpoint[item_name])
  File "D:\NLP\NL_to_SQL\gap-text2sql\envs\lib\site-packages\torch\nn\modules\module.py", line 847, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for EncDecModel:
    size mismatch for decoder.rule_logits.2.weight: copying a param with shape torch.Size([97, 128]) from checkpoint, the shape in current model is torch.Size([94, 128]).
    size mismatch for decoder.rule_logits.2.bias: copying a param with shape torch.Size([97]) from checkpoint, the shape in current model is torch.Size([94]).
    size mismatch for decoder.rule_embedding.weight: copying a param with shape torch.Size([97, 128]) from checkpoint, the shape in current model is torch.Size([94, 128]).
(envs) Lenovo-PC MINGW64 /d/NLP/NL_to_SQL/gap-text2sql/rat-sql-gap (main)
$

Can you guide me on where I need to make changes?

Impavidity commented 3 years ago

Thank you for your interest. What data did you use for preprocessing? If you used Spider, did you use all three files: train_spider, train_other, and dev? There is a mismatch in the size of the embeddings between the trained model and the initialized model.
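For anyone debugging a similar mismatch, here is a minimal sketch of how the differing parameters could be listed side by side. It works on plain dictionaries of shapes, and the helper name `find_shape_mismatches` is hypothetical, not part of the repository. The shapes are copied from the error message in this thread. The first dimension (97 in the checkpoint, 94 in the current model) appears to be the number of grammar rules, which would explain why incomplete preprocessing data changes it, but that reading is an assumption.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for parameters whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Only report parameters present on both sides with different shapes.
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# Shapes copied from the RuntimeError reported in this thread.
checkpoint_shapes = {
    "decoder.rule_logits.2.weight": (97, 128),
    "decoder.rule_logits.2.bias": (97,),
    "decoder.rule_embedding.weight": (97, 128),
}
model_shapes = {
    "decoder.rule_logits.2.weight": (94, 128),
    "decoder.rule_logits.2.bias": (94,),
    "decoder.rule_embedding.weight": (94, 128),
}

for name, ckpt, current in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt} vs current model {current}")
```

With PyTorch installed, the two dictionaries could presumably be built from `torch.load(path, map_location="cpu")["model"]` and `model.state_dict()` by mapping each tensor to `tuple(tensor.shape)`; the `"model"` key is inferred from the `item_keys=["model"]` argument in the traceback.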

TheurgicDuke771 commented 3 years ago

Yes, I used the Spider dataset only. Edit: Thanks for pointing that out. As you mentioned, the issue seems to have been with the dataset; I downloaded the Spider dataset again and it works fine.

ujjawalcse commented 2 years ago

Hey @Impavidity @TheurgicDuke771, I'm facing the same issue. I don't think there is a problem with the data (Spider) I'm using.

Loading model from logdir/bart_run_1/bs=12,lr=1.0e-04,bert_lr=1.0e-05,end_lr=0e0,att=1/model_checkpoint-00041000

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)

<ipython-input-45-2534e992a832> in <module>()
----> 1 model = inferer.load_model(model_dir, checkpoint_step)

6 frames

/content/gap-text2sql/rat-sql-gap/seq2struct/models/variational_lstm.py in _hook_remove_dropout_masks_from_state_dict(cls, instance, state_dict, prefix, local_metadata)
     75     @classmethod
     76     def _hook_remove_dropout_masks_from_state_dict(cls, instance, state_dict, prefix, local_metadata):
---> 77         del state_dict[prefix + '_input_dropout_mask']
     78         del state_dict[prefix + '_h_dropout_mask']
     79 

KeyError: 'decoder.state_update._input_dropout_mask'

The folder structure is shown below:

[Image: spider_data_issue]

Please guide me to tackle this issue. Thanks in advance.

ujjawalcse commented 2 years ago

@Impavidity, I'm also getting an error when executing the following:

dataset = registry.construct('dataset_infer', {
    "name": "spider", "schemas": schema, "eval_foreign_key_maps": eval_foreign_key_maps,
    "db_path": "data/sqlite_files/"
})

ValueError                                Traceback (most recent call last)

<ipython-input-57-de0cacb4ab13> in <module>()
      1 dataset = registry.construct('dataset_infer',{
      2    "name": "spider", "schemas": schema, "eval_foreign_key_maps": eval_foreign_key_maps,
----> 3     "db_path": "data/sqlite_files/"
      4 })

1 frames

/content/gap-text2sql/rat-sql-gap/seq2struct/utils/registry.py in instantiate(callable, config, unused_keys, **kwargs)
     42     signature = inspect.signature(callable.__init__)
     43     print('signature:',signature)
---> 44     for name, param in signature.parameters.items():
     45         print("name:",name)
     46         print("param:",param)

ValueError: Unsupported kind for param args: 2

Please help me out. Thanks again.
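One possible reading of "Unsupported kind for param args: 2": the 2 looks like the integer value of `inspect.Parameter.VAR_POSITIONAL`, meaning the registry found a `*args` parameter named `args` in the constructor it inspected. That is what `inspect.signature` reports for a class that does not define its own `__init__`. Whether this is the actual cause here is an assumption. The sketch below uses two hypothetical classes, `NoInit` and `WithInit`, to show the difference.

```python
import inspect


class NoInit:
    """Hypothetical class that does not define its own __init__."""


class WithInit:
    """Hypothetical class with an explicit constructor."""

    def __init__(self, db_path, limit=None):
        self.db_path = db_path
        self.limit = limit


def describe_params(cls):
    """Map each __init__ parameter name to the name of its kind."""
    signature = inspect.signature(cls.__init__)
    return {name: param.kind.name for name, param in signature.parameters.items()}


# Integer value of the *args parameter kind.
print(int(inspect.Parameter.VAR_POSITIONAL))
# Falls back to object.__init__, which takes *args and **kwargs.
print(describe_params(NoInit))
# Only ordinary named parameters.
print(describe_params(WithInit))
```

If the dataset class being constructed shows `args` and `kwargs` like `NoInit` does, it may be worth checking which class is actually registered under the name "spider" for `dataset_infer`.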

roburst2 commented 2 years ago

@ujjawalcse Did you resolve the issue? I am also facing this KeyError: 'decoder.state_update._input_dropout_mask' issue.

TheurgicDuke771 commented 2 years ago

Hi @roburst2, can you provide your environment info? As @ujjawalcse mentioned in another issue, it was due to a PyTorch version mismatch. Please make sure you are using the same environment described in the Setup section.

muruan01 commented 1 year ago

I also encountered this problem, and my environment can't be changed to PyTorch 1.5 (mine is PyTorch 1.9), so I changed the function "_hook_remove_dropout_masks_from_state_dict" in the file "variational_lstm.py". I modified the code to:

    if prefix + '_input_dropout_mask' in state_dict:
        del state_dict[prefix + '_input_dropout_mask']
    if prefix + '_h_dropout_mask' in state_dict:
        del state_dict[prefix + '_h_dropout_mask']

Then the model works. Hope it helps.
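The same guard can be written with `dict.pop` and a default, which never raises `KeyError` when the key is missing. This is only a sketch: the function name `remove_dropout_masks` is hypothetical, and plain dicts stand in for a real state_dict.

```python
def remove_dropout_masks(state_dict, prefix):
    """Drop the dropout-mask entries for `prefix` if present; ignore them otherwise."""
    for suffix in ("_input_dropout_mask", "_h_dropout_mask"):
        # pop with a default returns None instead of raising KeyError, unlike `del`.
        state_dict.pop(prefix + suffix, None)
    return state_dict


# Plain dicts stand in for a real state_dict here.
with_masks = {
    "decoder.state_update._input_dropout_mask": 0,
    "decoder.state_update._h_dropout_mask": 0,
    "decoder.state_update.weight": 1,
}
without_masks = {"decoder.state_update.weight": 1}

print(remove_dropout_masks(with_masks, "decoder.state_update."))
print(remove_dropout_masks(without_masks, "decoder.state_update."))
```

Both calls leave only the `weight` entry, so the hook behaves the same whether or not the masks were included in the state_dict.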

Lam-Van-Toi commented 1 year ago

I have a problem: "Attempting to infer on untrained model in {logdir}, step={step}". I fixed it like this:

    def load_model(self, logdir, step):
        '''Load a model (identified by the config used for construction) and return it'''
        # 1. Construct model
        model = registry.construct('model', self.config['model'], preproc=self.model_preproc, device=self.device)
        model.to(self.device)
        model.eval()

        # 2. Restore its parameters
        saver = saver_mod.Saver({"model": model})
        #last_step = saver.restore(logdir, step=step, map_location=self.device, item_keys=["model"])
        #if not last_step:
        #    raise Exception(f"Attempting to infer on untrained model in {logdir}, step={step}")
        return model

What happens if I fix it like this?
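With the `saver.restore` call commented out, nothing loads the checkpoint into the model, so `load_model` returns the model as it was constructed rather than the trained one. The exception presumably fires when no checkpoint is found for the given logdir and step, so it may be more useful to check what is actually on disk. Below is a minimal sketch, assuming checkpoints are files named `model_checkpoint-<step>` as in the "Loading model from ..." log line earlier in this thread; the helper `list_checkpoint_steps` is hypothetical.

```python
import re
import tempfile
from pathlib import Path

# File naming assumed from the log line ".../model_checkpoint-00041000".
CHECKPOINT_PATTERN = re.compile(r"^model_checkpoint-(\d+)$")


def list_checkpoint_steps(logdir):
    """Return the sorted step numbers of files named model_checkpoint-<digits> in logdir."""
    steps = []
    for path in Path(logdir).glob("model_checkpoint-*"):
        match = CHECKPOINT_PATTERN.match(path.name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    print(list_checkpoint_steps(tmp))  # nothing saved yet
    (Path(tmp) / "model_checkpoint-00041000").touch()
    print(list_checkpoint_steps(tmp))
```

If the list is empty for the logdir passed to `load_model`, or does not contain the requested step, the path or step argument is the more likely thing to fix.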

awslabs / gap-text2sql

Error while running Inference #4
