sion-zcfei / CQG

The code for the ACL 2022 paper “CQG: A Simple and Effective Controlled Generation Framework for Multi-hop Question Generation”
Apache License 2.0

How to make proper parameter argument in python command line #6

Open asyrofist opened 1 year ago

asyrofist commented 1 year ago

Hello, I've been trying to fix the fastNLP module error. In the latest version of the fastNLP library, the following imports no longer work:

from fastNLP.modules.attention import AttentionLayer, MultiHeadAttention
from fastNLP.embeddings import StaticEmbedding
from fastNLP.embeddings.utils import get_embeddings
from fastNLP.modules.decoder.seq2seq_state import State, LSTMState, TransformerState

from fastNLP.modules.decoder.seq2seq_decoder import Seq2SeqDecoder, State
from fastNLP.core.utils import _get_model_device

So we need to use the torch submodules for all of these imports:

from fastNLP.modules.torch.attention import AttentionLayer, MultiHeadAttention
from fastNLP.embeddings.torch import StaticEmbedding
from fastNLP.embeddings.torch.utils import get_embeddings
from fastNLP.modules.torch.decoder.seq2seq_state import State, LSTMState, TransformerState
from fastNLP.modules.torch.decoder.seq2seq_decoder import Seq2SeqDecoder, State
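
One way to cope with both package layouts is a fallback import. The sketch below is only an illustration and not part of the CQG code: import_first is a hypothetical helper that tries each dotted module path in order and returns the first one that imports.

```python
import importlib


def import_first(*module_paths):
    """Return the first module in module_paths that can be imported.

    Hypothetical helper: tries each dotted path in order and raises
    ImportError only if none of them is available.
    """
    errors = []
    for path in module_paths:
        try:
            return importlib.import_module(path)
        except ImportError as exc:  # ModuleNotFoundError is a subclass
            errors.append(f"{path}: {exc}")
    raise ImportError(
        "none of the candidate modules could be imported:\n" + "\n".join(errors)
    )


# Example with the two layouts quoted above (newer path first, older second):
# attention = import_first("fastNLP.modules.torch.attention",
#                          "fastNLP.modules.attention")
# MultiHeadAttention = attention.MultiHeadAttention
```

The fallback order is a design choice: listing the newer path first means an up-to-date install is preferred, and the older path is only used when the newer one is missing.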

Can you suggest which fastNLP version I am supposed to use?

sion-zcfei commented 1 year ago

I wrote the code in Oct. 2021, and the closest fastNLP version I can find is 0.6.0; maybe you can try it.

asyrofist commented 1 year ago

Thank you for your explanation. I have tried installing fastNLP 0.6.0.

After several attempts, I managed to solve it. Looking at the code again, there are several mistakes, starting at line 58 of that code.

def forward(self, words, target, flag):
    ...
    # Cast the target to long to fix "expected scalar type Long but found Int"
    loss = self.criterion(pred, gold.long())
    return {'loss': loss}

Also, at line 266, we should change the code to this:

def test(config, valid_dataloader, model, dev, tokenizer):
    ...
    generated_ids = model.generate(batch.input_ids.to(dev), batch.flag.to(dev))
    ...
    return BLEUscore

This is because the class Bert2tf(nn.Module) has only three methods: __init__, forward, and generate. The model is created as Bert2tf, as mentioned in line 454, so it has no module attribute, only modules. So we change model.module.generate(...) to model.generate(...).
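
A call such as model.module.generate(...) is valid when the model is wrapped in torch.nn.DataParallel, which exposes the wrapped network as .module. Whether the original training setup used such a wrapper is an assumption. A minimal sketch that works in both cases follows; unwrap_model and the stand-in classes are hypothetical and are used only so the example runs without torch.

```python
def unwrap_model(model):
    """Return the underlying model whether or not it is wrapped.

    torch.nn.DataParallel stores the wrapped network in `.module`;
    a plain model has no such attribute, so fall back to the model itself.
    """
    return getattr(model, "module", model)


# Stand-in classes so the sketch runs without torch (both are hypothetical).
class PlainModel:
    def generate(self, input_ids, flag):
        return [f"generated:{i}" for i in input_ids]


class Wrapper:
    """Mimics a DataParallel-style wrapper that exposes `.module`."""

    def __init__(self, module):
        self.module = module


plain = PlainModel()
wrapped = Wrapper(plain)

# The same call works for the plain and the wrapped model.
out_plain = unwrap_model(plain).generate([1, 2], flag=None)
out_wrapped = unwrap_model(wrapped).generate([1, 2], flag=None)
```

With this pattern, the test function could call unwrap_model(model).generate(...) and would not need to change when switching between single-device and wrapped runs.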

Then, if there is an NLTK error, we need to download the required resources like this:

import nltk
nltk.download('punkt')
nltk.download('wordnet')

If there are SSL issues, we can use this workaround, taken from this link:

import nltk
import ssl

try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    pass
else:
    ssl._create_default_https_context = _create_unverified_https_context

nltk.download()

Then, as an optional step to save the results, I also changed the code at this line to look like this:

def test(config, valid_dataloader, model, dev):
    ...
    for pred, gold in zip(predictions_text, targets_text):
        dict1 = {'pred': pred, 'gold': gold}
        ...
        s = nltk.translate.bleu_score.sentence_bleu([gold, pred1])
        dict1.update(score=s)
        new_data.append(dict1)

    data_str = json.dumps(new_data, indent=4, ensure_ascii=False)
    with open("case_text.json", "w", encoding="utf-8") as f:
        f.write(data_str)

After that, with all of these changes made, the code gives a proper result. Running python main.py --device 'cpu' --batch_size 10 --num_epochs 2 takes almost 1 hour.
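
For reference, flags like the ones in the command above are typically declared with argparse. The sketch below is an assumption about how such a parser might look, not the actual parser in main.py: the flag names are taken from the command, while build_parser, the help text, and the defaults are hypothetical.

```python
import argparse


def build_parser():
    # Flag names mirror the command above; defaults are illustrative assumptions.
    parser = argparse.ArgumentParser(
        description="Hypothetical sketch of main.py arguments"
    )
    parser.add_argument("--device", type=str, default="cpu",
                        help="device string, e.g. 'cpu' or 'cuda'")
    parser.add_argument("--batch_size", type=int, default=10)
    parser.add_argument("--num_epochs", type=int, default=2)
    return parser


# Parse an explicit list so the sketch runs without touching sys.argv.
args = build_parser().parse_args(
    ["--device", "cpu", "--batch_size", "10", "--num_epochs", "2"]
)
```

The shell strips the quotes around 'cpu' before Python sees the value, so --device cpu and --device 'cpu' are equivalent.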

One last question, to make sure: are there other parameter options that give a better result without waiting so long? This is my fork of your repository