"run_eval.py" error - Githubissues

I've never seen an error like this in this project, so I don't have a solution for it, sorry.

I only have a Windows environment, but I checked that my code is reproducible there. The procedure is shown below.

Clone this repo, move into the directory, and check the files.

$ git clone https://github.com/reppy4620/Dialog.git
Cloning into 'Dialog'...
remote: Enumerating objects: 14, done.
remote: Counting objects: 100% (14/14), done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 383 (delta 6), reused 5 (delta 3), pack-reused 369
Receiving objects: 100% (383/383), 142.80 KiB | 420.00 KiB/s, done.
Resolving deltas: 100% (198/198), done.
$ cd Dialog
$ ls
config.py     LICENSE  make_training_data.py  notebooks/  result/      test.py       utils/
get_tweet.py  main.py  nn/                    README.md   run_eval.py  tokenizer.py

Because I want to download the pretrained model at run time, replace the contents of run_eval.py with the code shown below.

import pathlib
import torch

from config import Config
from nn import build_model
from tokenizer import Tokenizer
from utils import evaluate

# --------------------------------------------------------------------------------
# Download file from Google Drive
# --------------------------------------------------------------------------------
import requests
def download_file_from_google_drive(id, destination):
    URL = "https://docs.google.com/uc?export=download"

    session = requests.Session()

    response = session.get(URL, params = { 'id' : id }, stream = True)
    token = get_confirm_token(response)

    if token:
        params = { 'id' : id, 'confirm' : token }
        response = session.get(URL, params = params, stream = True)

    save_response_content(response, destination)    

def get_confirm_token(response):
    for key, value in response.cookies.items():
        if key.startswith('download_warning'):
            return value

    return None

def save_response_content(response, destination):
    CHUNK_SIZE = 32768

    with open(destination, "wb") as f:
        for chunk in response.iter_content(CHUNK_SIZE):
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)
# --------------------------------------------------------------------------------

if __name__ == '__main__':

    data_dir = pathlib.Path('data')

    # Create data_dir if it does not exist yet.
    if not data_dir.exists():
        data_dir.mkdir(parents=True)

    # download pretrained weights.
    file_id = '1DubBO3fagYccVuKwrIVa4zGvsCzW7FCr'
    pretrained_file = data_dir / 'ckpt.pth'
    if not pretrained_file.exists():
        print('Download pretrained model')
        download_file_from_google_drive(file_id, pretrained_file)
        print('End')

    device = torch.device('cpu')

    state_dict = torch.load(pretrained_file, map_location=device)

    tokenizer = Tokenizer.from_pretrained(Config.model_name)

    model = build_model(Config).to(device)
    model.load_state_dict(state_dict['model'])
    model.eval()
    model.freeze()

    while True:
        s = input('You>')
        if s == 'q':
            break
        print('BOT>', end='')
        text = evaluate(Config, s, tokenizer, model, device, True)
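
If the download step goes wrong without raising an error, data/ckpt.pth may contain an HTML page instead of the weights, and torch.load would then fail. Below is a minimal sketch of a sanity check you could run on the file before loading it. The helper name `looks_like_html` and the 512-byte probe size are my own assumptions, not part of this repo, and the check only detects an HTML page. It does not prove that the checkpoint is valid.

```python
import pathlib
import tempfile


def looks_like_html(path, probe_bytes=512):
    """Return True if the file at `path` starts like an HTML page.

    Hypothetical helper, not part of the repo: a checkpoint is a binary
    file, so a leading '<!doctype html' or '<html' suggests that a web
    page was saved instead of the weights.
    """
    with open(path, "rb") as f:
        # Ignore leading whitespace and letter case before comparing.
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


if __name__ == "__main__":
    # Demonstrate on two throwaway files rather than the real checkpoint.
    with tempfile.TemporaryDirectory() as tmp:
        html_file = pathlib.Path(tmp) / "page.pth"
        html_file.write_bytes(b"<!DOCTYPE html><html><body>warning</body></html>")
        binary_file = pathlib.Path(tmp) / "weights.pth"
        binary_file.write_bytes(b"\x80\x02binary payload")
        print(looks_like_html(html_file))    # HTML content
        print(looks_like_html(binary_file))  # binary content
```

In run_eval.py this could be called on `pretrained_file` right after the download, deleting the file and retrying if it returns True.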

For reference, config.py looks like this.

class Config:
    seed = 116
    device = 'gpu'

    n_epoch = 3
    batch_size = 64
    max_len = 22
    lr = 1e-3
    betas = (0.9, 0.98)

    vocab_size = 32000
    num_head = 8
    d_model = 768
    num_layer = 6
    d_ff = 2048
    drop_rate = 0.1
    max_grad_norm = 1.0

    smoothing = 0.1
    factor = 2
    warmup = 4000

    # FIXME: Change path of training data.
    data_dir = './data'
    train_data_path = f'{data_dir}/train_data.txt'
    pickle_path = f'{data_dir}/train_data.pkl'
    fn = 'ckpt'

    load = False
    # FIXME: if you use original data, change flag of this
    use_pickle = True

    model_name = 'bert-base-japanese-whole-word-masking'

Then execute run_eval.py.

$ python run_eval.py
.  
.      a lot of logs made by transformers
.
You>こんにちは
BOT>こんにちは
You>

For reference, the versions I used:

>>> import torch
>>> torch.__version__
'1.6.0'
>>> import transformers
>>> transformers.__version__
'2.3.0'

transformers now has a newer version, 3.3.1, but this repo has not been adapted to the interface changes in newer versions.
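
To see quickly whether your installed packages are far from the versions I tested, you could compare major version numbers. This is only a rough sketch: the helper names `major` and `same_major` and the `TESTED` table are hypothetical, and a matching major version does not guarantee that the interface is the same.

```python
import importlib

# Versions reported above as working with this repo.
TESTED = {"torch": "1.6.0", "transformers": "2.3.0"}


def major(version):
    """Return the leading integer of a version string such as '2.3.0'."""
    return int(version.split(".")[0])


def same_major(installed, tested):
    """Return True if both version strings share the same major number."""
    return major(installed) == major(tested)


if __name__ == "__main__":
    for name, tested in TESTED.items():
        try:
            installed = importlib.import_module(name).__version__
        except ImportError:
            print(f"{name}: not installed")
            continue
        if same_major(installed, tested):
            status = "same major version"
        else:
            status = "different major version"
        print(f"{name}: installed {installed}, tested {tested} ({status})")
```

If transformers is reported with a different major version, installing the tested version (2.3.0) in a fresh virtual environment is the first thing I would try.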

I wish I could have run the test above in an Ubuntu environment.

If you still encounter errors, they may be caused by other factors, e.g. the Ubuntu environment or the versions of your Python packages.

"run_eval.py" error #13