PaddlePaddle / PaddleHub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is suspended, please be patient]
https://www.paddlepaddle.org.cn/hub
Apache License 2.0
12.75k stars 2.07k forks

PaddleHub / demo / sequence_labeling / train.py raises an error when run in a Paddle notebook, even though the latest versions are installed #1471

Open AI-Mart opened 3 years ago

AI-Mart commented 3 years ago

Code link: https://aistudio.baidu.com/bdvv1/user/271421/2052673/notebooks/2052673.ipynb

Running the following code raises an error:

# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import paddle
import paddlehub as hub
from paddlehub.datasets import MSRA_NER

import ast
import argparse

parser = argparse.ArgumentParser(__doc__)
parser.add_argument("--num_epoch", type=int, default=3, help="Number of epochs for fine-tuning.")
parser.add_argument(
    "--use_gpu",
    type=ast.literal_eval,
    default=True,
    help="Whether use GPU for fine-tuning, input should be True or False")
parser.add_argument("--learning_rate", type=float, default=5e-5, help="Learning rate used to train with warmup.")
parser.add_argument("--max_seq_len", type=int, default=128, help="Number of words of the longest sequence.")
parser.add_argument("--batch_size", type=int, default=32, help="Total examples' number in batch for training.")
parser.add_argument("--checkpoint_dir", type=str, default='./checkpoint', help="Directory to model checkpoint")
parser.add_argument("--save_interval", type=int, default=1, help="Save checkpoint every n epoch.")

args = parser.parse_args()

if __name__ == '__main__':
    label_list = MSRA_NER.label_list
    label_map = {idx: label for idx, label in enumerate(label_list)}

    model = hub.Module(
        name='ernie_tiny',
        version='2.0.1',
        task='token-cls',
        label_map=label_map,  # Required for token classification task
    )

    tokenizer = model.get_tokenizer()
    train_dataset = MSRA_NER(tokenizer=tokenizer, max_seq_len=args.max_seq_len, mode='train')
    dev_dataset = MSRA_NER(tokenizer=tokenizer, max_seq_len=args.max_seq_len, mode='dev')
    test_dataset = MSRA_NER(tokenizer=tokenizer, max_seq_len=args.max_seq_len, mode='test')

    optimizer = paddle.optimizer.AdamW(learning_rate=args.learning_rate, parameters=model.parameters())
    trainer = hub.Trainer(model, optimizer, checkpoint_dir=args.checkpoint_dir, use_gpu=args.use_gpu)
    trainer.train(
        train_dataset,
        epochs=args.num_epoch,
        batch_size=args.batch_size,
        eval_dataset=dev_dataset,
        save_interval=args.save_interval,
    )
    trainer.evaluate(test_dataset, batch_size=args.batch_size)

Error message:

usage: Automatically created module for IPython interactive environment [-h] [--num_epoch NUM_EPOCH] [--use_gpu USE_GPU] [--learning_rate LEARNING_RATE] [--max_seq_len MAX_SEQ_LEN] [--batch_size BATCH_SIZE] [--checkpoint_dir CHECKPOINT_DIR] [--save_interval SAVE_INTERVAL]
Automatically created module for IPython interactive environment: error: unrecognized arguments: -f /home/aistudio/.local/share/jupyter/runtime/kernel-9bdd1ace-f437-4e19-b3a3-983a8c05fae6.json
An exception has occurred, use %tb to see the full traceback.
SystemExit: 2
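
The `-f .../kernel-....json` part of the message is the connection-file argument that Jupyter passes to its kernel process, so it is present in `sys.argv` when `parser.parse_args()` runs inside a notebook cell. Below is a minimal sketch that reproduces the same exit status outside a notebook. It assumes a cut-down parser with only one of the script's options, and the connection-file path is made up for illustration.

```python
import argparse
import contextlib
import io

# Cut-down parser: only one of the options from train.py.
parser = argparse.ArgumentParser()
parser.add_argument("--num_epoch", type=int, default=3)

# Hypothetical stand-in for the arguments a Jupyter kernel is started with.
kernel_argv = ["-f", "/tmp/kernel-example.json"]

exit_code = None
stderr_buffer = io.StringIO()
try:
    # argparse prints the usage text to stderr and then calls sys.exit(2).
    with contextlib.redirect_stderr(stderr_buffer):
        parser.parse_args(kernel_argv)
except SystemExit as exc:
    exit_code = exc.code

print("exit code:", exit_code)
print(stderr_buffer.getvalue().strip().splitlines()[-1])
```

IPython reports the `SystemExit` raised by argparse as "An exception has occurred", which matches the last two lines of the message above.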

KPatr1ck commented 3 years ago
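> Code link: https://aistudio.baidu.com/bdvv1/user/271421/2052673/notebooks/2052673.ipynb

Hi, the notebook link you provided cannot be opened, so for now I can't see how you ran the script. If convenient, please share a screenshot or a publicly visible project. As for the script PaddleHub/demo/sequence_labeling/train.py, it cannot be run directly in a notebook; it needs to be launched from the command line with python train.py.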
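
From inside a notebook, one way to follow this advice is to start the script as a separate Python process, so that it gets its own command line. A rough sketch, assuming train.py sits in the notebook's working directory; the helper names `build_train_command` and `run_training` are made up for illustration and are not part of PaddleHub.

```python
import subprocess
import sys


def build_train_command(script="train.py", **options):
    """Build an argv list such as [python, train.py, --num_epoch, 3]."""
    command = [sys.executable, script]
    for name, value in options.items():
        command += [f"--{name}", str(value)]
    return command


def run_training(**options):
    """Run the script in a child process; raises CalledProcessError on failure."""
    return subprocess.run(build_train_command(**options), check=True)


# Only the command is built here; call run_training(...) to actually start it.
command = build_train_command(num_epoch=3, use_gpu=False)
print(command)
```

Because the child process has its own `sys.argv`, the kernel's `-f` argument never reaches the script's parser.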

AI-Mart commented 3 years ago

This is the source code (identical to the script in the original post above); please take a look.


AI-Mart commented 3 years ago

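So it can't be run in a notebook and has to be launched from the command line with python train.py. Could you explain why? The README doesn't say that it can't be run in a notebook, so in principle it should work, shouldn't it?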


KPatr1ck commented 3 years ago

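> So it can't be run in a notebook and has to be launched from the command line with python train.py. Could you explain why? The README doesn't say that it can't be run in a notebook, so in principle it should work, shouldn't it?

The script contains argparse and if __name__ == "__main__". You need to remove those statements before it can be executed in a notebook.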
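
As an alternative to deleting the argparse code entirely, the parser can be told not to read the kernel's command line. A minimal sketch with a cut-down parser (two of the script's options). Both calls are standard argparse, but whether this suits the full script is an assumption, and the file path is a made-up example.

```python
import argparse
import ast

parser = argparse.ArgumentParser()
parser.add_argument("--num_epoch", type=int, default=3)
parser.add_argument("--use_gpu", type=ast.literal_eval, default=True)

# Option 1: pass an explicit list so sys.argv is never consulted.
# An empty list yields the defaults; add items to override them.
args = parser.parse_args(args=[])
overridden = parser.parse_args(args=["--num_epoch", "1", "--use_gpu", "False"])

# Option 2: tolerate unknown arguments such as the kernel's "-f <file>".
known, unknown = parser.parse_known_args(["-f", "/tmp/kernel-example.json"])

print(args.num_epoch, args.use_gpu)
print(overridden.num_epoch, overridden.use_gpu)
print(unknown)
```

Option 1 keeps the rest of the script unchanged, since `args.num_epoch` and the other attributes are still available with their default values.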

AI-Mart commented 3 years ago

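It runs directly on my local Windows 10 machine. Is that any different from running it in a notebook? Also, if I delete the if __name__ part, there is no script left to run.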
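
On the second point, the statements under the guard do not have to be lost. In a Jupyter cell `__name__` is normally `"__main__"` as well, so one common pattern is to move the guarded body into a function that can be called both from the command line and from a notebook. A sketch under that assumption: `build_parser` and `main` are hypothetical names, and `main` here only echoes its settings where the PaddleHub code from the script would go.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--num_epoch", type=int, default=3)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser


def main(args):
    # Placeholder: the model, dataset and trainer code from train.py would go here.
    return f"epochs={args.num_epoch}, batch_size={args.batch_size}"


# Notebook use: build the settings explicitly instead of reading sys.argv.
notebook_args = build_parser().parse_args(args=["--num_epoch", "1"])
summary = main(notebook_args)
print(summary)

# Command-line use would instead end the file with:
# if __name__ == "__main__":
#     main(build_parser().parse_args())
```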
