winder-source opened this issue 2 years ago
Hello, the other subtasks only require small changes to the label format and the test side. For the aspect term extraction subtask, simply remove the sentiment from the label sequence. For the sentiment classification subtask, train with the full span-sentiment sequence, and at test time provide all the gold spans and generate only the sentiment labels.
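To make the label-format difference concrete, here is a minimal sketch of the idea. The helper name and the (start, end, sentiment) triple layout are illustrative assumptions, not the repository's actual code:

def build_target_sequence(spans, task):
    """spans: list of (start, end, sentiment) triples, e.g. [(3, 5, 'POS')].
    task: 'MASC' (and joint training) uses the full span-sentiment sequence;
    'MATE' (aspect term extraction) drops the sentiment token."""
    target = []
    for start, end, sentiment in spans:
        if task == 'MATE':
            # Aspect term extraction: keep only the span boundaries.
            target += [start, end]
        else:
            # Full span-sentiment sequence for joint/MASC training.
            target += [start, end, sentiment]
    return target

# At MASC test time the gold spans are given as input, so the decoder
# only needs to generate the sentiment label of each span.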
The relevant code has been uploaded.
Thank you very much!
Hello, I ran into some problems when running sh 15MASC_pretrain.sh.
1. First, this part:
Traceback (most recent call last):
  File "twitter_sc_training.py", line 454, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 172, in main
    start_idx=args.start_idx)
TypeError: __init__() got an unexpected keyword argument 'is_sample'
To work around this, I commented out the arguments:
train_dataset = Twitter_Dataset(args.dataset[0][1],
                                split='train')
                                # is_sample=args.is_sample,
                                # sample_num=args.sample_num,
                                # start_idx=args.start_idx)
I commented these out because the Twitter_Dataset initializer does not take is_sample, sample_num, or start_idx. Why were these three arguments added?
class Twitter_Dataset(data.Dataset):
    def __init__(self, infos, split):
        self.infos = json.load(open(infos, 'r'))
2. After that change, another problem appeared:
Traceback (most recent call last):
  File "twitter_sc_training.py", line 454, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 205, in main
    res_dev = eval_utils.eval(args, model, dev_loader, metric, device)
  File "/lyldata/VLP-MABSA-2/src/eval_utils.py", line 18, in eval
    for key, value in batch['TWITTER_SC'].items()
KeyError: 'TWITTER_SC'
Looking into it, it seems that collation.py and tokenization_new.py are missing the code that produces the 'TWITTER_SC' entry? Could you add that part of the code?
Those are arguments I added when running the few-shot experiments. I'm very sorry the code hasn't been fully updated; I will update it in the next couple of days.
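For reference, one plausible shape for those arguments once wired in (a guess extrapolated from the snippet above, not the author's actual update): when is_sample is set, keep a window of sample_num training examples starting at start_idx.

import json
from torch.utils import data

class Twitter_Dataset(data.Dataset):
    # Sketch only: the sampling logic and the layout of self.infos are
    # assumptions; the released update may differ.
    def __init__(self, infos, split, is_sample=False, sample_num=None,
                 start_idx=0):
        self.infos = json.load(open(infos, 'r'))
        examples = self.infos[split]  # assumes infos is keyed by split
        if is_sample and split == 'train':
            # Few-shot setting: take a fixed slice of the training data.
            examples = examples[start_idx:start_idx + sample_num]
        self.examples = examples

    def __len__(self):
        return len(self.examples)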
OK! I'll wait for your update.
The update is done.
Traceback (most recent call last):
  File "twitter_sc_training.py", line 450, in <module>
    main(0, args)
Change the path here to facebook/bart-base, or download the bart-base model files from huggingface and set the path to the downloaded directory.
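For example, assuming the name is ultimately forwarded to transformers' from_pretrained, either option resolves the OSError:

from transformers import BartTokenizer

# Fetch vocab.json/merges.txt from the hub by model id...
tokenizer = BartTokenizer.from_pretrained('facebook/bart-base')

# ...or point at a local directory that already contains those files,
# e.g. a copy of bart-base downloaded manually from huggingface.
# tokenizer = BartTokenizer.from_pretrained('/path/to/bart-base')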
OK, thanks!
Traceback (most recent call last):
  File "twitter_sc_training.py", line 450, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 80, in main
    tokenizer = ConditionTokenizer(args=args)
  File "/lyldata/VLP-MABSA-2/src/data/tokenization_new.py", line 43, in __init__
    pretrained_model_name, )
  File "/root/anaconda3/envs/VLP-MABSA-env/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1591, in from_pretrained
    list(cls.vocab_files_names.values()),
OSError: Model name './E2E-MABSA' was not found in tokenizers model name list (facebook/bart-base, facebook/bart-large, facebook/bart-large-mnli, facebook/bart-large-cnn, facebook/bart-large-xsum, yjernite/bart_eli5). We assumed './E2E-MABSA' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url.
Hello, may I ask whether this problem has been solved?
Hello author, I am currently working on a related direction and am quite interested in this paper. Could you open-source the code for the other subtasks? The current code seems to cover only one downstream task, and I would like to reproduce the paper.