NUSTM / VLP-MABSA


Could you open-source the code for the other subtasks? #11

Open winder-source opened 1 year ago

winder-source commented 1 year ago

Hello, I am currently working on a related research direction and am quite interested in your paper. Could you open-source the code for the other subtasks? The current code seems to cover only one downstream task, and I would like to reproduce the full paper.

lyhuohuo commented 1 year ago

Hello. The other subtasks only require small changes to the label format and the test side. For the aspect term extraction subtask, just remove the sentiment from the label sequence. For the sentiment classification subtask, train with the complete span-sentiment sequence, and at test time provide all the gold spans and generate only the sentiment labels.
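
For illustration only, here is a minimal sketch of the label-sequence changes described above. The flat [start, end, sentiment, ...] target format and the function names are assumptions made for the example, not the repository's actual code:

    # Hypothetical example: adapting a joint span-sentiment target
    # sequence of the form [start_1, end_1, sentiment_1, start_2, ...]
    # to the two subtasks.

    def to_mate_targets(joint_seq):
        """Aspect term extraction (MATE): drop every sentiment token,
        keeping only the (start, end) span indices."""
        return [tok for i, tok in enumerate(joint_seq) if i % 3 != 2]

    def to_masc_gold_spans(joint_seq):
        """Sentiment classification (MASC): at test time, extract the
        gold spans and let the model generate only the sentiment label
        for each of them."""
        return [joint_seq[i:i + 2] for i in range(0, len(joint_seq), 3)]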

lyhuohuo commented 1 year ago

The relevant code has been uploaded.

winder-source commented 1 year ago

Thank you very much!

winder-source commented 1 year ago

Hello, I ran into some problems when running sh 15MASC_pretrain.sh. 1. First, there is this:

Traceback (most recent call last):
  File "twitter_sc_training.py", line 454, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 172, in main
    start_idx=args.start_idx)
TypeError: __init__() got an unexpected keyword argument 'is_sample'

To work around it, I commented out those arguments:

    train_dataset = Twitter_Dataset(args.dataset[0][1],
                                    split='train')
                                    # is_sample=args.is_sample,
                                    # sample_num=args.sample_num,
                                    # start_idx=args.start_idx)

I commented them out because Twitter_Dataset's __init__ does not accept is_sample, sample_num, or start_idx. Why are these three arguments being passed?

class Twitter_Dataset(data.Dataset):
    def __init__(self, infos, split):
        self.infos = json.load(open(infos, 'r'))

2. After that change, a different problem appeared:

Traceback (most recent call last):
  File "twitter_sc_training.py", line 454, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 205, in main
    res_dev = eval_utils.eval(args, model, dev_loader, metric, device)
  File "/lyldata/VLP-MABSA-2/src/eval_utils.py", line 18, in eval
    for key, value in batch['TWITTER_SC'].items()
KeyError: 'TWITTER_SC'

It looks like collation.py and tokenization_new.py are missing the relevant code? Could you add that part?

lyhuohuo commented 1 year ago

Those are parameters I added when running the few-shot experiments. I'm very sorry the code was not fully updated; I will update it within the next couple of days.
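
As a rough illustration of what these parameters are for (a sketch only, assuming the split annotations load as a list; the actual updated code may differ):

    import json
    from torch.utils import data

    class Twitter_Dataset(data.Dataset):
        def __init__(self, infos, split,
                     is_sample=False, sample_num=None, start_idx=0):
            self.infos = json.load(open(infos, 'r'))
            self.split = split
            if is_sample and sample_num is not None:
                # Few-shot setting: keep only sample_num examples,
                # starting at start_idx, so repeated runs can draw
                # different subsets of the training split.
                self.infos = self.infos[start_idx:start_idx + sample_num]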

winder-source commented 1 year ago

OK! I'll wait for your update.

lyhuohuo commented 1 year ago

The update is complete.

winder-source commented 1 year ago

Traceback (most recent call last):
  File "twitter_sc_training.py", line 450, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 80, in main
    tokenizer = ConditionTokenizer(args=args)
  File "/lyldata/VLP-MABSA-2/src/data/tokenization_new.py", line 43, in __init__
    pretrained_model_name, )
  File "/root/anaconda3/envs/VLP-MABSA-env/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1591, in from_pretrained
    list(cls.vocab_files_names.values()),
OSError: Model name './E2E-MABSA' was not found in tokenizers model name list (facebook/bart-base, facebook/bart-large, facebook/bart-large-mnli, facebook/bart-large-cnn, facebook/bart-large-xsum, yjernite/bart_eli5). We assumed './E2E-MABSA' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url.

lyhuohuo commented 1 year ago

Change the path here to facebook/bart-base, or download the bart-base model files from Hugging Face and set the path to the downloaded directory.
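
For example (the local path below is a placeholder):

    from transformers import BartTokenizer

    # Either load directly from the Hugging Face hub...
    tokenizer = BartTokenizer.from_pretrained('facebook/bart-base')

    # ...or point to a local directory containing the downloaded
    # model files (vocab.json and merges.txt for the tokenizer).
    tokenizer = BartTokenizer.from_pretrained('/path/to/bart-base')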

winder-source commented 1 year ago

OK, thanks!

NanZhang257 commented 5 months ago

Traceback (most recent call last):
  File "twitter_sc_training.py", line 450, in <module>
    main(0, args)
  File "twitter_sc_training.py", line 80, in main
    tokenizer = ConditionTokenizer(args=args)
  File "/lyldata/VLP-MABSA-2/src/data/tokenization_new.py", line 43, in __init__
    pretrained_model_name, )
  File "/root/anaconda3/envs/VLP-MABSA-env/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1591, in from_pretrained
    list(cls.vocab_files_names.values()),
OSError: Model name './E2E-MABSA' was not found in tokenizers model name list (facebook/bart-base, facebook/bart-large, facebook/bart-large-mnli, facebook/bart-large-cnn, facebook/bart-large-xsum, yjernite/bart_eli5). We assumed './E2E-MABSA' was a path, a model identifier, or url to a directory containing vocabulary files named ['vocab.json', 'merges.txt'] but couldn't find such vocabulary files at this path or url.

Hello, has this problem been solved?