goodbai-nlp / AMRBART

Code for our paper "Graph Pre-training for AMR Parsing and Generation" in ACL2022
MIT License

Generate random AMRs #25

Closed xiulinyang closed 5 months ago

xiulinyang commented 5 months ago

Hi! I was trying to use the checkpoint to run inference on my own data, but I find that the model sometimes generates strange AMRs like this:

 ( <pointer:0> console-01 :ARG0 ( <pointer:1> bug :ARG1-of ( <pointer:2> chase-01 :ARG0 ( <pointer:3> child :ARG0-of ( <pointer:4> nervous-01 ) ) :time ( <pointer:5> month :mod ( <pointer:6> last ) ) ) :mod <pointer:3> ) :ARG1 <pointer:3> )</AMR> ( <pointer:7> kid ) )</AMR></AMR></AMR> )</AMR>-of ( <pointer:8> consol-01 :ARG0 <pointer:1> :ARG1 <pointer:7> )</AMR> :time ( <pointer:9> month :mod <pointer:6> ) :mod ( <pointer:10> last ))</AMR></AMR> :ARG0 ( <pointer:11> i ) :ARG1 ( <pointer:12> person :wiki - :name ( <pointer:13> name :op1 <lit> The </lit> :op2 <lit> Big </lit> :op3 <lit> Bad </lit> :op4 <lit> Old </lit> :op5 <lit> One </lit> ) ) :ARG2 <pointer:7> :mod ( <pointer:14> only ) ) :ARG1-of <pointer:8> )</AMR> :ARG0 <pointer:11></AMR></AMR> :ARG3 ( <pointer:15> comfort-01 :ARG0 <pointer:11> :ARG1 <pointer:7> :time ( <pointer:16> now ) ) :op2 ( <pointer:17> next ) ) :ARG3-of ( <pointer:18> console-01 :ARG1 <pointer:7></AMR> ) :time <pointer:9> )</AMR> <lit></AMR></AMR> :time <pointer:16></AMR></AMR> :mod ( <pointer:19> today ) ) :ARG0-of</AMR></AMR> :ARG1 (</AMR></AMR> :ARG2 ( <pointer:69></AMR> :ARG0</AMR> :ARG1</AMR> ) :ARG3 (</AMR> now )</AMR> ) ) :quant</AMR></AMR> :op2</AMR></AMR>lic</AMR></AMR> :medium</AMR></AMR> :op3</AMR></AMR>BN</AMR></AMR> </lit></AMR></AMR> :frequency</AMR></AMR> :quant</AMR> ) :ARG0</AMR></AMR> <lit> )</AMR> :mod</AMR></AMR>now</AMR></AMR>roid</AMR></AMR>icks</AMR></AMR>oner</AMR></AMR> :instrument</AMR></AMR>yan</AMR></AMR>ably</AMR></AMR> privately</AMR></AMR>throp</AMR></AMR> :domain</AMR></AMR> :duration (</AMR> :time</AMR></AMR>Now</AMR></AMR> seconds</AMR></AMR> flesh</AMR></AMR>chen</AMR></AMR>B</AMR></AMR>illion</AMR></AMR>');</AMR></AMR> collectively</AMR></AMR> weekday</AMR></AMR> :consist-of</AMR></AMR>olt</AMR></AMR>new</AMR></AMR>ankind</AMR></AMR>));</AMR></AMR>Gener</AMR></AMR> hardcore</AMR></AMR> Blackburn</AMR></AMR> November</AMR></AMR>enna</AMR></AMR>cester</AMR></AMR>Face</AMR></AMR>");</AMR></AMR> 
Nov</AMR></AMR>neck</AMR></AMR>'),</AMR></AMR>UE</AMR></AMR>ainer</AMR></AMR>min</AMR></AMR>athi</AMR></AMR>gas</AMR></AMR>BC</AMR></AMR>aman</AMR></AMR>Sing</AMR></AMR>be</AMR></AMR> coral</AMR></AMR>fer</AMR></AMR>lar</AMR></AMR> have</AMR></AMR>Beg</AMR></AMR> Bearing</AMR></AMR>Proof</AMR></AMR>can</AMR></AMR> now</AMR></AMR>Two</AMR></AMR>bin</AMR></AMR>Be</AMR></AMR>external</AMR></AMR>semb</AMR></AMR>among</AMR></AMR>christ</AMR></AMR>Having</AMR></AMR>bling</AMR></AMR> Weeks</AMR></AMR>other</AMR></AMR> having</AMR></AMR> Citizen</AMR></AMR>tex</AMR></AMR>liction</AMR></AMR> Kimmel</AMR></AMR>deg</AMR></AMR> Various</AMR></AMR> Liter</AMR></AMR>ening</AMR></AMR> whisk</AMR></AMR> counted</AMR></AMR> Nature</AMR></AMR> Parenthood</AMR></AMR>ched</AMR></AMR>bearing</AMR></AMR> Having</AMR></AMR>ching</AMR></AMR>gin</AMR></AMR>oys</AMR></AMR>raise</AMR></AMR>che</AMR></AMR>animate</AMR></AMR>having</AMR></AMR> The</AMR></AMR> denote</AMR></AMR> Memorial</AMR></AMR>anthrop</AMR></AMR> Licensed</AMR></AMR> differences</AMR></AMR> euphem</AMR></AMR> fung</AMR></AMR>licted</AMR></AMR>atars</AMR></AMR>becue</AMR></AMR> Mens</AMR></AMR>add</AMR></AMR> enlisted</AMR></AMR> of</AMR></AMR>asion</AMR></AMR>chester</AMR></AMR>equipped</AMR></AMR>ometime</AMR></AMR> being</AMR></AMR>gener</AMR></AMR>Building</AMR></AMR>World</AMR></AMR> Motorsport</AMR></AMR>some</AMR></AMR> Roads</AMR></AMR> Blood</AMR></AMR> Clockwork</AMR></AMR>gan</AMR></AMR>Central</AMR></AMR>raised</AMR></AMR> Summoner</AMR></AMR>of</AMR></AMR> Months</AMR></AMR>neys</AMR></AMR>iday</AMR></AMR>ueller</AMR></AMR> sometime</AMR></AMR>phalt</AMR></AMR>'d</AMR></AMR>starter</AMR></AMR> occasions</AMR></AMR> membership</AMR></AMR> be</AMR></AMR> City</AMR></AMR>uers</AMR></AMR> Animals</AMR></AMR>deen</AMR></AMR>match</AMR></AMR>world</AMR></AMR> Racing</AMR></AMR>idences</AMR></AMR>isc</AMR></AMR> starters</AMR></AMR>together</AMR></AMR>Several</AMR></AMR>Gen</AMR></AMR>earing</AMR></AMR> 
able</AMR></AMR> Scroll</AMR></AMR> Goo</AMR></AMR> buff</AMR></AMR> Wedding</AMR></AMR>�</AMR></AMR> must</AMR></AMR>Club</AMR></AMR> months</AMR></AMR> rubbing</AMR></AMR>go</AMR></AMR> accelerate</AMR></AMR>isson</AMR></AMR>acking</AMR></AMR>cellaneous</AMR></AMR> rake</AMR></AMR>Go</AMR></AMR> tongues</AMR></AMR>carry</AMR></AMR> donor</AMR></AMR> Were</AMR></AMR> Built</AMR></AMR> advertising</AMR></AMR>condition</AMR></AMR> lips</AMR></AMR>gal</AMR></AMR> Styles</AMR></AMR> existed</AMR></AMR> samples</AMR></AMR>Demon</AMR></AMR>rounded</AMR></AMR>g</AMR></AMR> entitled</AMR></AMR> hired</AMR></AMR> classify</AMR></AMR>i</AMR></AMR> were</AMR></AMR> face</AMR></AMR> Organ</AMR></AMR>casting</AMR></AMR>making</AMR></AMR> Insurance</AMR></AMR>raising</AMR></AMR> listener</AMR></AMR> Food</AMR></AMR> chromos</AMR></AMR> fictional</AMR></AMR> pouring</AMR></AMR> generating</AMR></AMR>soc</AMR></AMR>el</AMR></AMR> making</AMR></AMR>Mania</AMR></AMR>asons</AMR></AMR>cies</AMR></AMR> Soc</AMR></AMR> Bing</AMR></AMR> joining</AMR></AMR> fictitious</AMR></AMR>buy</AMR></AMR>chet</AMR></AMR> affect</AMR></AMR>M</AMR></AMR> innocuous</AMR></AMR>bert</AMR></AMR> stim</AMR></AMR> Roof</AMR></AMR>called</AMR></AMR>connect</AMR></AMR> classified</AMR></AMR> Alright</AMR></AMR> rabid</AMR></AMR> become</AMR></AMR>ass</AMR></AMR>eding</AMR></AMR> congregation</AMR></AMR> facial</AMR></AMR> catering</AMR></AMR>aha</AMR></AMR>giving</AMR></AMR> match</AMR></AMR>asses</AMR></AMR>make</AMR></AMR> Action</AMR></AMR> some</AMR></AMR>air</AMR></AMR> Club</AMR></AMR> Society</AMR></AMR>oir</AMR></AMR>onga</AMR></AMR> kickoff</AMR></AMR> forming</AMR></AMR>dat</AMR></AMR> January</AMR></AMR>inski</AMR></AMR>Names</AMR></AMR> signatures</AMR></AMR> bond</AMR></AMR>connection</AMR></AMR> ticking</AMR></AMR> December</AMR></AMR>ong</AMR></AMR>pie</AMR></AMR> cards</AMR></AMR> fundraising</AMR></AMR> organs</AMR></AMR> cannibal</AMR></AMR> cater</AMR></AMR> occur</AMR></AMR> 
Gen</AMR></AMR>red</AMR></AMR> Regulatory</AMR></AMR>rag</AMR>

Is this normal? FYI, this is the code I used to generate the AMRs:

import argparse
from pathlib import Path

from tqdm import tqdm
from transformers import BartForConditionalGeneration

from model_interface.tokenization_bart import AMRBartTokenizer

parser = argparse.ArgumentParser(description='Parse sentences into AMR graphs.')
parser.add_argument('--input', type=str,
                    help='input document, one sentence per line')
parser.add_argument('--output', type=str, help='the document to save')
args = parser.parse_args()

# Load tokenizer and model
model = BartForConditionalGeneration.from_pretrained("xfbai/AMRBART-large-finetuned-AMR3.0-AMRParsing-v2")
tokenizer = AMRBartTokenizer.from_pretrained("xfbai/AMRBART-large-finetuned-AMR3.0-AMRParsing-v2")
max_length = model.config.max_length
print(max_length)  # prints 20 for this checkpoint

input_sents = Path(args.input).read_text().strip().split('\n')
with open(args.output, 'w') as pred:
    for sent in tqdm(input_sents):
        # Encode the raw sentence directly, without the repository's preprocessing
        input_ids = tokenizer.encode(sent, return_tensors="pt")
        # Override the default generation length (20) from model.config
        output = model.generate(input_ids, max_length=1024)
        amr_graph = tokenizer.decode(output[0], skip_special_tokens=True)
        pred.write(f'{amr_graph}\n\n')

I found that max_length in model.config is 20, so I manually set it to 1024. Thanks!

goodbai-nlp commented 5 months ago

Hi, @xiulinyang

It is better to follow the instructions here to run inference on your own data, so that the output quality can be ensured.

xiulinyang commented 5 months ago

Hi! Thanks for your clarification. I read previous issues and somehow I thought I could just load the huggingface model. Now it works. Many thanks again!
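
For anyone who still runs a custom loop like the one in the first comment, a small sanity check can flag degenerate outputs such as the one quoted above. The sketch below is not part of AMRBART. It assumes that `</AMR>` marks the end of a linearized graph (as the quoted output suggests) and that a usable graph has balanced parentheses. The names `END_MARKER`, `truncate_at_end_marker`, `parens_balanced`, and `check_output` are hypothetical helpers introduced only for illustration.

```python
from typing import Tuple

# Assumption: "</AMR>" marks the end of a linearized graph in the decoded text.
END_MARKER = "</AMR>"


def truncate_at_end_marker(text: str, marker: str = END_MARKER) -> str:
    """Keep only the text before the first end marker."""
    head, _, _ = text.partition(marker)
    return head.strip()


def parens_balanced(text: str) -> bool:
    """Return True if every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def check_output(raw: str) -> Tuple[str, bool]:
    """Truncate a decoded string and report whether it looks well formed."""
    graph = truncate_at_end_marker(raw)
    return graph, bool(graph) and parens_balanced(graph)


if __name__ == "__main__":
    good = "( <pointer:0> want-01 :ARG0 ( <pointer:1> boy ) )</AMR>"
    bad = "( <pointer:0> console-01 :ARG0 ( <pointer:1> bug )</AMR> ( <pointer:7> kid ) )"
    for raw in (good, bad):
        graph, ok = check_output(raw)
        print("OK " if ok else "BAD", graph)
```

This check only catches structural breakage. It ignores parentheses that may legitimately appear inside `<lit> ... </lit>` spans, and it says nothing about whether the graph is a faithful parse, so it does not replace the repository's own inference pipeline.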