Closed SreehariSankar closed 1 year ago
Hi, we will update the code later. As an interim solution, you can take the files in the example directory as train/validation data.
Thanks! Just one last question: could you let me know how to tokenize a pre-processed AMR graph so that I can feed it directly to the BART model? You can assume everything else is ready (BART model, tokenizer, etc.). Should I just treat the AMR graph as a string, pass it through the tokenizer, and then feed the result to BART? Thanks! Much appreciated.
Hi, we implemented an `AMRBartTokenizer` to tokenize pre-processed AMRs. To use it, you can do the following:
```python
# AMRBartTokenizer is provided in the AMRBART repository
tokenizer = AMRBartTokenizer.from_pretrained("facebook/bart-large")

# Wrap the linearized AMR in the AMR-specific BOS/EOS tokens,
# truncating so the total length stays within max_src_length
amr_ids = (
    [tokenizer.amr_bos_token_id]
    + tokenizer.tokenize_amr(amr_string.split())[:max_src_length - 2]
    + [tokenizer.amr_eos_token_id]
)
```
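The framing logic above (AMR-specific BOS, truncation to the length budget, then AMR-specific EOS) can be sketched in plain Python. The token IDs below are made-up stand-ins, not the real vocabulary indices:

```python
# Hypothetical stand-ins for tokenizer.amr_bos_token_id / amr_eos_token_id
AMR_BOS_ID = 36600  # illustrative value only
AMR_EOS_ID = 36601  # illustrative value only

def frame_amr_ids(token_ids, max_src_length):
    """Wrap tokenized AMR IDs in AMR BOS/EOS, keeping the total
    sequence length at most max_src_length (two slots are reserved
    for the BOS/EOS tokens themselves, hence the -2)."""
    return [AMR_BOS_ID] + token_ids[:max_src_length - 2] + [AMR_EOS_ID]

# A "tokenized AMR" of 10 dummy IDs with a budget of 8 positions:
ids = frame_amr_ids(list(range(10)), max_src_length=8)
assert len(ids) == 8  # never exceeds the budget
assert ids[0] == AMR_BOS_ID and ids[-1] == AMR_EOS_ID
```

Short inputs pass through untruncated; only the BOS/EOS framing is added.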
If you use AMRBART, please follow the steps here to add the special tokens.
Thanks! All the best with your ACL 2022!
This bug has been fixed.
To run inference (say, AMR-->Text), train, test, and validation sets are all required. Please provide a way to run the model on a single pre-processed text file without needing all of this data.