bjascob / amrlib

A python library that makes AMR parsing, generation and visualization simple.
MIT License

newly trained parse_xfm_bart models not generating first character #50

Closed: bjascob closed this issue 2 years ago

bjascob commented 2 years ago

For newly trained parse_xfm models using bart-X, the first character is missing from the raw graph output. This does not affect the released models. I've confirmed it only on bart-base; it likely also affects bart-large, but not the t5 models.

The issue is caused by a change to the huggingface model config.json file: it now contains the line "forced_bos_token_id": 0, which changes generation behavior. The config change happened around the end of February 2022 and appears to stem from this issue: https://github.com/huggingface/transformers/issues/15559.

Adding {..., "forced_bos_token_id": null} to the config's model_args section appears to fix this. The fix still needs to be tested with both bart models, and t5-base needs to be verified to work correctly without any changes.
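For reference, a sketch of what the override in the training config would look like (other model_args keys omitted here; the actual file contains additional settings):

```json
{
    "model_args": {
        "forced_bos_token_id": null
    }
}
```

Setting the value to null prevents transformers from forcing token id 0 as the first generated token, which is what was swallowing the leading character.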

Note that the bug is easy to miss: the first character is always a "(" for every graph, and the deserializer can generally handle the missing opening paren without issue. The bug may therefore produce no visible changes while still shifting the smatch score by a very small amount.
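As a minimal sketch of the workaround the deserializer effectively applies, a hypothetical helper (not part of amrlib's API) could restore the dropped paren before parsing:

```python
def restore_leading_paren(graph_str: str) -> str:
    """Re-add the opening paren that the affected bart models drop.

    Every serialized AMR graph begins with '(', so if the model's raw
    output is missing it, prepending one recovers a parseable string.
    (Hypothetical workaround sketch, not amrlib code.)
    """
    s = graph_str.lstrip()
    return s if s.startswith("(") else "(" + s
```

A graph emitted as `z0 / want-01 :ARG0 (z1 / boy))` would come back with its leading paren restored, while already-correct output passes through unchanged.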

bjascob commented 2 years ago

Bart configs were updated in commit https://github.com/bjascob/amrlib/commit/f25fe16998f1c14879e793a225c0174f4d7ac4f3. T5-base works with no changes. Re-trained bart-X and T5 models yield the following (the new models were not uploaded)...

Training Results

All models trained with transformers 4.20.0 on 6/24/2022. Scores are without :wiki tags.