Babelscape / AMR-alignment

This is the repo for Cross-lingual AMR Aligner, a novel aligner for Abstract Meaning Representation (AMR) graphs that can scale cross-lingually, presented at ACL 2023.

The checkpoint model does not generate alignment #1

Open · xiulinyang opened this issue 11 months ago

xiulinyang commented 11 months ago

Hi, I tried to use the unguided checkpoint model and the command you provided to generate German AMR alignments, but it just copied the input into the prediction folder I set. Below is the log message.

Also, there might be a typo in the README: the version of transformers should be > 3.0.
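For anyone reproducing this, a trivial way to confirm which `transformers` version is actually installed in the environment (a minimal sketch, assuming a standard install):

```python
import transformers

# The README's version pin should read > 3.0 per the comment above.
print(transformers.__version__)
```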

Many thanks in advance!

```
/local/xiulyang/anaconda3/envs/amralignment/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:2211: FutureWarning: The `pad_to_max_length` argument is deprecated and will be removed in a future version, use `padding=True` or `padding='longest'` to pad to the longest sequence in the batch, or use `padding='max_length'` to pad to a max length. In this case, you can give a specific length with `max_length` (e.g. `max_length=45`) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
  warnings.warn(
WARNING:root:Invalid sentence!
  0%|          | 0/100 [00:00<?, ?it/s]
{'input_ids': tensor([
    [0, 407, 449, 20905, 3999, 939, 5039, 364, 4272, 748, 36767, 12128, 2156, 885, 324, 952, 4394, 1943, 5079, 8797, 1437, 1725, 997, 2156, 1076, 29, 1437, 1725, 28, 118, 4969, 19596, 1180, 506, 21251, 5689, 364, 5101, 2084, 16793, 2156, 30864, 8554, 34679, 1794, 821, 2753, 242, 2420, 90, 885, 6831, 242, 479, 2],
    [0, 4594, 9876, 605, 2723, 9306, 13235, 4832, 22, 9938, 20399, 90, 295, 1725, 1872, 479, 22, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 10915, 1725, 858, 13235, 364, 179, 1811, 2001, 479, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 8318, 2156, 19958, 1437, 1725, 2084, 1610, 2156, 16, 90, 70, 293, 842, 5039, 449, 22593, 479, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 9938, 16, 90, 14001, 30864, 1023, 27785, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 12611, 18965, 25666, 14839, 225, 1690, 12614, 620, 4279, 17487, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 20963, 449, 8615, 620, 4279, 69, 2156, 449, 459, 5101, 9508, 462, 17487, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
], device='cuda:0'),
 'attention_mask': tensor([
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
], device='cuda:0')}
/pytorch/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
['Ġ</s>', 'Ġ<s>', 'W', 'ir', 'Ġha', 'ben', 'Ġe', 'uch', 'Ġe', 'uch', 'Ġe', 'uch', 'Ġv', 'orst', 'ellen', 'Ġ,', 'Ġw', 'ie', 'ĠÃ', 'Ġ</s>']
  0%|          | 0/100 [00:00<?, ?it/s]
```
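Neither warning in this log is the root cause of the copied output, but both point at deprecated usage that is easy to update. A minimal sketch of the modern equivalents of the two deprecated patterns; the checkpoint name, `sentences`, and the tensors below are hypothetical placeholders for illustration, not code from this repo:

```python
import torch
from transformers import AutoTokenizer

# Hypothetical placeholder checkpoint, for illustration only.
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
sentences = ["Das ist ein Beispiel."]

# Instead of the deprecated pad_to_max_length=True, pass an explicit
# padding strategy: "longest" pads to the longest sequence in the batch.
batch = tokenizer(sentences, padding="longest", return_tensors="pt")

# Instead of dividing integer tensors with `/` (or torch.div), be explicit
# about which kind of division is intended:
a, b = torch.tensor([7]), torch.tensor([2])
floor_q = torch.floor_divide(a, b)  # same as a // b, floor division
true_q = torch.true_divide(a, b)    # true division, as in Python 3
```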

Carlosml26 commented 11 months ago

Hi there,

The issue is that the released checkpoint is the unguided model for AMR parsing in English (it is based on BART). To move to other languages, you need a checkpoint trained in a cross-lingual fashion on mBART. We will release it soon; sorry for any inconvenience.
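This also explains the behaviour in the log above: an English-only BART tokenizer shreds the German input into subwords it has never seen in context, so the model has nothing meaningful to align. For readers wondering what a cross-lingual setup looks like in general, here is a minimal sketch of loading a multilingual mBART backbone with `transformers`; the public `facebook/mbart-large-50` checkpoint and the `de_DE` language code are generic illustrations, not the repo's forthcoming aligner checkpoint:

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Generic public mBART-50 checkpoint, used here only for illustration;
# the repo's cross-lingual aligner checkpoint is not yet released.
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="de_DE"
)

# German input is now tokenized against a multilingual vocabulary,
# unlike the English BART tokenizer in the log above.
batch = tokenizer(["Das ist ein Beispiel."], return_tensors="pt")
outputs = model.generate(
    **batch, forced_bos_token_id=tokenizer.lang_code_to_id["de_DE"]
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```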