Hello,

I am trying to generate the atom-atom mapping (AAM) for the reaction https://www.rhea-db.org/rhea/56485.

The tokenizer length comes out to be 504, but I still get an error stating: "Token indices sequence length is longer than the specified maximum sequence length for this model (513 > 512). Running this sequence through the model will result in indexing errors."

Could someone please check why the token index length is 513 and not 504?
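For reference, a minimal sketch of how such a token count can be checked. It assumes the underlying Hugging Face tokenizer is exposed as the RXNMapper instance's `tokenizer` attribute (an assumption; the original post does not show the measurement code) and uses a placeholder for the reaction SMILES:

```python
from rxnmapper import RXNMapper

mapper = RXNMapper()

# Placeholder: substitute the actual reaction SMILES from RHEA:56485.
rxn = "..."

# Count the tokens of the raw (un-canonicalized) reaction SMILES.
# Assumes the Hugging Face tokenizer is exposed as `mapper.tokenizer`.
n_tokens = len(mapper.tokenizer(rxn)["input_ids"])
print(n_tokens)  # reported as 504 for the raw string
```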
Hi @parit,

The reason is that the reaction SMILES is canonicalized first, which increases the number of tokens to 513, just past the model's 512-token limit. You will see that

mapped = mapper.get_attention_guided_atom_maps([smiles], canonicalize_rxns=False)

does not fail. In general, however, it is advisable to leave canonicalization on.
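For completeness, a minimal runnable sketch of the workaround above, assuming the rxnmapper package (where `mapper` is an `RXNMapper` instance) and again using a placeholder for the reaction SMILES:

```python
from rxnmapper import RXNMapper

mapper = RXNMapper()

# Placeholder: substitute the actual reaction SMILES from RHEA:56485.
smiles = "..."

# With the default canonicalize_rxns=True, this reaction tokenizes to
# 513 tokens and exceeds the model's 512-token limit:
# mapped = mapper.get_attention_guided_atom_maps([smiles])

# Skipping canonicalization keeps the sequence at 504 tokens:
mapped = mapper.get_attention_guided_atom_maps([smiles], canonicalize_rxns=False)
print(mapped[0]["mapped_rxn"], mapped[0]["confidence"])
```

The call returns a list with one dict per input reaction, with the mapped reaction SMILES under `mapped_rxn` and a confidence score under `confidence`.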