process more SELFIES successfully, minor improvements to predictions

SELFIES preprocessing failed for a substantial part of the dataset (9,509 molecules in chebi_v231) because some SMILES features are not covered by the selfies library. I added RDKit normalisation of the SMILES as a fallback: if the direct translation to SELFIES fails, the SMILES is normalised with RDKit and the translation is retried. With this change, preprocessing only fails for 151 molecules.
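
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names `smiles_to_selfies`, `selfies_encode` and `rdkit_normalize` are hypothetical, and "normalisation" is assumed here to mean a round trip through an RDKit molecule to canonical SMILES.

```python
from typing import Callable, Optional


def selfies_encode(smiles: str) -> str:
    # Direct translation; selfies raises an error for unsupported SMILES.
    import selfies as sf
    return sf.encoder(smiles)


def rdkit_normalize(smiles: str) -> Optional[str]:
    # Assumption: normalisation = parse with RDKit, write canonical SMILES.
    from rdkit import Chem
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # RDKit could not parse the input
        return None
    return Chem.MolToSmiles(mol)


def smiles_to_selfies(
    smiles: str,
    encode: Callable[[str], str] = selfies_encode,
    normalize: Callable[[str], Optional[str]] = rdkit_normalize,
) -> Optional[str]:
    """Try the direct translation first, then retry on the normalised SMILES."""
    try:
        return encode(smiles)
    except Exception:
        pass  # fall through to the normalisation fallback
    normalized = normalize(smiles)
    if normalized is None:
        return None
    try:
        return encode(normalized)
    except Exception:
        return None  # molecule still cannot be processed
```

A return value of `None` corresponds to a molecule for which preprocessing still fails after the fallback.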
Fixes for prediction generation:
- The last (incomplete) batch of the dataset was previously lost; it is now saved as well
- The size of the saved prediction files remains constant (independent of the batch size used)
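
One way to get both properties is to buffer predictions and emit fixed-size chunks regardless of how the batches arrive. The sketch below is only an illustration under that assumption; `rechunk` and `rows_per_file` are hypothetical names, not identifiers from this repository.

```python
from typing import Iterable, Iterator, List, Sequence


def rechunk(batches: Iterable[Sequence], rows_per_file: int) -> Iterator[List]:
    """Regroup per-batch predictions into chunks of a fixed size.

    Each yielded chunk would be written to one file. The chunk size does not
    depend on the batch size, and the final partial chunk is yielded too.
    """
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    buffer: List = []
    for batch in batches:
        buffer.extend(batch)
        # Emit as many full chunks as the buffer currently holds.
        while len(buffer) >= rows_per_file:
            yield buffer[:rows_per_file]
            buffer = buffer[rows_per_file:]
    if buffer:
        # Remainder from the last (incomplete) batch is not dropped.
        yield buffer
```

For example, seven predictions arriving in batches of three or in batches of four are split into the same chunks when `rows_per_file` is 2.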
Some tokens have been added (from graph datasets and new ChEBI versions)

ChEB-AI / python-chebai