XinhaoLi74 / SmilesPE

SMILES Pair Encoding: A data-driven substructure representation of chemicals
https://xinhaoli74.github.io/SmilesPE/
Apache License 2.0
177 stars 30 forks

BasicSmilesTokenizer and atomwise_tokenizer outputs #21

Open IsraelAbebe opened 1 year ago

IsraelAbebe commented 1 year ago

Thanks for the resources. I was exploring the outputs of SMILES_SPE_Tokenizer, BasicSmilesTokenizer, and atomwise_tokenizer.

Are BasicSmilesTokenizer and atomwise_tokenizer only pre-tokenizers? If so, how can I add them to a tokenizer that supports encoding and decoding? They seem to just split the SMILES string rather than encode it.
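
To make the question concrete, here is a minimal sketch of what "adding a splitter to a tokenizer" could look like: a small wrapper that builds a vocabulary from split tokens and maps them to integer ids and back. Everything in it is an assumption for illustration, not part of SmilesPE: `split_smiles` is a regex stand-in for the library's splitters, and `VocabTokenizer`, `fit`, `encode`, and `decode` are hypothetical names. Any callable that takes a SMILES string and returns a list of tokens should fit in its place.

```python
import re

# Stand-in splitter (assumption): a commonly used atom-level SMILES regex.
# It is NOT taken from SmilesPE; swap in the library's own splitter if preferred.
SMILES_PATTERN = re.compile(
    r"(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|\/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def split_smiles(smiles):
    """Split a SMILES string into atom-level tokens (no ids involved)."""
    return SMILES_PATTERN.findall(smiles)


class VocabTokenizer:
    """Hypothetical wrapper: adds encode/decode on top of any splitter."""

    def __init__(self, splitter, unk_token="[UNK]"):
        self.splitter = splitter
        self.unk_token = unk_token
        self.token_to_id = {unk_token: 0}  # id 0 is reserved for unknowns
        self.id_to_token = [unk_token]

    def fit(self, corpus):
        """Assign an id to every distinct token seen in the corpus."""
        for smiles in corpus:
            for tok in self.splitter(smiles):
                if tok not in self.token_to_id:
                    self.token_to_id[tok] = len(self.id_to_token)
                    self.id_to_token.append(tok)
        return self

    def encode(self, smiles):
        """Split, then map each token to its id (unknown tokens map to 0)."""
        unk = self.token_to_id[self.unk_token]
        return [self.token_to_id.get(t, unk) for t in self.splitter(smiles)]

    def decode(self, ids):
        """Map ids back to tokens and join them into a SMILES string."""
        return "".join(self.id_to_token[i] for i in ids)


tokenizer = VocabTokenizer(split_smiles).fit(["CC(=O)Oc1ccccc1C(=O)O", "CCN"])
ids = tokenizer.encode("CC(=O)O")
print(ids)
print(tokenizer.decode(ids))
```

The vocabulary here is built from a toy two-molecule corpus; in practice it would come from the training set, and special tokens such as padding or start/end markers would need their own reserved ids.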