Error with pretrained vocab

XinhaoLi74 / SmilesPE

SMILES Pair Encoding: A data-driven substructure representation of chemicals

https://xinhaoli74.github.io/SmilesPE/

Apache License 2.0

Error with pretrained vocab #10

Open MohaiminDev opened 2 years ago

MohaiminDev commented 2 years ago

When I run `spe_vob = codecs.open('../../data/processed/pretrained_tokenizer/SPE_ChEMBL.txt')` followed by `spe = SPE_Tokenizer(spe_vob)`, I get the following error:

Error: invalid line 1 in BPE codes file: The line should exist of exactly two subword units, separated by whitespace
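One way to narrow this down is to scan the vocabulary file for lines that break the rule quoted in the error message (exactly two subword units separated by whitespace). The sketch below is only a diagnostic aid. The helper name `find_invalid_merge_lines` is hypothetical and not part of SmilesPE, and splitting on arbitrary whitespace is an assumption that may not match the tokenizer's own parsing exactly.

```python
from typing import Iterable, List, Tuple


def find_invalid_merge_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that does not contain
    exactly two whitespace-separated tokens.

    Hypothetical helper: it approximates the rule stated in the error
    message and is not taken from the SmilesPE source.
    """
    invalid = []
    for number, raw in enumerate(lines, start=1):
        # Drop line endings and surrounding spaces before counting tokens.
        text = raw.strip("\r\n ")
        if len(text.split()) != 2:
            invalid.append((number, text))
    return invalid


# Small in-memory example: line 3 is empty and line 4 has a third field.
sample = ["C C", "c c", "", "C C 42"]
print(find_invalid_merge_lines(sample))
```

To check the real file, pass an open file handle instead of the sample list, for example `open('../../data/processed/pretrained_tokenizer/SPE_ChEMBL.txt', encoding='utf-8')`. Since the error points at line 1, the first reported entry is the one to inspect.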