Open zhaisilong opened 2 months ago
from transformers import BertTokenizer, WordpieceTokenizer
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
tokenizer = BertTokenizer(vocab_file="vocab_bpe_300.txt", do_lower_case=False, do_basic_tokenize=False)
tokenizer.wordpiece_tokenizer = WordpieceTokenizer(vocab=tokenizer.vocab, unk_token=tokenizer.unk_token, max_input_chars_per_word=250)
tokenizer("CC(NC(=O)C(=O)NCCCCC#N)c1cccc(C(F)(F)F)c1")
{'input_ids': [2, 1, 3], 'token_type_ids': [0, 0, 0], 'attention_mask': [1, 1, 1]}
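For context, the output above contains only three ids, which is the shape one would expect if the whole SMILES string were mapped to a single unknown token between two special tokens. The following is a minimal pure-Python sketch of the greedy longest-match-first matching that WordPiece tokenizers use, written from scratch rather than copied from the library. The function name `wordpiece` and both toy vocabularies are hypothetical and do not reflect the contents of `vocab_bpe_300.txt`; the sketch only illustrates how one unmatched piece collapses a whole word to the unknown token, and is not a diagnosis of this issue.

```python
def wordpiece(word, vocab, unk_token="[UNK]", max_input_chars_per_word=100):
    """Greedy longest-match-first split of a single word (illustrative sketch)."""
    # Overly long words are mapped straight to the unknown token.
    if len(word) > max_input_chars_per_word:
        return [unk_token]

    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that do not start the word carry the "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # One unmatched position turns the whole word into the unknown token.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary with "##"-prefixed continuation pieces.
toy_vocab = {"CC", "##(", "##N", "##C", "##)", "(", "N"}
with_prefix = wordpiece("CC(NC)", toy_vocab)

# Hypothetical toy vocabulary without any "##" continuation pieces.
bare_vocab = {"CC", "(", "N", "C", ")"}
without_prefix = wordpiece("CC(NC)", bare_vocab)

print(with_prefix)
print(without_prefix)
```

With `do_basic_tokenize=False`, the input is only split on whitespace before this matching step, so a SMILES string without spaces is treated as one word. Whether that is what happens with the vocabulary file in this issue would need to be checked against the file itself.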
SPMM_models.py
print(text_input.input_ids[:4], prop[:4], text_input.input_ids.shape)
exit()