Open dvarelas opened 1 year ago
Here's the suspect file: https://github.com/dvarelas/Linguini/blob/master/linguini/preprocessing/encoders.py
and the suspect function:
def transform(self, sentences):
    """
    Transforms sentences by indexing them

    :param sentences: Array of sentences
    :return: Indexed sentences
    """
    max_len = self.fit(sentences)
    all_tokens = []
    all_masks = []
    all_segments = []
    for text in sentences:
        # Leave room for the [CLS] and [SEP] markers
        text = text[:max_len - 2]
        input_sequence = ['[CLS]'] + text + ['[SEP]']
        pad_len = max_len - len(input_sequence)
        tokens = self.tokenizer.convert_tokens_to_ids(input_sequence)
        tokens += [0] * pad_len
        pad_masks = [1] * len(input_sequence) + [0] * pad_len
        segment_ids = [0] * max_len
        all_tokens.append(tokens)
        all_masks.append(pad_masks)
        all_segments.append(segment_ids)
    return [np.array(all_tokens), np.array(all_masks), np.array(all_segments)]
The error:
  line 52, in transform
    for text in sentences: text = text[:max_len - 2]
TypeError: 'NoneType' object is not subscriptable
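
A note on reading this error: Python evaluates the slice bound `max_len - 2` before it subscripts `text`, so a `None` value for `max_len` would fail earlier with an "unsupported operand type(s)" message. The "not subscriptable" message therefore suggests that one of the entries in `sentences` is itself `None`. Below is a minimal sketch for checking that, assuming the input is a list of token lists. The helper names `find_missing` and `drop_missing` are hypothetical and not part of Linguini.

```python
def find_missing(sentences):
    """Return the positions of entries that are None (hypothetical helper)."""
    return [i for i, text in enumerate(sentences) if text is None]


def drop_missing(sentences):
    """Return a copy of sentences without the None entries (hypothetical helper)."""
    return [text for text in sentences if text is not None]


# Example input (made up for illustration): the second entry is missing.
sentences = [["hello", "world"], None, ["foo"]]

missing = find_missing(sentences)
cleaned = drop_missing(sentences)

# Reproduce the reported message by slicing a None entry directly.
text = sentences[1]
try:
    text[:3]
    message = ""
except TypeError as exc:
    message = str(exc)

print("Missing entries at positions:", missing)
print("Error when slicing a missing entry:", message)
```

If `find_missing` reports any positions on the real data, filtering or replacing those entries before calling `transform` should avoid the error; where the `None` values come from would still need to be traced upstream.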