dvarelas / Linguini

NLP framework built on top of TensorFlow

issue #29

Open dvarelas opened 1 year ago

dvarelas commented 1 year ago

line 52, in transform
    for text in sentences:
        text = text[:max_len - 2]
TypeError: 'NoneType' object is not subscriptable

fixobot-404[bot] commented 1 year ago

Here's the suspect file: https://github.com/dvarelas/Linguini/blob/master/linguini/preprocessing/encoders.py


and the suspect function:

def transform(self, sentences):
    """
    Transforms sentences by indexing them

    :param sentences: Array of sentences
    :return: Indexed sentences
    """
    max_len = self.fit(sentences)
    all_tokens = []
    all_masks = []
    all_segments = []
    for text in sentences:
        # Line 52 in the traceback: raises TypeError when `text` is None
        text = text[:max_len - 2]
        # Reserve two positions for the [CLS] and [SEP] markers
        input_sequence = ['[CLS]'] + text + ['[SEP]']
        pad_len = max_len - len(input_sequence)
        tokens = self.tokenizer.convert_tokens_to_ids(input_sequence)
        tokens += [0] * pad_len
        # 1 marks a real token, 0 marks padding
        pad_masks = [1] * len(input_sequence) + [0] * pad_len
        segment_ids = [0] * max_len
        all_tokens.append(tokens)
        all_masks.append(pad_masks)
        all_segments.append(segment_ids)
    return [np.array(all_tokens), np.array(all_masks), np.array(all_segments)]
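
The error message says the object being sliced is `None`, so at least one element of `sentences` appears to be `None` when `transform` reaches it. (If `max_len` were `None`, the failure would come from the subtraction, not the slice.) Below is a minimal sketch of a guard that could run before the loop. `clean_sentences` and `pad_with_special_tokens` are hypothetical helpers written for illustration and are not part of Linguini. The sketch assumes each sentence is already a list of token strings, which the `['[CLS]'] + text` concatenation implies.

```python
def clean_sentences(sentences, on_missing="raise"):
    """Hypothetical helper (not in Linguini): reject or replace None sentences.

    on_missing="raise" fails fast and names the offending index;
    any other value treats a missing sentence as an empty token list.
    """
    cleaned = []
    for i, tokens in enumerate(sentences):
        if tokens is None:
            if on_missing == "raise":
                raise ValueError(
                    f"sentences[{i}] is None; expected a list of tokens"
                )
            tokens = []
        cleaned.append(list(tokens))
    return cleaned


def pad_with_special_tokens(tokens, max_len):
    """Hypothetical stand-in for the truncate-and-mask step of transform."""
    tokens = tokens[:max_len - 2]  # leave room for [CLS] and [SEP]
    sequence = ["[CLS]"] + tokens + ["[SEP]"]
    pad_len = max_len - len(sequence)
    mask = [1] * len(sequence) + [0] * pad_len
    return sequence, mask


# Usage: a missing sentence becomes an empty list instead of crashing.
safe = clean_sentences([["hello", "world"], None], on_missing="empty")
sequence, mask = pad_with_special_tokens(safe[1], max_len=4)
```

Whether to raise or substitute an empty sentence is a design choice. Raising surfaces bad input early, while substituting keeps the output arrays aligned with the input rows.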