utterworks / fast-bert

Super easy library for BERT based NLP models
Apache License 2.0

ValueError: Stop argument for islice() must be None or an integer: 0 <= x <= sys.maxsize. #285

Open jayralencar opened 3 years ago

jayralencar commented 3 years ago

Hi,

I am getting ValueError: Stop argument for islice() must be None or an integer: 0 <= x <= sys.maxsize.

My code:

databunch_lm = BertLMDataBunch.from_raw_corpus(
    data_dir=DATA_PATH,
    text_list = examples,
    tokenizer = loaded_tokenizer,
    batch_size_per_gpu=128,
    max_seq_length=32,
    model_type="bert",
    multi_gpu=False,
    logger=logger,
    test_size=0.01
)

Stack trace:

ValueError                                Traceback (most recent call last)
<ipython-input-22-ed19e05ac7d1> in <module>()
      9     multi_gpu=False,
     10     logger=logger,
---> 11     test_size=0.01
     12 )

3 frames
/usr/local/lib/python3.7/dist-packages/fast_bert/data_lm.py in from_raw_corpus(data_dir, text_list, tokenizer, batch_size_per_gpu, max_seq_length, multi_gpu, test_size, model_type, logger, clear_cache, no_cache)
    207             model_type=model_type,
    208             logger=logger,
--> 209             clear_cache=clear_cache,
    210             no_cache=no_cache,
    211         )

/usr/local/lib/python3.7/dist-packages/fast_bert/data_lm.py in __init__(self, data_dir, tokenizer, train_file, val_file, batch_size_per_gpu, max_seq_length, multi_gpu, model_type, logger, clear_cache, no_cache)
    279                 train_filepath,
    280                 cached_features_file,
--> 281                 self.logger,
    282                 block_size=self.tokenizer.max_len_single_sentence,
    283             )

/usr/local/lib/python3.7/dist-packages/fast_bert/data_lm.py in __init__(self, tokenizer, file_path, cache_path, logger, block_size)
    151             text = itertools.chain.from_iterable(text)
    152             text = more_itertools.chunked(text, block_size)
--> 153             self.examples = list(text)[:-1]
    154             # Note that we are loosing the last truncated example here for the sake of simplicity (no padding)

/usr/local/lib/python3.7/dist-packages/more_itertools/recipes.py in take(n, iterable)
     71 
     72     """
---> 73     return list(islice(iterable, n))
     74 
     75 

ValueError: Stop argument for islice() must be None or an integer: 0 <= x <= sys.maxsize.

wikd13 commented 3 years ago

Same for me. Have you found any way to solve it?

az7dev commented 3 years ago

Wrap the value in int() (presumably the block_size that ends up in more_itertools.chunked).
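
To illustrate the int() suggestion, here is a minimal sketch using only the standard library. It assumes the error comes from the block size reaching islice() either as a non-integer (for example a float) or as an integer larger than sys.maxsize; the thread does not confirm which case applies. The helper name safe_block_size and its fallback of 512 are hypothetical and are not part of fast-bert.

```python
import sys
from itertools import islice


def safe_block_size(value, fallback=512):
    """Coerce a block size into something islice() accepts.

    Hypothetical helper: `fallback` is an assumed default, used when the
    value is not in the range 1..sys.maxsize.
    """
    size = int(value)  # e.g. 4.0 -> 4
    if not 0 < size <= sys.maxsize:
        size = fallback
    return size


tokens = list(range(10))

# A float stop value triggers the ValueError seen in the traceback.
try:
    list(islice(tokens, 4.0))
except ValueError as err:
    print("float stop:", err)

# An integer above sys.maxsize triggers the same ValueError,
# and int() alone does not help in that case.
try:
    list(islice(tokens, int(1e30)))
except ValueError as err:
    print("huge stop:", err)

# After coercion the call succeeds.
first_block = list(islice(tokens, safe_block_size(4.0)))
print(first_block)
```

If the value is merely a float, int() is enough. If it is an out-of-range integer, it also has to be capped, which is why the sketch falls back to a fixed size.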