Open iarroyof opened 1 year ago
Which common sense data did you use to create setTrain.tsv?
With cn_kb.csv and cn_kb_s.csv I don't get any error.
Hi, I used what the README says:
set_train_modules.create_training_set(default=True,input_path="path/to/input/files/", common_sense_data=None)
transformer_predictor.train(dataFile="setTrain.tsv")
TypeError Traceback (most recent call last)
Cell In[4], line 1
----> 1 transformer_predictor.train(dataFile="setTrain.tsv")
File ~/anaconda3/envs/transferl/lib/python3.9/site-packages/historical_sources/transformer_predictor.py:87, in train(dataFile)
81 train_pairs = list(
82 map(functools.partial(
83 prepare_data), train_text))
84 test_pairs= list(
85 map(functools.partial(
86 prepare_data), test_text))
---> 87 train_in_texts = [pair[0] for pair in train_pairs]
88 train_out_texts = [pair[1] for pair in train_pairs]
90 input_vectorizer = layers.experimental.preprocessing.TextVectorization(
91 output_mode="int", max_tokens=max_features,
92 # ragged=False, # only for TF v2.7
93 output_sequence_length=sequence_length,
94 standardize=custom_standardization)
File ~/anaconda3/envs/transferl/lib/python3.9/site-packages/historical_sources/transformer_predictor.py:87, in <listcomp>(.0)
81 train_pairs = list(
82 map(functools.partial(
83 prepare_data), train_text))
84 test_pairs= list(
85 map(functools.partial(
86 prepare_data), test_text))
---> 87 train_in_texts = [pair[0] for pair in train_pairs]
88 train_out_texts = [pair[1] for pair in train_pairs]
90 input_vectorizer = layers.experimental.preprocessing.TextVectorization(
91 output_mode="int", max_tokens=max_features,
92 # ragged=False, # only for TF v2.7
93 output_sequence_length=sequence_length,
94 standardize=custom_standardization)
TypeError: 'NoneType' object is not subscriptable
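The traceback shows that at least one element of train_pairs is None, so prepare_data appears to return None for some lines of setTrain.tsv. A minimal sketch for locating those lines follows. It assumes, without confirmation from the library source, that each valid line holds exactly two tab-separated fields; prepare_data_sketch and split_valid_pairs are hypothetical stand-ins, not functions from historical_sources.

```python
from typing import List, Optional, Tuple

Pair = Tuple[str, str]


def prepare_data_sketch(line: str) -> Optional[Pair]:
    """Hypothetical stand-in for prepare_data.

    Assumption: a valid line is "input<TAB>output"; anything else yields None,
    which is the value that later triggers the TypeError on pair[0].
    """
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 2:
        return None
    return parts[0], parts[1]


def split_valid_pairs(lines: List[str]) -> Tuple[List[Pair], List[int]]:
    """Return the parsed pairs plus the indices of lines that failed to parse."""
    pairs: List[Pair] = []
    bad_indices: List[int] = []
    for idx, line in enumerate(lines):
        pair = prepare_data_sketch(line)
        if pair is None:
            bad_indices.append(idx)
        else:
            pairs.append(pair)
    return pairs, bad_indices


# Made-up sample lines, only to exercise the helper.
sample = ["subject predicate\tobject\n", "a line without a tab\n", "\n"]
pairs, bad = split_valid_pairs(sample)
print("valid pairs:", pairs)
print("indices of malformed lines:", bad)
```

Running the same kind of check over the real setTrain.tsv would show whether the file contains empty or single-column lines, which would be consistent with the error appearing only for some common sense data inputs.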
In [5]: !head setTrain.tsv
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
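The tokenizers message is a warning and is unrelated to the TypeError. In its full form it also suggests setting the TOKENIZERS_PARALLELISM environment variable explicitly. A minimal sketch, assuming it is placed in the first notebook cell so that it runs before anything forks a process (for example, before a `!head` shell command):

```python
import os

# Set this before the process forks; the first notebook cell is the simplest place.
# "false" disables tokenizers' internal parallelism and silences the warning.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```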