keras-team / autokeras

AutoML library for deep learning
http://autokeras.com/
Apache License 2.0

Bug: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 205 when run under a Chinese path, but not under an English path #1887

Open huangliang0828 opened 1 year ago

huangliang0828 commented 1 year ago

Bug Description

My local Python files (mainly the ones that use autokeras) report the following error when they are run under a directory path containing Chinese characters, but work normally under an English-only directory path:

File D:\ProgramData\miniconda3\envs\autokeras_1\lib\site-packages\tensorflow\python\eager\execute.py:54 in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 205: invalid continuation byte

Datasets used: fetch_20newsgroups(subset="train", shuffle=True, random_state=42, categories=categories), imported with "from sklearn.datasets import fetch_20newsgroups". The same happens with other data sources such as tf.keras.utils.get_file("train.csv", TRAIN_DATA_URL), and even with mnist.load_data() and imdb.load_data().

I don't know what the problem is.

Setup Details

Include the details about the versions of:

Additional context

CMD code page setting: chcp 65001 (UTF-8) or 936 (GBK)
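
One possible reading of the error message, not confirmed anywhere in this report: the byte 0xd5 with "invalid continuation byte" is what you would typically see when text encoded in GBK (code page 936), such as a Chinese directory name, is decoded as UTF-8. The sketch below only illustrates that mismatch with the Python standard library. The path "D:\中文\project" and both helper names are hypothetical and made up for illustration; this does not touch TensorFlow or autokeras and does not prove where the bytes come from.

```python
import os


def is_ascii_path(path: str) -> bool:
    """Return True if the path contains only ASCII characters."""
    return path.isascii()


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if the raw bytes form valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # Hypothetical path, used only for illustration.
    example_path = "D:\\中文\\project"

    # The same path as GBK bytes (code page 936) and as UTF-8 bytes.
    gbk_bytes = example_path.encode("gbk")
    utf8_bytes = example_path.encode("utf-8")

    print("ASCII-only path:", is_ascii_path(example_path))
    print("GBK bytes decode as UTF-8:", decodes_as_utf8(gbk_bytes))
    print("UTF-8 bytes decode as UTF-8:", decodes_as_utf8(utf8_bytes))

    # Check the directory the script is actually running from.
    print("Current directory is ASCII-only:", is_ascii_path(os.getcwd()))
```

If the last line prints False for the failing setup and True for the working one, that would at least be consistent with the Chinese path being the trigger, which matches what is described above.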

All files run into the UnicodeDecodeError at the clf.fit(.....) call:

Search: Running Trial #1

Hyperparameter     |Value   |Best Value So Far
text_block_1/bl... |vanilla |?
........
optimizer          |adam    |?
learning_rate      |0.001   |?

Epoch 1/100
Traceback (most recent call last):

  Cell In[7], line 1
    clf.fit(doc_train, label_train,epochs=100, verbose=2)
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\tasks\text.py:160 in fit
    history = super().fit(
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\auto_model.py:292 in fit
    history = self.tuner.search(
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\engine\tuner.py:193 in search
    super().search(
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\keras_tuner\engine\base_tuner.py:179 in search
    results = self.run_trial(trial, *fit_args, **fit_kwargs)
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\keras_tuner\engine\tuner.py:304 in run_trial
    obj_value = self._build_and_fit_model(trial, *args, **copied_kwargs)
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\engine\tuner.py:101 in _build_and_fit_model
    _, history = utils.fit_with_adaptive_batch_size(
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\utils\utils.py:88 in fit_with_adaptive_batch_size
    history = run_with_adaptive_batch_size(
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\utils\utils.py:101 in run_with_adaptive_batch_size
    history = func(x=x, validation_data=validation_data, **fit_kwargs)
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\utils\utils.py:89 in <lambda>
    batch_size, lambda **kwargs: model.fit(**kwargs), **fit_kwargs
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\keras\utils\traceback_utils.py:67 in error_handler
    raise e.with_traceback(filtered_tb) from None
  File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\tensorflow\python\eager\execute.py:54 in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 205: invalid continuation byte