jingyaogong / minimind

[LLM] Train a small 26M-parameter GPT completely from scratch in 3 hours; a personal GPU is enough for both inference and training!
https://jingyaogong.github.io/minimind
Apache License 2.0
2.7k stars 329 forks

Error when running python data_process.py #74

Closed: frozencoolcool closed this issue 2 weeks ago

frozencoolcool commented 3 weeks ago

Running python data_process.py fails with the error below. How should I fix it?

chunk:10800 process end
chunk:10900 process end
Traceback (most recent call last):
  File "/root/autodl-fs/code/minimind/data_process.py", line 154, in <module>
    sft_process(contain_history=True)
  File "/root/autodl-fs/code/minimind/data_process.py", line 106, in sft_process
    process_and_write_data(data)
  File "/root/autodl-fs/code/minimind/data_process.py", line 79, in process_and_write_data
    df.to_csv(f'./dataset/{file_name}', mode='a', header=False, index=False, lineterminator='\r\n')
  File "/root/miniconda3/lib/python3.10/site-packages/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/pandas/core/generic.py", line 3720, in to_csv
    return DataFrameRenderer(formatter).to_csv(
  File "/root/miniconda3/lib/python3.10/site-packages/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/pandas/io/formats/format.py", line 1189, in to_csv
    csv_formatter.save()
  File "/root/miniconda3/lib/python3.10/site-packages/pandas/io/formats/csvs.py", line 261, in save
    self._save()
  File "/root/miniconda3/lib/python3.10/site-packages/pandas/io/formats/csvs.py", line 266, in _save
    self._save_body()
  File "/root/miniconda3/lib/python3.10/site-packages/pandas/io/formats/csvs.py", line 304, in _save_body
    self._save_chunk(start_i, end_i)
  File "/root/miniconda3/lib/python3.10/site-packages/pandas/io/formats/csvs.py", line 315, in _save_chunk
    libwriters.write_csv_rows(
  File "pandas/_libs/writers.pyx", line 72, in pandas._libs.writers.write_csv_rows
_csv.Error: need to escape, but no escapechar set

jingyaogong commented 3 weeks ago

[image]

See the comment on line 80 of data_process.py.
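
The traceback above ends in Python's built-in csv module, which pandas calls when writing rows. The following is a minimal sketch of how that error message arises and how setting an escape character avoids it. It uses only the standard library. The helper name `write_rows` and the sample row are hypothetical. The trigger used here (quoting disabled plus a comma in a field) is an assumption chosen because it reproduces the message reliably; the reporter's data may trip the writer for a different reason, and the contents of the maintainer's line-80 comment are not shown in this thread.

```python
import csv
import io


def write_rows(rows, escapechar=None):
    """Hypothetical helper: write rows with quoting disabled and return the CSV text."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        quoting=csv.QUOTE_NONE,  # no quoting, so special characters must be escaped
        escapechar=escapechar,   # None means the writer has no way to escape
        lineterminator="\r\n",
    )
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample row whose first field contains the delimiter.
rows = [["question, with a comma", "answer"]]

# Without an escape character the csv writer refuses to write the field.
try:
    write_rows(rows)
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

# With an escape character the delimiter inside the field is backslash-escaped.
escaped = write_rows(rows, escapechar="\\")

print(error_message)
print(repr(escaped))
```

pandas' `DataFrame.to_csv` also accepts an `escapechar` argument, so a comparable change to the failing call would be to add `escapechar='\\'` to it. Whether this matches the maintainer's recommendation cannot be confirmed from the thread alone. Any code that later reads the file would need to use the same escape character.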