langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

When importing the knowledge base in batches, I import it according to the specified csv file, but an error keeps appearing. How can I solve this problem? #6661

Closed 19157681683 closed 1 month ago

19157681683 commented 1 month ago

Dify version

0.6.11

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

  1. Knowledge
  2. Add segment
  3. Batch add
  4. Import the CSV file 智能报销pdf的分段添加.csv

✔️ Expected Behavior

answer

❌ Actual Behavior

yes

crazywoola commented 1 month ago

It seems there is something wrong in your file.

"error": "'utf-8' codec can't decode byte 0x87 in position 10: invalid start byte"

@dosu Give me a fix based on this error
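A minimal sketch of how this kind of error can arise, assuming the file was written in a legacy Chinese encoding such as GBK (the thread does not confirm the exact encoding). The sample row and the helper `fails_as_utf8` are hypothetical and are not part of Dify:

```python
# Hypothetical sample row; "问题,答案" means "question,answer".
row = "问题,答案"

# Bytes as a tool using the GBK encoding would write them (an assumption).
gbk_bytes = row.encode("gbk")


def fails_as_utf8(data: bytes) -> bool:
    """Return True if the bytes are not valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# GBK bytes for Chinese text are generally not valid UTF-8,
# so a strict UTF-8 reader rejects them.
print(fails_as_utf8(gbk_bytes))

# The same text encoded as UTF-8 decodes without error.
print(fails_as_utf8(row.encode("utf-8")))
```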

dosubot[bot] commented 1 month ago

Hey @crazywoola! I'm a bot here to help you with any bugs or questions you have. Let's work together to solve this!

To resolve the character encoding issue when importing the knowledge base in batches using the specified CSV file in Dify version 0.6.11, ensure that your CSV file is saved with UTF-8 encoding. This can typically be done in most spreadsheet software or text editors by selecting the appropriate encoding option when saving the file.

Here is a step-by-step guide to save a CSV file with UTF-8 encoding in common software:

Using Microsoft Excel:

  1. Open your CSV file in Excel.
  2. Click on File > Save As.
  3. Choose the location where you want to save the file.
  4. In the Save as type dropdown, select CSV UTF-8 (Comma delimited) (*.csv).
  5. Click Save.

Using Google Sheets:

  1. Open your CSV file in Google Sheets.
  2. Click on File > Download > Comma-separated values (.csv, current sheet).
  3. The file will be downloaded with UTF-8 encoding.

Using Notepad++:

  1. Open your CSV file in Notepad++.
  2. Click on Encoding in the menu bar.
  3. Select Convert to UTF-8. (Encode in UTF-8 only reinterprets the existing bytes as UTF-8 and does not convert the file.)
  4. Click on File > Save As and save the file.

Using a Text Editor:

  1. Open your CSV file in your preferred text editor.
  2. When saving the file, look for an option to select the encoding (this is often found in the Save As dialog).
  3. Choose UTF-8 encoding and save the file.
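The same conversion can also be scripted. Below is a minimal sketch in Python, assuming the source encoding is known. The function name `convert_to_utf8` and the default of `gb18030` (a superset of GBK) are illustrative assumptions, not something Dify provides:

```python
def convert_to_utf8(src: str, dst: str, source_encoding: str = "gb18030") -> None:
    """Re-encode a text file as UTF-8, given the encoding it was saved in."""
    # newline="" keeps the original line endings untouched, which matters
    # for CSV files that may contain line breaks inside quoted fields.
    with open(src, "r", encoding=source_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)
```

For example, `convert_to_utf8("input.csv", "input-utf8.csv")` would write a UTF-8 copy next to the original. Both file names are placeholders.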

Additionally, ensure that your CSV file structure adheres to the required format for batch importing in Dify. The CSV file must include columns for "question" and "answer" as specified in the Dify documentation. You can refer to the batch import section in Dify's interface for more details on the required structure.
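A quick way to check the header row before uploading is sketched below. It takes the column names "question" and "answer" from the comment above at face value. The helper `missing_columns` is hypothetical, and the names Dify actually expects may depend on the knowledge base mode:

```python
import csv
import io

# Assumed from the comment above, not verified against Dify's source.
REQUIRED_COLUMNS = {"question", "answer"}


def missing_columns(csv_text: str) -> set[str]:
    """Return the required column names that are absent from the header row."""
    # Excel's "CSV UTF-8" format writes a byte order mark. Drop it so the
    # first header does not turn into "\ufeffquestion".
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("\ufeff")))
    header = {name.strip() for name in (reader.fieldnames or [])}
    return REQUIRED_COLUMNS - header
```

An empty result means both assumed columns are present.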

By ensuring your CSV file is saved with UTF-8 encoding and follows the correct structure, you should be able to resolve any character encoding issues when importing the knowledge base in batches in Dify version 0.6.11 [1][2].


19157681683 commented 1 month ago

Understood. I had previously opened and saved the CSV file with WPS. After saving it with Excel instead, the import worked.

yaohongfenglove commented 1 month ago

Many Chinese users use the WPS spreadsheet application rather than Microsoft Excel. It would be better for the front end to detect the encoding of an uploaded CSV file and convert it automatically.
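A rough sketch of what such detection with a fallback could look like, written in Python for illustration rather than as front-end code. The function `decode_csv_bytes` and its candidate list are assumptions and do not reflect how Dify handles uploads:

```python
def decode_csv_bytes(raw: bytes, candidates=("utf-8-sig", "gb18030")):
    """Decode uploaded CSV bytes by trying each candidate encoding in order.

    Returns a (text, encoding) tuple for the first encoding that succeeds.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode CSV with any candidate encoding")
```

UTF-8 is tried first because it is strict and rarely accepts text written in another encoding. `gb18030` is far more permissive, so it only makes sense as a fallback. The `utf-8-sig` codec also strips a leading byte order mark if one is present.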