infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0
20.13k stars 1.99k forks

[Bug]: Upload docx file failed via api #2953

Closed sven0219 closed 3 hours ago

sven0219 commented 4 hours ago

Is there an existing issue for the same bug?

Branch name

main

Commit ID

6a4858a7ee5ea17d31e88b91c74b9f0b9cc02439

Other environment information

No response

Actual behavior

I tried to upload files via the API. Uploading a pdf file works, but when I upload a docx file, the log shows "File uploaded successfully" and yet no file appears in RAGFlow.

The following is my code:

import requests

def upload_file_to_kb(base_url, file_path, kb_name, token, parser_id='naive'):
    url = f'{base_url}/v1/api/document/upload'
    data = {'kb_name': kb_name, 'parser_id': parser_id, 'run': '1'}
    headers = {'Authorization': f'Bearer {token}'}

    # Open the file in a context manager so the handle is closed after the request.
    with open(file_path, 'rb') as f:
        response = requests.post(url, files={'file': f}, data=data, headers=headers)

    # Only the HTTP status is checked here; the response body is not inspected.
    if response.status_code == 200:
        print("File uploaded successfully")
    else:
        print("Failed to upload file:", response.status_code, response.text)

# BASE_URL, KNOWLEDGE_BASE_NAME and TOKEN are set elsewhere in the script.
upload_file_to_kb(base_url=BASE_URL, file_path="./IT_Support.docx", kb_name=KNOWLEDGE_BASE_NAME, token=TOKEN)
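
The script reports success whenever the HTTP status is 200, so an error reported only in the response body would go unnoticed. That may be why the message appears although no file is stored. The following is a minimal sketch of a stricter check, not the project's documented behavior. It assumes the server answers with a JSON object carrying a numeric status field where 0 means success. The field names `retcode` and `code` and the helper `upload_succeeded` are hypothetical, since this issue does not show the response schema.

```python
import json

# Hypothetical status field names; the real response schema is not shown in this issue.
STATUS_KEYS = ("retcode", "code")

def upload_succeeded(status_code, body_text):
    """Return (ok, detail), judging by both the HTTP status and the JSON body."""
    if status_code != 200:
        return False, f"HTTP {status_code}: {body_text}"
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return False, f"non-JSON response: {body_text}"
    if not isinstance(payload, dict):
        return False, f"unexpected response shape: {body_text}"
    # Assumption: a non-zero value in a status field signals an application-level error.
    for key in STATUS_KEYS:
        if key in payload and payload[key] != 0:
            return False, f"{key}={payload[key]}: {body_text}"
    return True, body_text
```

In the script above, the `response.status_code == 200` test could be replaced with `ok, detail = upload_succeeded(response.status_code, response.text)`, printing `detail` in both branches so that the server's own message is visible.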

Expected behavior

No response

Steps to reproduce

- Upload a `docx` file through the API

Additional information

No response

sven0219 commented 3 hours ago

This commit fixed it: 6a4858a7ee5ea17d31e88b91c74b9f0b9cc02439