lobehub / lobe-chat

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), Knowledge Base (file upload / knowledge management / RAG ), Multi-Modals (Vision/TTS) and plugin system. One-click FREE deployment of your private ChatGPT/ Claude application.
https://chat-preview.lobehub.com

[Bug] Seems the knowledge base chunk does not work when public access is disabled(?) #4031

Closed xqliu closed 4 hours ago

xqliu commented 4 hours ago

📦 Environment

Vercel

📌 Version

v1.19.2

💻 Operating System

Windows

🌐 Browser

Chrome

🐛 Bug Description

https://github.com/user-attachments/assets/a6fc17d1-3834-42d5-a052-50b10367e175

Request URL: https://chat.muyan.cloud/trpc/lambda/file.getFiles?batch=1&input=%7B%220%22%3A%7B%22json%22%3A%7B%22category%22%3Anull%2C%22knowledgeBaseId%22%3A%22kb_0qBJL2If6EQu%22%2C%22q%22%3Anull%2C%22sortType%22%3A%22desc%22%2C%22sorter%22%3A%22createdAt%22%2C%22showFilesInKnowledgeBase%22%3Afalse%7D%2C%22meta%22%3A%7B%22values%22%3A%7B%22category%22%3A%5B%22undefined%22%5D%7D%7D%7D%7D
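The query string above is percent-encoded, which makes the actual request hard to read. As a minimal sketch (the helper name `decode_trpc_input` is made up for illustration and is not part of lobe-chat), the `input` parameter can be decoded with the Python standard library to see which knowledge base and filters were sent:

```python
import json
from urllib.parse import parse_qs, urlsplit

# Request URL copied verbatim from the report, split across lines for readability.
REQUEST_URL = (
    "https://chat.muyan.cloud/trpc/lambda/file.getFiles?batch=1&input="
    "%7B%220%22%3A%7B%22json%22%3A%7B%22category%22%3Anull%2C"
    "%22knowledgeBaseId%22%3A%22kb_0qBJL2If6EQu%22%2C%22q%22%3Anull%2C"
    "%22sortType%22%3A%22desc%22%2C%22sorter%22%3A%22createdAt%22%2C"
    "%22showFilesInKnowledgeBase%22%3Afalse%7D%2C"
    "%22meta%22%3A%7B%22values%22%3A%7B%22category%22%3A%5B%22undefined%22%5D%7D%7D%7D%7D"
)


def decode_trpc_input(url: str) -> dict:
    """Return the JSON payload carried in the `input` query parameter.

    Hypothetical helper for reading the URL in this report.
    """
    query = parse_qs(urlsplit(url).query)  # parse_qs also percent-decodes values
    return json.loads(query["input"][0])


payload = decode_trpc_input(REQUEST_URL)
print(json.dumps(payload, indent=2))
```

The decoded payload shows a single batched call (key `"0"`) whose `json` part carries `knowledgeBaseId`, the sort options, and `showFilesInKnowledgeBase: false`.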

Error when calling the API directly:

[{"error":{"json":{"message":"UNAUTHORIZED","code":-32001,"data":{"code":"UNAUTHORIZED","httpStatus":401,"path":"file.getFiles"}}}}]
[
    {
        "result": {
            "data": {
                "json": [
                    {
                        "createdAt": "2024-09-13T07:01:22.124Z",
                        "fileType": "text/plain",
                        "id": "file_YQZo5zRCV1bM",
                        "name": "customization.groovy",
                        "size": 13119,
                        "updatedAt": "2024-09-13T08:28:34.965Z",
                        "url": "https://xxxxx",
                        "chunkCount": null,
                        "chunkingError": {
                            "body": {
                                "detail": "trigger file parse async task error. Please check your app is public available or check your proxy settings is set correctly."
                            },
                            "name": "TaskTriggerError"
                        },
                        "chunkingStatus": "error",
                        "embeddingError": null,
                        "embeddingStatus": null,
                        "finishEmbedding": false
                    },
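When a knowledge base holds many files, it helps to pull out only the entries whose chunking failed. The following is a rough sketch, assuming the file list has already been extracted from `result.data.json` in a response like the one above; the function name `failed_chunks` is hypothetical, and the sample entry is trimmed from the truncated response shown in this report.

```python
from typing import Any

# One entry trimmed from the response above (the original response is truncated).
SAMPLE_FILES: list[dict[str, Any]] = [
    {
        "id": "file_YQZo5zRCV1bM",
        "name": "customization.groovy",
        "chunkingStatus": "error",
        "chunkingError": {
            "body": {
                "detail": "trigger file parse async task error. Please check "
                "your app is public available or check your proxy settings "
                "is set correctly."
            },
            "name": "TaskTriggerError",
        },
    },
]


def failed_chunks(files: list[dict[str, Any]]) -> list[tuple[str, str, str]]:
    """Return (file name, error name, error detail) for files whose chunking failed."""
    failures = []
    for entry in files:
        if entry.get("chunkingStatus") != "error":
            continue
        error = entry.get("chunkingError") or {}  # the field may be null
        detail = (error.get("body") or {}).get("detail", "")
        failures.append((entry.get("name", ""), error.get("name", ""), detail))
    return failures


for name, error_name, detail in failed_chunks(SAMPLE_FILES):
    print(f"{name}: {error_name}: {detail}")
```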

📷 Recurrence Steps

  1. Enable the setting that requires users to log in before using lobe-chat
  2. Upload a file to a knowledge base
  3. Try to chunk the file

🚦 Expected Behavior

Should be able to chunk the file successfully.

📝 Additional Information

I guess this is related to the setting that requires users to log in before using lobe-chat.

lobehubbot commented 4 hours ago

👀 @xqliu

Thank you for raising an issue. We will look into the matter and get back to you as soon as possible. Please make sure you have given us as much context as possible.

dosubot[bot] commented 4 hours ago

@xqliu Hello there! I'm here to help you with any bugs, questions, or contributions. Let's work on this together while we wait for a human maintainer.

The issue you're encountering with the chunking process when public access is disabled is likely related to the authentication and public access settings. Specifically, the UNAUTHORIZED error and the TaskTriggerError suggest that the system is unable to trigger the chunking task due to authentication issues or the app not being publicly accessible.

Key Points to Check:

  1. Authentication Configuration: Ensure that your authentication settings are correctly configured. The file.getFiles endpoint requires authentication, and the UNAUTHORIZED error indicates that the user is not authenticated. Verify that the authentication middleware is correctly set up and that the user is properly authenticated before accessing this endpoint.

  2. Public Access Settings: The error message suggests checking if the app is publicly available or if proxy settings are correctly configured. If public access is disabled, ensure that the necessary environment variables and settings are adjusted to allow authenticated users to access the chunking functionality.

Relevant Configuration Settings:

Steps to Resolve:

  1. Verify Authentication: Ensure that the user is authenticated before accessing the file.getFiles endpoint. Check the authentication middleware and make sure it is correctly implemented.

  2. Check Public Access and Proxy Settings: Ensure that the app is accessible to authenticated users even if public access is disabled. Verify your proxy settings and make sure they are correctly configured to allow access to the necessary endpoints.

  3. Review Environment Variables: Double-check the relevant environment variables (S3_SET_ACL, CHUNKS_AUTO_EMBEDDING) and ensure they are set correctly based on your deployment and access requirements.

By addressing these points, you should be able to resolve the chunking issue when public access is disabled. If the problem persists, further investigation into the specific authentication and access configurations may be necessary [1][2].


xqliu commented 4 hours ago

Strange: PDF files can be chunked successfully, but groovy files, md files, and png files all fail to chunk.

arvinxx commented 4 hours ago

Check on Vercel whether the APP_URL configuration is correct.

Some file types are not yet supported for chunking; you can request them here: https://github.com/lobehub/lobe-chat/discussions/3550
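For anyone checking the same setting, here is a minimal sketch of a local sanity check on an `APP_URL` value. It only tests that the value is a well-formed absolute URL that does not point at localhost; these rules are assumptions based on the "public available" wording in the error message, not lobe-chat's own validation, and `check_app_url` is a hypothetical helper.

```python
import os
from urllib.parse import urlsplit


def check_app_url(value):
    """Return a list of problems found in an APP_URL value (empty if none).

    Hypothetical helper: the rules below are assumptions, not lobe-chat's logic.
    """
    if not value:
        return ["APP_URL is not set"]
    problems = []
    if value != value.strip():
        problems.append("leading or trailing whitespace")
    parts = urlsplit(value.strip())
    if parts.scheme not in ("http", "https"):
        problems.append("scheme should be http or https")
    if not parts.hostname:
        problems.append("missing hostname")
    elif parts.hostname in ("localhost", "127.0.0.1"):
        problems.append("points at localhost, which is not publicly reachable")
    return problems


if __name__ == "__main__":
    # Reads the value from the local environment; on Vercel, compare it with
    # the value configured in the project's environment variables.
    found = check_app_url(os.environ.get("APP_URL"))
    print(found or "APP_URL looks well-formed")
```

A check like this cannot confirm that the deployment is actually reachable from outside; it only catches obvious mistakes such as a missing scheme or a stray space.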

xqliu commented 4 hours ago

It looks like this is related to the knowledge base name containing Chinese characters? No idea...

I have changed the name of the knowledge base to LCDP (all English) and uploaded the md file again, and somehow it works now....

Interesting...

Thanks :)

xqliu commented 4 hours ago

Check on Vercel whether the APP_URL configuration is correct.

Some file types are not yet supported for chunking; you can request them here: #3550

Thanks~ That environment variable is already set; right now it is…… Not sure why it wasn't working, and not sure why it started working again, haha

lobehubbot commented 4 hours ago

✅ @xqliu

This issue is closed. If you have any questions, you can comment and reply.