infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0

[Question]: Persistent Data Loss in Ragflow on Lightning AI Studio after Sleep #2080

Open cotaa956 opened 3 weeks ago

cotaa956 commented 3 weeks ago

Describe your problem

I'm running Ragflow on Lightning AI Studio due to limitations with my local machine. While the setup works initially, I encounter data loss whenever the studio instance goes to sleep. Upon restarting, I'm forced to rebuild the Docker images, resulting in the complete loss of:

- User registration data: I need to re-register with Ragflow every time.
- Knowledge base: All uploaded documents and indexed data are lost.
- Server-side work: Any configurations or ongoing work within the Ragflow server is erased.

This effectively renders Ragflow unusable for me on Lightning AI Studio, as it requires a full re-setup after each sleep cycle. Is there any solution for this? Thanks.

KevinHuSh commented 3 weeks ago

By default, all the data in ragflow is stored in Docker volumes, so rebuilding the images should not cause any data loss. Be careful with the following command, though, because it WILL REMOVE all data: docker compose down -v
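To make the distinction above concrete, here is a minimal sketch of two shell helpers: one lists the named volumes of a Compose project, the other wraps docker compose down and refuses the -v / --volumes flag. The function names, the DOCKER override variable, and the project-name argument are hypothetical and introduced only for illustration; the Compose project name is assumed to be whatever your setup uses (often the directory containing the compose file).

```shell
#!/bin/sh
# Sketch only. DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.

# List the named volumes that belong to one Compose project.
# $1 is the project name (assumption: often the directory that holds the
# compose file, unless overridden with -p or COMPOSE_PROJECT_NAME).
list_project_volumes() {
    ${DOCKER:-docker} volume ls \
        --filter "label=com.docker.compose.project=$1" \
        --format '{{.Name}}'
}

# Wrapper around `docker compose down` that refuses -v / --volumes,
# because that flag deletes the project's named volumes.
safe_down() {
    for arg in "$@"; do
        case "$arg" in
            -v|--volumes)
                echo "refusing: '$arg' would delete named volumes" >&2
                return 1
                ;;
        esac
    done
    ${DOCKER:-docker} compose down "$@"
}
```

If list_project_volumes still prints the same volume names after a restart, the data should still be present and a plain docker compose up -d should pick it up again. If it prints nothing, the volumes themselves are gone and the cause lies outside ragflow.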

cotaa956 commented 3 weeks ago

I set up Ragflow by running docker compose up -d, and confirmed with docker volume ls and docker ps that the volumes and containers were created. I was able to start the Ragflow server, create an account, and upload a file to the knowledge base without any issues.

However, when the studio goes to sleep (becomes inactive) and I restart it, all the Docker containers and volumes disappear, even though the studio itself uses persistent storage. When I run docker compose up -d again, it rebuilds the images from scratch, resulting in a clean state. This means my account and the files I uploaded to the knowledge base are lost.

I also attempted to back up and restore the volumes. While the backup and restore process seemed successful, it didn't resolve the issue: my account and files were still gone, and I had to start from scratch.

It seems that something is deleting the containers and volumes when the studio goes to sleep, despite the persistent storage configuration. I'm looking for a way to prevent this deletion and preserve my data between studio sessions.

Here are the screenshots that illustrate what I describe: photo_1_2024-08-27_04-25-04, photo_2_2024-08-27_04-25-04, photo_3_2024-08-27_04-25-04, photo_2024-08-27_04-28-17
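For reference, a common way to back up and restore a named Docker volume is to mount it into a throwaway container and tar its contents into a bind-mounted directory. The sketch below assumes that approach; it is not necessarily what was done in this report. The function names, the DOCKER override variable, and the use of the alpine image are illustrative choices. The destination directory must be an absolute path, and for this scenario it would need to live on storage that actually survives the sleep cycle.

```shell
#!/bin/sh
# Sketch only. DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.

# backup_volume VOLUME DIR
# Writes DIR/VOLUME.tar with the volume's contents (volume mounted read-only).
backup_volume() {
    vol=$1
    dir=$2   # absolute host path, ideally on persistent storage
    ${DOCKER:-docker} run --rm \
        -v "$vol":/data:ro \
        -v "$dir":/backup \
        alpine tar cf "/backup/$vol.tar" -C /data .
}

# restore_volume VOLUME DIR
# Unpacks DIR/VOLUME.tar into the volume; Docker creates the volume if missing.
restore_volume() {
    vol=$1
    dir=$2
    ${DOCKER:-docker} run --rm \
        -v "$vol":/data \
        -v "$dir":/backup:ro \
        alpine tar xf "/backup/$vol.tar" -C /data
}
```

Two things may explain a restore that appears to succeed but changes nothing, though neither is confirmed for this case: the restore has to target the exact volume names that docker volume ls showed before the loss (Compose normally prefixes them with the project name), and it should run while the stack is stopped and before docker compose up -d, so the services start on the restored data rather than on freshly initialized volumes.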

leoxu2024 commented 3 weeks ago

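I have a similar situation with data loss. I created an Ubuntu virtual machine with VMware Workstation, installed Docker Desktop on Ubuntu, and then ran ragflow. Every time I restart the virtual machine, the files and the parsed data are lost, and I have to register again and rebuild the knowledge base. The Docker images no longer exist either. I have spent several days looking for the cause without solving the problem. Any help or pointers would be appreciated. Thanks. @cotaa956 How did you do the backup and restore? Could you share the steps? Thanks.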

leoxu2024 commented 3 weeks ago

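> By default, all the data in ragflow is stored in Docker volumes, so rebuilding the images should not cause any data loss. Be careful with the following command, though, because it WILL REMOVE all data: docker compose down -v

@KevinHuSh I created an Ubuntu virtual machine in VMware Workstation, installed Docker Desktop on Ubuntu, and then ran ragflow. Every time I restart the virtual machine, the files and the parsed data are lost, and the Docker images and Docker containers have disappeared. What could cause this, and is there a fix? Thanks.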

leoxu2024 commented 3 weeks ago

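> By default, all the data in ragflow is stored in Docker volumes, so rebuilding the images should not cause any data loss. Be careful with the following command, though, because it WILL REMOVE all data: docker compose down -v

> @KevinHuSh I created an Ubuntu virtual machine in VMware Workstation, installed Docker Desktop on Ubuntu, and then ran ragflow. Every time I restart the virtual machine, the files and the parsed data are lost, and the Docker images and Docker containers have disappeared. What could cause this, and is there a fix? Thanks.

I reinstalled Ubuntu in VMware and then reinstalled Docker Desktop, and now the Docker images and Docker setup no longer disappear. Thank you all.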
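When images, containers, and volumes all vanish after a restart, it can help to check which Docker engine the CLI is talking to and where that engine stores its data. The sketch below prints the active Docker context and the engine's data root directory. Whether this was the cause here is not established in the thread; the function name and the DOCKER override variable are hypothetical. One assumption worth testing is that Docker Desktop on Linux uses its own context, separate from a system-wide Docker Engine, so the CLI may be pointed at a different, empty engine after a reboot.

```shell
#!/bin/sh
# Sketch only. DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.

# Print the active CLI context and the engine's data root directory.
docker_where() {
    printf 'context: %s\n' "$(${DOCKER:-docker} context show)"
    printf 'data root: %s\n' "$(${DOCKER:-docker} info --format '{{.DockerRootDir}}')"
}
```

Running docker_where before and after a restart and comparing the two outputs shows whether the context changed. On a hosted environment, the printed data root also indicates whether Docker's storage sits on a path that is preserved across sleep cycles.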

sultanjulyan commented 1 week ago

I have the same problem on my local machine: when I restart Docker, the previously learned knowledge base data is lost, so I have to retrain the model.