haiwen / seafile

High performance file syncing and sharing, with also Markdown WYSIWYG editing, Wiki, file label and other knowledge management features.
http://seafile.com/

Client stops syncing, seaf-daemon.exe terminates #1425

Closed galvanopus closed 8 years ago

galvanopus commented 8 years ago

There are 3 machines:

| Machine name | Configuration |
|---|---|
| nb | Windows 10 + Seafile Client 4.3.4 |
| seafile | Windows Server 2012 R2 + Seafile Server 4.3.1 |
| test | Windows Server 2012 R2 + Seafile Client 4.3.4 |

Chronological log:

  1. I started to sync a library from nb to seafile (about 150 GB).
  2. In the middle of that sync I started to sync the same library from seafile to test.
  3. The library sync from nb to seafile finished successfully.
  4. Some time passed.
  5. seafile-applet.exe on test terminated unexpectedly (with the standard Windows error message).
  6. If I launch the client on test now, it says "Downloading 0%", seaf-daemon.exe uses 1 CPU core, and after ≈7 minutes seaf-daemon.exe terminates and the client stops syncing. After restarting the client (with or without restarting the OS) the behavior is the same.

Here are the client logs: https://gist.github.com/galvanopus/461f03a6c8b4872f1eba

shoeper commented 8 years ago

Have you tried to restart the server? It looks like it doesn't respond like the client expects it to.

galvanopus commented 8 years ago

I can try restarting the server, but I think the client is reporting a broken connection to seaf-daemon.exe, not to the server. Also, I am able to sync a small library on test without restarting the server.

galvanopus commented 8 years ago

Here are some experiments I have performed:

  1. Stop the client.
  2. Delete ~/ccnet and ~/Seafile from test.
  3. Launch the client and connect to the server.
  4. Sync a small library: OK.
  5. Sync the large library: seaf-daemon.exe terminates; only the library folder is created, with no subfolders.
  6. Restart the server.
  7. Repeat steps 1-5. The result is the same.

killing commented 8 years ago

Hi,

Is the seafile.log you posted the complete one? I can't see in that log why seaf-daemon terminates. Can you send the complete seafile.log? It is usually the most useful log file.

galvanopus commented 8 years ago

This is the full log; seaf-daemon terminates silently.

galvanopus commented 8 years ago

I am going to try to download the large library again. I have deleted ~/ccnet and ~/Seafile from test. Should I delete something more?

galvanopus commented 8 years ago

11 GB has been transferred, and seaf-daemon has stopped working.

Client logs: https://gist.github.com/galvanopus/cb787b13b51eb31ad322
Server logs: https://gist.github.com/galvanopus/9d171d304c5d6c32ffd8
Server access.log: https://gist.github.com/galvanopus/a7ef20a22a021898136c

killing commented 8 years ago

Can you check whether ccnet.exe is still running after seaf-daemon terminates? From the log it seems that seaf-daemon terminates because ccnet.exe terminates. Do you have anti-virus software running on the computer? Maybe the anti-virus software is blocking the communication between seaf-daemon and ccnet.

galvanopus commented 8 years ago

ccnet.exe does not terminate. No anti-virus software is running on test. I have restarted seaf-applet.exe, here are the results:

  1. seaf-daemon has terminated again after ≈3h.
  2. The storage/fs/ content is exactly the same as it was after the first failure (compared SHA-1 and file listing).
  3. storage/blocks is 47.9 GB in size.
  4. The downloaded library is 18.2 GB in size.
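
The comparison described in item 2 (SHA-1 plus file listing of storage/fs/) can be sketched as follows. This is a minimal Python sketch, not the procedure actually used in this thread; the function names `snapshot` and `diff_snapshots` are hypothetical, and the directory passed in is whatever local path holds the client's storage/fs data.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to the SHA-1 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            sha1 = hashlib.sha1()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large files do not need to fit in memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    sha1.update(chunk)
            digests[path.relative_to(root).as_posix()] = sha1.hexdigest()
    return digests


def diff_snapshots(before, after):
    """Return (added, removed, changed) relative paths between two snapshots."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return added, removed, changed
```

Taking one snapshot after each seaf-daemon termination and diffing consecutive snapshots would show whether any new objects arrived between restarts; three empty lists mean the directory did not change.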

galvanopus commented 8 years ago

After the next seaf-applet restart:

  1. seaf-daemon has terminated again after ≈1h.
  2. The storage/fs/ content is exactly the same.
  3. storage/blocks is 38.5 GB in size. There are no new blocks.
  4. The downloaded library is 27.4 GB in size.

Is there a debug build of seaf-daemon? Can I do anything more to help find the cause?

galvanopus commented 8 years ago

This may be related to #1212.

galvanopus commented 8 years ago

So, is it possible to turn on debug mode for seaf-daemon?

killing commented 8 years ago

Unfortunately I don't see any way to debug the problem on Windows. Are you syncing a huge number of small files? If so, seaf-daemon may be terminated due to the operating system's memory constraints. Maybe try splitting the large library into smaller ones.

galvanopus commented 8 years ago

There are about 100 000 files taking 150 GB. Do you mean the 2 GB per-process limit?

There is a long option called "debug" in seaf-daemon.c: { "debug", required_argument, NULL, 'D' }

How can I force seafile-applet to use it while launching seaf-daemon?

killing commented 8 years ago

You can turn on debugging by setting the environment variable SEAFILE_DEBUG=all on Windows. Just set it as a system-wide environment variable. It should work, but I don't know whether it will produce any useful debugging information about this problem.
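
For a single test run, the same variable could in principle be set only for the client process instead of system-wide. The sketch below is a hedged illustration, not something confirmed in this thread: it assumes the applet passes its environment on to seaf-daemon (child processes inherit the parent's environment by default on Windows, but the applet may override it), and `debug_env`, `launch_with_debug` and the `applet_path` argument are hypothetical names.

```python
import os
import subprocess


def debug_env(base=None, value="all"):
    """Return a copy of the environment with SEAFILE_DEBUG set to `value`."""
    env = dict(os.environ if base is None else base)
    env["SEAFILE_DEBUG"] = value
    return env


def launch_with_debug(applet_path):
    """Start the client with SEAFILE_DEBUG=all.

    `applet_path` must point at the local seafile-applet.exe; its location
    is installation-specific and is not assumed here.
    """
    return subprocess.Popen([applet_path], env=debug_env())
```

If the extra debug lines do not show up in seafile.log after launching this way, the system-wide setting described above remains the documented route.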

When seaf-daemon terminates, what message does the system show you in the pop-up window? Can you give us a screenshot?

galvanopus commented 8 years ago

seaf-daemon terminates silently. Here are 3 Process Explorer screenshots taken at the moment of termination. I suppose seaf-daemon does not reach the 2 GB memory limit.

(Screenshots 1, 2, 3)

I have set SEAFILE_DEBUG environment variable. Will check the result.

galvanopus commented 8 years ago

Here are last 20 log lines:

[11/03/15 12:17:49] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:50] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:50] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:50] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:50] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:50] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:50] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir <here was path to dir>
[11/03/15 12:36:04] read from connfd error: No error.

Nothing helpful.

galvanopus commented 8 years ago

I was able to sync the library 2 times on test (virtual machine). Here are some findings.

  1. I relaunched seaf-applet after each seaf-daemon termination.
  2. After start, seaf-daemon utilizes 1 core and logs a lot of messages like this:
repo-mgr.c(4491): wt and index are consistent. no need to checkout.
  3. A lot of blocks were requested from the server several times (after each restart). This was found in access.log.
  4. seaf-daemon terminated some time after checking out directories:
>grep -B 1 "read from connfd error" seafile.log
[11/03/15 12:17:53] repo-mgr.c(4839): Checkout empty dir vCards.
[11/03/15 12:36:04] read from connfd error: No error.
--
[11/03/15 15:49:07] repo-mgr.c(4839): Checkout empty dir vCards.
[11/03/15 17:59:45] read from connfd error: No error.
--
[11/03/15 23:09:52] repo-mgr.c(4839): Checkout empty dir vCards.
[11/04/15 00:36:37] read from connfd error: No error.
--
[11/04/15 21:35:57] repo-mgr.c(4839): Checkout empty dir vCards.
[11/04/15 23:25:44] read from connfd error: No error.
  5. The first time, seaf-applet crashed during the last relaunch. I noticed that seaf-daemon was still running, waited for it to stop using CPU and disk, and then launched seaf-applet, which told me that the library was synced.
  6. Some files have different names: name.ext~ instead of name.ext.
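
The grep output above shows a long silent period between the last checkout message and each "read from connfd error" line. A minimal Python sketch for measuring those gaps is given below; the function name `silent_gaps` is hypothetical, and the timestamp format (MM/DD/YY HH:MM:SS) is inferred from the log excerpts in this thread.

```python
import re
from datetime import datetime

# Matches the leading "[11/03/15 12:17:53]" timestamp of a seafile.log line.
STAMP = re.compile(r"^\[(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\]")


def silent_gaps(lines, marker="read from connfd error"):
    """Yield (previous_timestamp, error_timestamp, gap) for each marker line.

    `previous_timestamp` is the timestamp of the closest preceding line
    that carries a timestamp; lines without one (such as "--") are skipped.
    """
    previous = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S")
        if marker in line and previous is not None:
            yield previous, stamp, stamp - previous
        previous = stamp
```

Fed the four excerpts above, this gives gaps of 0:18:11, 2:10:38, 1:26:45 and 1:49:47, so the daemon stays alive without logging for anywhere between about 18 minutes and over 2 hours before it exits.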

Can seaf-daemon terminate if network throughput is higher than disk throughput? What could the next investigation step be? Why do the file names differ?

The issues linked to this one may indicate that there is something wrong in the client. As I am going to use it at home for all my data, I am really willing to help eliminate the problem.

galvanopus commented 8 years ago

I have limited the download speed to 1 MB/s, and seaf-daemon is downloading the library. seaf-applet has crashed; I have created issue #1432 for it.

galvanopus commented 8 years ago

The library has been synced after limiting the download speed to 1 MB/s (the network link between test and seafile is virtual, so network throughput is much higher than that of the virtual disk). How can I find out exactly why seaf-daemon terminates?

galvanopus commented 8 years ago

Does anyone have any suggestions?

galvanopus commented 8 years ago

@killing, should this issue be moved to seafile-client or remain here? It seems the problem is in seaf-daemon, not client.

killing commented 8 years ago

Recently I've found a possible cause for the crash during download. The fix will be released in the next version.

galvanopus commented 8 years ago

@killing, fine! Could we close this issue after the fix is released?

galvanopus commented 8 years ago

Thank you. I can test this issue after the fix is released.

killing commented 8 years ago

Here is a test package containing the fix: https://app.seafile.de/f/2ccc2e94ae/?raw=1 Could you try it? We also added some more debug messages to the package, so the seafile.log would be interesting for us whether or not the bug is fixed.

killing commented 8 years ago

The fix has been released in 5.0.5. Have you tried it?

shoeper commented 8 years ago

Since there has been no feedback for one month, I am closing it.

galvanopus commented 7 years ago

@killing I have just checked with server 6.0.6 + client 6.0.1. seaf-daemon.exe terminates under heavy disk usage.

galvanopus commented 7 years ago

@killing Could you reopen the issue? I am ready to provide debug info for seaf-daemon.exe if it is available. By the way, I found one more case where seaf-daemon terminates: I am uploading data (≈50 GiB) from a mapped network drive, and seaf-daemon has terminated several times. ccnet.exe is alive.

killing commented 7 years ago

I think this is a different problem. Could you open another issue? Could you check for fresh dump files in C:\users\username\ccnet\logs\dumps after the crash? Make sure you use the latest version of the client.
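
A quick way to check for fresh dump files after a crash is to list the dumps directory by modification time. The following is a minimal Python sketch under stated assumptions: `fresh_dumps` and its parameters are hypothetical names, and nothing is assumed about the dump files' names or extensions.

```python
import time
from pathlib import Path


def fresh_dumps(dump_dir, max_age_hours=24.0, now=None):
    """Return files in dump_dir modified within max_age_hours, newest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    dump_dir = Path(dump_dir)
    if not dump_dir.is_dir():
        # No dumps directory means no dump files were written.
        return []
    recent = [
        p for p in dump_dir.iterdir()
        if p.is_file() and p.stat().st_mtime >= cutoff
    ]
    return sorted(recent, key=lambda p: p.stat().st_mtime, reverse=True)
```

For example, `fresh_dumps(r"C:\users\username\ccnet\logs\dumps", max_age_hours=1)` run shortly after a crash would return only the files written in the last hour, with `username` replaced by the actual account name.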