haiwen / seafile

High-performance file syncing and sharing, with Markdown WYSIWYG editing, Wiki, file labels and other knowledge management features.
http://seafile.com/

[Sea RPC] Bad response: 102 processor is dead -and- [Sea RPC] Bad response: 515 peer down. #1490

Closed qnxor closed 8 years ago

qnxor commented 8 years ago

I upgraded to Pro 5.0.3 from 4.4.7 and my Mac OS X client stopped working. The Windows client 5.0.1 is working fine. Logs below.

All my libraries in the osx client stopped syncing and are showing as unsync'ed (cloud icon, instead of green check icon). If I try to manually sync by right click > sync Library, the client reports:

Failed to add download task: Transport Error

The upgrade process on the server was bumpy, see https://github.com/haiwen/seafile/issues/1485, but it now reports to be working.

~/.ccnet/logs/applet.log is full of

[01/02/16 05:33:34][Sea RPC] Bad response: 102 processor is dead.
[01/02/16 05:33:34][Sea RPC] Bad response: 102 processor is dead.
...

... it generates 9-10 of these every second!
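To put a number on how fast these errors pile up, the entries can be tallied per timestamp. This is only a minimal sketch, not part of any Seafile tooling: it assumes the applet.log line format shown above, and the name `bad_responses_per_second` is made up for illustration.

```python
import re
from collections import Counter

# Matches applet.log lines such as:
# [01/02/16 05:33:34][Sea RPC] Bad response: 102 processor is dead.
BAD_RESPONSE = re.compile(
    r"^\[(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\]"
    r"\[Sea RPC\] Bad response: (?P<code>\d+) (?P<msg>.+?)\.?\s*$"
)


def bad_responses_per_second(lines):
    """Count 'Bad response' entries, keyed by (timestamp, response code)."""
    counts = Counter()
    for line in lines:
        match = BAD_RESPONSE.match(line)
        if match:
            counts[(match.group("ts"), match.group("code"))] += 1
    return counts


# Sample lines copied from the log excerpt above, plus one non-matching line.
sample = [
    "[01/02/16 05:33:34][Sea RPC] Bad response: 102 processor is dead.",
    "[01/02/16 05:33:34][Sea RPC] Bad response: 102 processor is dead.",
    "[01/02/16 05:33:35]some unrelated line",
]

counts = bad_responses_per_second(sample)
for (timestamp, code), n in sorted(counts.items()):
    print(timestamp, code, n)
```

To run it against the real log, pass an open file instead of `sample`, for example `open(os.path.expanduser("~/.ccnet/logs/applet.log"))` after importing `os`.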

~/.ccnet/logs/seafile.log says:

[01/02/16 05:25:15] seaf-daemon.c(519): starting seafile client 4.4.2
[01/02/16 05:25:15] seaf-daemon.c(521): seafile source code version 54e73b1072d509c47c7b2665ab1f463081a96edb
[01/02/16 05:25:15] ../common/mq-mgr.c(60): [mq client] mq cilent is started
[01/02/16 05:25:15] ../common/mq-mgr.c(106): [mq mgr] publish to hearbeat mq: seafile.heartbeat
[01/02/16 05:25:17] sync-mgr.c(660): Repo 'Dec' sync state transition from 'synchronized' to 'committing'.
[01/02/16 05:25:17] sync-mgr.c(660): Repo 'Enc' sync state transition from 'synchronized' to 'downloading'.
[01/02/16 05:25:17] http-tx-mgr.c(3930): Download with HTTP sync protocol version 1.
[01/02/16 05:25:17] http-tx-mgr.c(959): Transfer repo '72e9c40a': ('normal', 'init') --> ('normal', 'check')
[01/02/16 05:25:17] repo-mgr.c(3125): All events are processed for repo 1a8a02c1-eebf-4a56-ba7d-0b6df0d07b5f.
[01/02/16 05:25:17] sync-mgr.c(660): Repo 'Dec' sync state transition from 'committing' to 'initializing'.
[01/02/16 05:25:18] http-tx-mgr.c(959): Transfer repo '72e9c40a': ('normal', 'check') --> ('normal', 'commit')
[01/02/16 05:25:18] http-tx-mgr.c(959): Transfer repo '72e9c40a': ('normal', 'commit') --> ('normal', 'fs')
[01/02/16 05:25:18] http-tx-mgr.c(959): Transfer repo '72e9c40a': ('normal', 'fs') --> ('normal', 'data')
[01/02/16 05:25:18] repo-mgr.c(5238): Failed to open dir /Users/USER/Enc/matlab/sub/martin/VSProj/readpfile/x64: Error opening directory '/Users/USER/Enc/matlab/sub/martin/VSProj/readpfile/x64': No such file or directory.
[01/02/16 05:25:18] repo-mgr.c(5238): Failed to open dir /Users/USER/Enc/myExperiments/martin/data/1024: Error opening directory '/Users/USER/Enc/myExperiments/martin/data/1024': No such file or directory.
[01/02/16 05:25:18] repo-mgr.c(4525): File ZTUM is updated by user. Will checkout to conflict file later.

NOTE the last 3 entries ... Those folders are accessible just fine via the web interface and they also exist on my Windows machine. I tried deleting them but it didn't help. I also tried creating them on my OS X machine but that didn't help either.
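For anyone who wants to check the same thing on their own machine, the directories named in the "Failed to open dir" entries can be pulled out of seafile.log and tested for existence locally. The snippet below is a rough sketch that assumes the message format shown above; `failed_dirs` is a hypothetical helper, not a Seafile function.

```python
import os
import re

# Captures the path in entries like:
# ... Failed to open dir /some/path: Error opening directory '/some/path': ...
FAILED_DIR = re.compile(
    r"Failed to open dir (?P<path>.+?): Error opening directory"
)


def failed_dirs(lines):
    """Return the unique paths from 'Failed to open dir' entries, in log order."""
    seen = []
    for line in lines:
        match = FAILED_DIR.search(line)
        if match and match.group("path") not in seen:
            seen.append(match.group("path"))
    return seen


# Sample lines copied from the seafile.log excerpt above.
sample = [
    "[01/02/16 05:25:18] repo-mgr.c(5238): Failed to open dir "
    "/Users/USER/Enc/matlab/sub/martin/VSProj/readpfile/x64: Error opening "
    "directory '/Users/USER/Enc/matlab/sub/martin/VSProj/readpfile/x64': "
    "No such file or directory.",
    "[01/02/16 05:25:18] repo-mgr.c(5238): Failed to open dir "
    "/Users/USER/Enc/myExperiments/martin/data/1024: Error opening "
    "directory '/Users/USER/Enc/myExperiments/martin/data/1024': "
    "No such file or directory.",
    "[01/02/16 05:25:18] repo-mgr.c(4525): File ZTUM is updated by user. "
    "Will checkout to conflict file later.",
]

paths = failed_dirs(sample)
for path in paths:
    # os.path.isdir reports whether the directory exists on this machine.
    print(path, "exists" if os.path.isdir(path) else "missing")
```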

I also restarted the client and the server. No improvement.

I also reverted the server to v4.4.7 ... still no improvement. Could the storage database become corrupted? I doubt the problem is on the server because the Windows client seems to work fine.

Any clues?

qnxor commented 8 years ago

Also, when I start the OSX client, I can see the libraries icons turning into green checkmarks for a split second, and then they all turn browny clouds ...

~/.ccnet/logs/applet.log is continuously populating itself with Bad response: 102 processor is dead.

The above is logged 10 times per second ...

qnxor commented 8 years ago

The log entries in applet.log just before the Bad response: 102 processor is dead. lines are:

[01/02/16 06:41:26]starting ccnet:  ("-c", "/Users/USER/.ccnet")
[01/02/16 06:41:27]trying to connect to ccnet daemon...

[01/02/16 06:41:27]connected to ccnet daemon

[01/02/16 06:41:27]starting seaf-daemon:  ("-c", "/Users/USER/.ccnet", "-d", "/Users/USER/Seafile/.seafile-data", "-w", "/Users/USER/Seafile")
[01/02/16 06:41:27]seafile daemon is now running
[01/02/16 06:41:27][Rpc Client] connected to daemon
[01/02/16 06:41:27][MessageListener] connected to daemon
[01/02/16 06:41:28]Unable to get config (int) value download_limit
[01/02/16 06:41:28]Unable to get config (int) value upload_limit
[01/02/16 06:41:28][Rpc Client] connected to daemon
[01/02/16 06:41:28]setDockIconStyle show failure, status code: -50

[01/02/16 06:41:28][Rpc Client] connected to daemon
[01/02/16 06:41:29]The latest version is 4.3.4
[01/02/16 06:41:30][Sea RPC] Bad response: 515 peer down.
[01/02/16 06:41:30][Sea RPC] Bad response: 102 processor is dead.
[01/02/16 06:41:30][Sea RPC] Bad response: 102 processor is dead.
[01/02/16 06:41:30][Sea RPC] Bad response: 102 processor is dead.
...

Note that the upload_limit and download_limit as well as the setDockIconStyle warnings were there before and previously it was working just fine.

However, the [Sea RPC] Bad response: 515 peer down. error is new. This was not there before.
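Since the first error is the interesting one, it can help to print just the few lines that lead up to it rather than scrolling through thousands of repeats. A minimal sketch of that idea follows; `context_before_first_match` is a made-up name, and the approach is plain substring matching rather than anything specific to Seafile.

```python
from collections import deque


def context_before_first_match(lines, needle, n=5):
    """Return up to n lines preceding the first line containing needle,
    followed by that line. Returns an empty list if needle never appears."""
    recent = deque(maxlen=n)  # keeps only the last n lines seen so far
    for line in lines:
        if needle in line:
            return list(recent) + [line]
        recent.append(line)
    return []


# Sample lines copied from the applet.log excerpt above.
sample = [
    "[01/02/16 06:41:28][Rpc Client] connected to daemon",
    "[01/02/16 06:41:29]The latest version is 4.3.4",
    "[01/02/16 06:41:30][Sea RPC] Bad response: 515 peer down.",
    "[01/02/16 06:41:30][Sea RPC] Bad response: 102 processor is dead.",
]

context = context_before_first_match(sample, "Bad response", n=2)
for line in context:
    print(line)
```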

I checked the Apache logs on the server, and the server does get connections from the OS X client.

ccnet.log on the server says:

[01/02/16 06:41:29] ../common/session.c(379): Accepted a local client
[01/02/16 06:41:29] ../common/peer.c(943): Local peer down

I'm at a loss ...

qnxor commented 8 years ago

I found a (very ugly!) workaround:

I tried without renaming/deleting those 2 folders and it didn't work. I still have no idea why it got corrupted.

NOTE: The only possible cause I can think of, which the devs should look into, is that I recently sync'ed some large files from my Windows machine (4 GB and 1 GB). One of those resided in one of the folders that the client's seafile.log was complaining about not finding: /Users/USER/Enc/myExperiments/martin/data/1024.

So perhaps there is an issue with very large files ...

p.s. I did the above steps after reverting the server to 4.4.7 (from 5.0.3). I don't know if that helped or not. I'm now wary of upgrading again.

killing commented 8 years ago

The client's seaf-daemon process has crashed for some reason. You might actually have been able to solve the problem by re-syncing the library. But since you have deleted the client's metadata, there is no way to debug it further...

killing commented 8 years ago

Hi, I am just back from my New Year holidays, so sorry for the late reply. The log "Failed to open dir /Users/USER/Enc/myExperiments/martin/data/1024" is produced when a folder was deleted on the server and the deletion propagated to this Mac client. But I don't think this log message is related to the crash of seaf-daemon. And I don't think this issue is related to the upgrade of the server. The Seafile server's syncing component has been stable for a long time and hasn't been changed much. We do still find crashing issues on the client from time to time. We'd appreciate the chance to debug it further.

qnxor commented 8 years ago

Obviously I had tried to re-sync the library before deleting the metadata ... absolutely nothing worked. Only deleting the metadata and restarting from scratch worked.

It happened after upgrading the server and after sync'ing some large files (4 GB and 1 GB) from the Windows client, which didn't get updated on the OS X client. One of those two must be the cause. I'll post if it ever happens again.

EDIT: Feel free to change the title of this issue to what you think the cause is (I thought it was the server upgrade at the time)

killing commented 8 years ago

This may be related to a crash issue I solved recently. The fix will be included in the next version.

nezos commented 7 years ago

The solution provided by qnxor in the comment of Jan 2, 2016 worked for me with client version 6.1.0.