meganz / sdk

MEGA C++ SDK
BSD 2-Clause "Simplified" License
1.32k stars 504 forks source link

Filesystem mismatch, leading to sync creation failure #2680

Open ChiwTheNeko opened 5 days ago

ChiwTheNeko commented 5 days ago

Sometimes the MEGASync app will fail to restart sync and dump an error saying "Local filesystem mismatch". Removing and re adding the sync will fix the issue but only until the computer reboot. After reboot the problem may appear again.

The logs contains the following:

07/01-02:05:22.895966 139871605753536 DTL  Opening mount database: /proc/mounts [fs.cpp:2064]
07/01-02:05:22.896110 139871605753536 DTL  Path /home/nicolas/Pictures is on device / [fs.cpp:2185]
07/01-02:05:22.896258 139871605753536 WARN Falling back to legacy filesystem fingerprint: /home/nicolas/Pictures [fs.cpp:2332]
07/01-02:05:22.896261 139871605753536 ERR  Sync root path is a different Filesystem than when the sync was created. Original filesystem id: (fingerprint: 739184207686459804, uuid: 703cbf35-3e25-436c-aa86-5eae34675e70)  Current: (fingerprint: 739184207686459804, uuid: undefined) [sync.cpp:659]
07/01-02:05:22.896262 139871605753536 ERR  Sync creation failed, syncerr: 15 [sync.cpp:3829]

The logs shows that MEGASync failed to identify the device the sync is on. The sync /home/nicolas/Pictures is not on device / but on /dev/sda1. This miss identification of device cause MEGASync to fallback to the unreliable legacy filesystem fingerprint system. The legacy fingerprints are not persistent across reboot which lead to the "Local filesystem mismatch" error.

After investigation the problem appears to be into function static std::string deviceOf(const std::string& database, const std::string& path) in file src/posicx/fs.cpp.

At line 2172 the code execute realpath(device.c_str(), storage.data()) which check if the device is a symlink and store the real path into the string storage. But storage may already contains some data that is longer than the string stored by realpath().

Therefore, at line 2186, the code attempts to trim storage to the first null character.

// Truncate storage down to size.
storage.erase(std::find(storage.begin(), storage.end(), '\x0'));

This line appears to be the problem. With these parameters storage.erase will only erase one character from storage, not the entire end of the string. As a result storage will be something like

/dev/sda1n/user/1000/docfuse.portalrw,nosuid,nodev,relatime,user_id=1000,group_id=10000 0de=700,uid=1000,gid=1000,inode640 0 0

Proposed fix would be to replace line 2186 by the following:

storage.erase(std::find(storage.begin(), storage.end(), '\x0'), storage.end());

That way the storage will be properly trimmed.

Note This problem has already been reported on the MEGASync project. See: https://github.com/meganz/MEGAsync/issues/964 https://github.com/meganz/MEGAsync/issues/844 https://github.com/meganz/MEGAsync/issues/947

I believe all of these are the same problem, but the issue should have been reported on the sdk instead.

jasongodev commented 4 days ago

I'm on Arch Linux and I often encounter this issue when Arch is updated. On next restart it will surely show this mismatch issue.

Does this make sense with the current discovery you made?

I hope your fix will be reviewed by the MEGA team. They almost don't look at this repo. Maybe they do look at pulls, but they don't collaborate. They just upload their latest code without closing issues/pulls.

ChiwTheNeko commented 4 days ago

I'm on Arch Linux and I often encounter this issue when Arch is updated. On next restart it will surely show this mismatch issue.

Does this make sense with the current discovery you made?

Maybe, maybe not. What you are describing looks very much like the issue I am proposing a fix for. But another user reported a problem with the same error message (filesystem mismatch) but with slightly different logs. That makes me wonder if maybe there could be more than one bug in this part of the code. But I didn't notice anything obvious.

If you want to help me investigate further you can check this report. Please post your logs just like he did and try to compile and run the test code I posted there (you will have to change the path in the main() function). If the test code successfully detect the device the path is on then my fix will work for you. If not that means there is another issue.

There is activity of this repo. I opened a case with MEGA support and I sent them my report. They said they forwarded it to the devs so, at least, they should know about it.

dwmega commented 3 days ago

Hey @ChiwTheNeko,

Nice analysis and bug hunting! You have indeed caught and identified a bug :)

I'm the author of this code and I'm sorry to have caused you and others any inconvenience: We've been fighting some issues around the "filesystem fingerprinting" code for quite some time due to the unreliability of the legacy filesystem fingerprinting mechanism. It's been difficult for us to fix in a clean way without pulling in other dependencies which we cannot support on all the targets that we build for.

It seems the bug you've found simply slipped through review: We'll jump on that, ASAP.

Thanks again for your time and analysis.

dwmega commented 3 days ago

Hey @ChiwTheNeko,

I've created a ticket here in our bug system to fix the issue: my manager will assign someone to fix it shortly.

There are two versions of the same bug that you've found: The one you found on line 2186 and another on line 2309.

If you're able to build locally or if you're just curious, one way to solve the issue would be to use a different overload of find.

Instead of:

storage.erase(std::find(storage.begin(), storage.end(), '\x0'));

Use

storage.erase(storage.find('\x0'));

Instead.

Hopefully the issue will be closed soon and you guys will be away syncing again in no time.