Document corner cases of Pull Order.

arkanoid87 commented 2 years ago

I'm using syncthing v1.19.1, Linux (64-bit Intel/AMD), running in a Docker container (image syncthing/syncthing) on Ubuntu 20.04. The synced folder is a mounted folder: ./sync:/var/syncthing

I've put together a Python script to grab all /rest/events, and it works: I'm receiving every event["id"] sequentially. By filtering on event["type"] == "ItemFinished" and checking the contents of event["data"]["item"], I've found that the order of the synced files is seemingly random, contrary to the File Pull Order I've configured on both folders (both set to Send & Receive), which is "Alphabetic".
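For readers who want to reproduce this, a listener of the kind described above might look roughly like the following. This is a minimal sketch, not the reporter's actual script: BASE_URL and API_KEY are placeholder assumptions, and the function names are made up for illustration. It relies only on the /rest/events endpoint with its since and timeout parameters and on the X-API-Key header.

```python
import json
import urllib.request

# Assumed values: adjust the address and API key to your own instance.
BASE_URL = "http://localhost:8384"
API_KEY = "your-api-key"


def finished_items(events):
    """Return (event id, item name) pairs for ItemFinished events, in id order."""
    picked = [e for e in events if e["type"] == "ItemFinished"]
    picked.sort(key=lambda e: e["id"])
    return [(e["id"], e["data"]["item"]) for e in picked]


def poll_events(since=0, timeout=60):
    """Long-poll /rest/events once and return the decoded list of events."""
    url = f"{BASE_URL}/rest/events?since={since}&timeout={timeout}"
    request = urllib.request.Request(url, headers={"X-API-Key": API_KEY})
    with urllib.request.urlopen(request, timeout=timeout + 10) as response:
        return json.load(response)


def follow():
    """Print every ItemFinished event as it arrives (runs until interrupted)."""
    last_id = 0
    while True:
        events = poll_events(since=last_id)
        for event_id, item in finished_items(events):
            print(event_id, "ItemFinished", item)
        if events:
            last_id = events[-1]["id"]
```

Calling follow() against a running instance prints lines similar to the console output quoted in this report.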

I'm testing this by attaching my event listener, pausing the receiving folder, creating 5 files in the sender folder (echo foo > test_a, echo foo > test_aa, echo foo > test_aaa, echo foo > test_aaaa, echo foo > test_aaaaa), and then unpausing the receiver.

Here is what I get in the console:

1414
ItemFinished test_a
1415
ItemFinished test_aaa
1416
ItemFinished test_aaaaa
1417
ItemFinished test_aa
1418
ItemFinished test_aaaa

I'm not a golang speaker, but if I'm not wrong this is the relevant line (apparently it is a no-op): https://github.com/syncthing/syncthing/blob/518d5174e630185540c5b99ee8a03c6231dc3c72/lib/model/folder_sendrecv.go#L436

Should I expect ItemFinished events to comply with the "File Pull Order"?

arkanoid87 commented 2 years ago

I've also tested:

pausing the sending folder
creating files
unpausing

I experience the same unordered results:

1457
ItemFinished test_aa
1458
ItemFinished test_aaaa
1459
ItemFinished test_a
1460
ItemFinished test_aaaaa
1461
ItemFinished test_aaa

I've also checked via inotifywait that the incoming file modification/close events, and also deletions, come in random order.

arkanoid87 commented 2 years ago

UPDATE:

If I generate test files like this: head -c 20MB < /dev/urandom > test_a, the behavior seems correct and I receive files in order:

ItemFinished test_a
158
ItemFinished test_aa
159
160
161
162
163
164
165
166
ItemFinished test_aaa
167
ItemFinished test_aaaa
168
169
170
171
172
173
174
175
176
177
178
179
180
ItemFinished test_aaaaa

It seems to be a condition that happens only when files are small? I started testing this because I was experiencing random behavior while syncing a sqlite file + sqlite-wal.

UPDATE 2: by generating 1KB random files with head -c 1KB < /dev/urandom > test_a, I can reproduce the error:

213
ItemFinished test_a
214
ItemFinished test_aa
215
ItemFinished test_aaaa
216
ItemFinished test_aaa
217
ItemFinished test_aaaaa

calmh commented 2 years ago

The sorting order controls how the files are queued. Files are started in the queued order, but there is concurrency and the amount of work required is individual to the file, so they may not finish in the order they started. Especially a bunch of small files will start at about the same time, take approximately no time to complete, and finish in whatever order.
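The effect described here can be illustrated with a small, deterministic simulation. This is a toy model under stated assumptions, not Syncthing's scheduler: the worker count and the per-file work values are invented for the example, and finish_order is a hypothetical helper.

```python
import heapq


def finish_order(queue, work, workers):
    """Start items in queue order on `workers` slots; return the finish order.

    `work[name]` is the simulated time an item needs once it has started.
    """
    running = []   # min-heap of (finish time, start index, name)
    finished = []
    clock = 0
    for index, name in enumerate(queue):
        if len(running) == workers:
            # All slots are busy: wait for the earliest item to complete.
            clock, _, done = heapq.heappop(running)
            finished.append(done)
        heapq.heappush(running, (clock + work[name], index, name))
    while running:
        # Nothing left to start: drain the remaining items by finish time.
        _, _, done = heapq.heappop(running)
        finished.append(done)
    return finished


queue = sorted(["test_a", "test_aa", "test_aaa", "test_aaaa", "test_aaaaa"])
work = {"test_a": 3, "test_aa": 1, "test_aaa": 2, "test_aaaa": 1, "test_aaaaa": 1}

print(finish_order(queue, work, workers=3))
# ['test_aa', 'test_aaa', 'test_aaaa', 'test_a', 'test_aaaaa']
print(finish_order(queue, work, workers=1))
# ['test_a', 'test_aa', 'test_aaa', 'test_aaaa', 'test_aaaaa']
```

With a single worker the finish order matches the queue order. With three workers the start order is still alphabetical, but the finish order depends on how long each item takes.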

arkanoid87 commented 2 years ago

So what's the point of deceiving the user?

If "File Pull Order" is a best-effort thing, or depends on probability, race conditions or file size, it should be stated accordingly. In particular, one would pick alphabetical order when other stats won't discriminate between files (e.g. size), and you're saying that this is specifically the case that would finish in whatever order, in particular when dealing with small files.

calmh commented 2 years ago

There is no deception, files are processed in alphabetic order. But we don't wait for one file to complete before we start processing the next -- if we have buffer space to issue more network requests we do so, for example, and we also try to keep disks busy with concurrent I/O. So for example by default we will issue (network) requests for up to 64 MiB of data before pausing to wait for replies. That's a lot of 1 KiB files. Granted, we'll send the requests in the order from the queue, but that's not necessarily the order they'll be answered in, and hence not the order things will necessarily finish in.
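To put a number on "a lot of 1 KiB files", here is a rough back-of-the-envelope calculation. It assumes each file is fetched with a single request, which is a simplification, and files_in_flight is a hypothetical helper; only the 64 MiB figure comes from the comment above. The 20MB and 1KB sizes are the decimal units that GNU head -c uses for those suffixes.

```python
MIB = 1024 * 1024
KIB = 1024


def files_in_flight(budget_bytes, file_size_bytes):
    """How many whole files of a given size fit in the request budget."""
    return budget_bytes // file_size_bytes


budget = 64 * MIB  # default amount of outstanding requested data, per the comment

print(files_in_flight(budget, 1 * KIB))      # 65536 files of 1 KiB
print(files_in_flight(budget, 1000))         # 67108 files made with head -c 1KB
print(files_in_flight(budget, 20_000_000))   # 3 files made with head -c 20MB
```

Under these assumptions, all five small test files can be requested at once, while only about three of the 20 MB files fit in the budget at a time. That is consistent with the large files appearing to arrive in order.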

arkanoid87 commented 2 years ago

I'm not saying the current behavior is wrong. I understand that for performance reasons CPU and network/disk usage have to be optimized, but the name of the setting IS deceiving: "File Pull Order" means "Order of arrival from the point of view of the pulling agent".

The manual states "Pull files ordered by X" when really it should be "Order files by X on the sender before pulling. Warning, received files might end up in a different order due to resource availability and concurrency."

By reading the manual and considering the opportunity given by an option with such name, I spent time wrapping up a PoC solution based on reading syncthing events and pulled file order.

I bet other users might end up on this issue too. I suggest documenting the sorting options appropriately.

acolomb commented 2 years ago

You're really just misunderstanding. Assume you have 10 ropes to pull on hanging from a tree, each with a coconut attached to it. If you pull them in a certain order, that's not necessarily the order the nuts will hit the ground around you, because the ropes may have different lengths.

Syncthing "pulling" in a certain order means asking other devices to send something. It has no control over the order of arrival unless it would wait for each file to finish before requesting the next, which would lead to terrible performance. The manual doesn't say anything else. Feel free to submit a docs PR if you have an idea how it could be communicated more clearly.

arkanoid87 commented 2 years ago

Just add a warning that the final order of arrival is unknown, as it's not software controlled anyway and relies on a best-effort strategy. The Internet is full of this stuff, just inform the users accordingly, as you're exposing an option that expresses a preference, not a fact.

An ordered queue on the receiving side (or just on the event publisher?) would turn it into something actually usable, otherwise it is just something you can't rely on.

I'll give you a practical example: synchronization of a sqlite database with wal journaling, where two syncthing users have to take turns being the db writer in a walkie-talkie style. If the sqlite writer doesn't close the database properly, you have a sqlite db file plus a wal file, and those are updated atomically according to the sqlite specifications. The other end needs to know whether the database has been closed or not, by whether or not it receives the wal file. If the wal file is received, it means the database is in use on the other end, otherwise the receiver is free to become the new writer. If no receiving order preference is possible, you will never know if a wal is on its way after the db file, or the other way around, and the whole process becomes unreliable. This happens for every multi-file protocol, and there are many out there.

I have to collect the events syncthing generates over time and find a valid pattern over time to trigger my logic. Not rocket science, but the point is that any proposed pull order in syncthing options really means random.
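One possible shape for such event-collecting logic is sketched below. It is an illustration under assumptions, not a guarantee: it treats a StateChanged event whose "to" field is "idle" as the end of a batch, and only then looks on disk for the wal file. The folder ID, database name and helper names are hypothetical, and an idle folder does not prove that the remote side has nothing further to send.

```python
import os


def batches_until_idle(events, folder):
    """Yield a sorted list of finished items each time `folder` becomes idle."""
    pending = []
    for event in events:
        data = event.get("data") or {}
        if data.get("folder") != folder:
            continue
        if event["type"] == "ItemFinished" and data.get("error") is None:
            pending.append(data["item"])
        elif event["type"] == "StateChanged" and data.get("to") == "idle":
            if pending:
                yield sorted(pending)
                pending = []


def writer_is_done(folder_path, db_name):
    """True if no <db_name>-wal file exists next to the database (assumed layout)."""
    return not os.path.exists(os.path.join(folder_path, db_name + "-wal"))
```

The idea is to ignore the order of individual ItemFinished events entirely and to inspect the folder only once a whole batch has settled.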

AudriusButkevicius commented 2 years ago

It is an ordered queue on the receiving side, another problem is that there is no buffering.

Namely, we start downloading files as soon as we are informed about them, and there is no guarantee about the order in which we will be informed about them. So zzz might be the first file we are informed about, and it is then alphabetically first for the few seconds until aaa arrives, etc.

Also, once we start downloading, we don't cancel, even if, according to the ordering, newer items in the queue should come first.
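A toy model of this "no buffering" behaviour, with the caveat that it is only an illustration and not Syncthing's code: the receiver keeps a sorted queue of the files it currently knows about, starts the first one after each announcement, and never cancels. start_order and the batch contents are invented for the example.

```python
import heapq


def start_order(announcements):
    """Return the order in which files are started.

    `announcements` is a list of batches of file names, in the order the
    receiver learns about them. After each batch, the alphabetically first
    known file is started; started files are never cancelled.
    """
    known = []     # min-heap: the alphabetically first known name is on top
    started = []
    for batch in announcements:
        for name in batch:
            heapq.heappush(known, name)
        if known:
            started.append(heapq.heappop(known))
    while known:
        # No more announcements: start the rest in sorted order.
        started.append(heapq.heappop(known))
    return started


print(start_order([["zzz"], ["aaa", "mmm"]]))
# ['zzz', 'aaa', 'mmm']
print(start_order([["zzz", "aaa", "mmm"]]))
# ['aaa', 'mmm', 'zzz']
```

The queue is ordered at every step, yet zzz starts first in the first call because it was the only file known at that moment.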

Even if there were some sort of buffering, I think we would still not handle a completely empty folder syncing 1 million files, etc. I think (I didn't check the code) our queue for the purposes of ordering is not of unlimited size.

So yes, personally, I do agree, the docs should be updated to explain the edge cases, but the issue for that should probably live in the docs repo.

arkanoid87 commented 2 years ago

I agree, this should be moved to docs repo.

syncthing / docs

Document corner cases of Pull Order. #746