drakkan / sftpgo

Full-featured and highly configurable SFTP, HTTP/S, FTP/S and WebDAV server - S3, Google Cloud Storage, Azure Blob
https://sftpgo.com
GNU Affero General Public License v3.0

[Bug]: sftpserver is duplicating files and skipping files when we have large number of files in S3 #1759

Closed svarmasscinc closed 5 days ago

svarmasscinc commented 2 months ago

Bug description

The SFTP server duplicates some files and skips others when there is a large number of files in S3. We have around 200 files in S3, and when we download them using an SFTP client such as WinSCP or FileZilla, we see duplicate files and some files are skipped.

Steps to reproduce

1. Have around 200 files in S3.
2. Set up an SFTP user with permission to download from S3.
3. Download the files from S3 using WinSCP.

  1. You will see that some files are duplicated.
  2. Some files are skipped.

Expected behavior

All files should be downloaded, with none duplicated or skipped.

SFTPGo version

2.6.2 636a1c2c

Data provider

S3

Installation method

Community RPM package

Configuration

N/A

Relevant log output

N/A

What are you using SFTPGo for?

Medium business

Additional info

N/A

drakkan commented 1 month ago

200 is a very low number. I cannot replicate the reported issue, sorry. Do you use standard S3 or some compatible implementation?

Please provide more information or investigate yourself and if this is a real bug please send a PR. Thanks

trondvh commented 1 month ago

We are experiencing a similar issue with the GCS storage provider configured under the filesystem in SFTPGo webadmin. Our file structure follows a pattern with files named "I01.txt" (with 5 to 6 different name variations), followed by "I02" and so on. When adding the 249th file to a folder, this file becomes hidden, and instead, the previous file (248) appears twice.

Interestingly, the correct file structure appears in the webClient, but SFTP clients like ForkLift and FileZilla miss the correct file 249 and instead display a duplicate of file 248. Furthermore, running an ls command via SFTP does not show file 249, although it can still be retrieved using the get command.
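One way to pin down this kind of discrepancy is to compare the listing returned over SFTP against a listing known to be correct (for example, the one shown in the web client). The following is a minimal sketch; the function name `compare_listings` and the placeholder file names are hypothetical and only mimic the "248 shown twice, 249 missing" pattern described above.

```python
from collections import Counter


def compare_listings(reference, observed):
    """Return (duplicated, missing) names in `observed` relative to `reference`."""
    counts = Counter(observed)
    # Names that appear more than once in the observed listing
    duplicated = sorted(name for name, n in counts.items() if n > 1)
    # Names present in the reference listing but absent from the observed one
    missing = sorted(set(reference) - set(observed))
    return duplicated, missing


# Hypothetical data: 249 placeholder names, where the observed listing
# repeats entry 248 in place of entry 249.
reference = [f"file{i:03d}.txt" for i in range(1, 250)]
observed = reference[:248] + [reference[247]]

duplicated, missing = compare_listings(reference, observed)
print("duplicated:", duplicated)
print("missing:", missing)
```

In practice, `reference` and `observed` would be filled from the two real listings, for example by pasting the output of `ls` from an SFTP session into a text file and reading it line by line.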

drakkan commented 1 month ago

This info may help. Can you please share the exact list of your files and directories? Do you have any restrictions on file patterns or other settings that might affect the directory listing? Does this issue also occur with local file system or only using cloud storage backends?

I can't replicate it yet, but if you can, it should be relatively easy to figure out and fix. Thanks

trondvh commented 1 month ago

filestructure.zip

I've attached the file structure (with empty files) to illustrate the setup, and I can confirm there are no settings or restrictions on file patterns that would impact the listing. Uploading a file named I44NETKUND.TXT consistently triggers the issue: the new file becomes hidden, and the previous file appears twice in SFTP clients like ForkLift and FileZilla.

I haven’t been able to verify 100% that this only occurs on cloud storage providers, but I believe it's likely.

drakkan commented 1 month ago

I still cannot replicate this, sorry.

[Screenshot taken 2024-10-26 23:21:05]

I tested using the latest development version. Could you try testing using the same version? Thank you

svarmasscinc commented 3 weeks ago

Hello,

We are using Dell EMC ECS storage. Our SFTP server version is SFTPGo 2.6.2 636a1c2c. We have seen this issue with other versions too. Our file naming convention is XXX.823.DailyXXX.YYMMDD.010145304.csv. We usually have one file per day. In some cases, the SFTP server skips one day and repeats the previous filename.

The SFTP server skipped the 240910 and 240911 files, but showed the 240909 and 240908 files twice. In other words, the list is as below:

XXX.823.DailyXXX.240913.010219524.csv
XXX.823.DailyXXX.240912.010239436.csv
XXX.823.DailyXXX.240909.010122652.csv
XXX.823.DailyXXX.240909.010122652.csv
XXX.823.DailyXXX.240908.010056344.csv
XXX.823.DailyXXX.240908.010056344.csv
XXX.823.DailyXXX.240907.010146203.csv
XXX.823.DailyXXX.240906.133240549.csv
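The listing above can be checked mechanically. The following is a rough sketch that assumes the fourth dot-separated field of each name is a YYMMDD date in the 2000s and that exactly one file per day is expected (the comment says "usually"); the helper name `day_of` is hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

# The listing exactly as reported in the comment above
listing = [
    "XXX.823.DailyXXX.240913.010219524.csv",
    "XXX.823.DailyXXX.240912.010239436.csv",
    "XXX.823.DailyXXX.240909.010122652.csv",
    "XXX.823.DailyXXX.240909.010122652.csv",
    "XXX.823.DailyXXX.240908.010056344.csv",
    "XXX.823.DailyXXX.240908.010056344.csv",
    "XXX.823.DailyXXX.240907.010146203.csv",
    "XXX.823.DailyXXX.240906.133240549.csv",
]


def day_of(name):
    """Parse the YYMMDD field (assumed to be the fourth dot-separated part)."""
    yymmdd = name.split(".")[3]
    return date(2000 + int(yymmdd[:2]), int(yymmdd[2:4]), int(yymmdd[4:6]))


days = Counter(day_of(name) for name in listing)

# Days that appear more than once in the listing
repeated = sorted(d for d, n in days.items() if n > 1)

# Days between the first and last listed day that do not appear at all
first, last = min(days), max(days)
span = (last - first).days
missing = [first + timedelta(days=i) for i in range(span + 1)
           if first + timedelta(days=i) not in days]

print("repeated:", [d.isoformat() for d in repeated])
print("missing:", [d.isoformat() for d in missing])
```

For this data the repeated days are 2024-09-08 and 2024-09-09, and the missing days are 2024-09-10 and 2024-09-11.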

drakkan commented 3 weeks ago

Have you tested the latest development version or the latest version from the 2.6.x branch?

CVM commented 3 weeks ago

I am encountering the same issue on 2.6.2 using the GCS storage provider. I have also tried the latest version on the 2.6.x branch (b4acae85b8e39462176785592e2f7d9a6da37e7b) and am seeing the same behaviour there too.

The directory listing in the web client is correct, as noted by @trondvh, but accessing via SFTP client (FileZilla) yields an incorrect directory listing with duplicates.

However, I did also try using another client (WinSCP) and did not encounter the problem there, so it seems to be something specific to client behaviour. Having experimented a little further, I believe the issue relates to the use of concurrent requests by the SFTP client - when I disable concurrency in FileZilla the listing then starts coming back correctly.

I'd guess it's something related to large directory listings not having the results from multiple async requests pieced together quite right somehow? Either way, I hope this helps to narrow things down!

drakkan commented 3 weeks ago

@CVM can you please share your FileZilla version? I cannot replicate using FileZilla 3.67.1 and also the sftp CLI works fine for me (9.9p1). Thank you

CVM commented 3 weeks ago

My FileZilla version is also 3.67.1.

drakkan commented 2 weeks ago

Can you please test using the following SFTPGo instance?

Host: 172.234.213.221 (port 22)
User: listdir
Pwd: oNg2daidu7eu3Zith3tu

The root fs is backed by an S3 compatible bucket. The folder /s3 is backed by an AWS S3 bucket. The folder /gcs is backed by a Google Cloud Storage bucket.

You will find the directory structure shared by @trondvh in each of the above folders and I cannot replicate the reported issue.

If you can reproduce, please provide the exact steps including the versions of the SFTP clients you tested. Also test with the sftp CLI. Thank you.

This SFTPGo instance will remain active for 2-3 days.

⚠️ Important note for those reading this post: do not expect this kind of support. Open source users should be able to self-support. This is an exception.

CVM commented 2 weeks ago

I'm afraid I'm unable to reproduce the error on that server using any of my SFTP clients.

Please also disregard my previous comments regarding request concurrency settings on the client being a factor. The issue surfaced again today on my server and I've been running more tests and no longer believe this to be the case (my GCS bucket has a lifecycle policy that causes files beyond a certain age to expire - some files expired during my testing, bringing the file count down and resulting in a correct listing).

When connecting to my own affected server when the issue is occurring, the behaviour is consistent across all SFTP clients I've tried, with the same incorrectness in the directory listing present for WinSCP, FileZilla and the sftp CLI (9.7p1).

I don't think there's anything particularly unusual about my deployment. Compute Engine VM running Debian 12 Bookworm, following the APT install instructions to the letter. Happy to share any more specific details on request.

One very interesting thing I have also observed is that if I upload one or more files with names that place them first in the directory listing while the duplication behaviour is occurring, the filenames that are being duplicated actually change. So if in my numbered list of 500 files I have duplicates for 251 and 252 (with 253 and 254 absent), uploading a new file that comes first and refreshing the listing then shows duplicates for 250 and 251 (with 252 and 253 absent). So the issue does appear to affect whichever files are at a specific location within the listing.
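The observation that the affected names shift when a file is added at the front is what one would expect if the fault is tied to positions in the listing rather than to particular names. The following toy model illustrates that reasoning only. It is not SFTPGo's code, the thread does not identify the root cause, and the function names and the fault position (`start=252`, `width=2`) are assumptions chosen to match the numbers in the comment above.

```python
from collections import Counter


def position_fault(names, start=252, width=2):
    """Toy model (NOT SFTPGo's code): the `width` entries beginning at index
    `start` are replaced by the `width` entries immediately before them."""
    out = list(names)
    if len(names) >= start + width:
        out[start:start + width] = names[start - width:start]
    return out


def duplicated_and_missing(names, listing):
    """Return (names listed more than once, names not listed at all)."""
    counts = Counter(listing)
    duplicated = sorted(n for n, c in counts.items() if c > 1)
    missing = sorted(set(names) - set(listing))
    return duplicated, missing


# 500 numbered files, as in the example described above
names = [f"{i:03d}" for i in range(1, 501)]
before = duplicated_and_missing(names, position_fault(names))

# Upload one file whose name sorts first, then list again
names_after = ["000"] + names
after = duplicated_and_missing(names_after, position_fault(names_after))

print("before:", before)
print("after: ", after)
```

In this model the duplicated entries move from 251 and 252 to 250 and 251, and the missing ones from 253 and 254 to 252 and 253, which matches the behaviour described in the comment.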

For my particular use case, there's a high likelihood that SFTP clients will be accessing the server while several new files are in the process of being written. Sharing in case this might be a factor - perhaps it's an edge case relating to timing where the listing is being pulled during file writes? (I'm assuming directory listings are being cached in some way.)

drakkan commented 2 weeks ago

Thanks for testing

I'm afraid I'm unable to reproduce the error on that server using any of my SFTP clients.

That server is using the latest development version, not 2.6.2. You said you tested the development version too. Can you please post the output of sftpgo --version so I can see the exact commit you tested?

Ok, if you refresh the file list while there is no upload in progress, does the problem go away or do you still see duplicates? For example, copy the contents of the folder where you see the problem to a test folder where no upload is expected.

However, please note that this type of support is for users with a support plan. I care about the project and I believe in Open Source, so since the problem is reported by several users I am trying to understand whether this is a real bug. I also think that what you reported has already been fixed in the development version, so unless you prove otherwise or give me reason to doubt it, I may stop responding. Thanks for understanding.

trondvh commented 2 weeks ago

I was also running a Compute Engine VM on Debian 12 Bookworm, following the APT install instructions exactly as described. To troubleshoot, I tried using version 2.6.2 in a Docker container, set up on the same VM and pointing to the same data backend and Postgres database. Surprisingly, the issue did not occur in this setup—the directory listings were correct across all SFTP clients (WinSCP, FileZilla, and the sftp CLI).

drakkan commented 6 days ago

Please test with v2.6.3. Thank you

CVM commented 5 days ago

I can confirm that v2.6.3 fixes the issue for me.

Many thanks for the time you have been spending looking into this, @drakkan - much appreciated!

drakkan commented 5 days ago

Thanks for confirming.
