Open nickmza opened 1 year ago
Hmm, we've found this to be unreliable
2024-08-29T16:33:08.422486723Z 2024-08-29 16:33:08.422 WARN 16 --- [.net:22/scanned] o.a.c.c.f.r.SftpConsumer : sftp://xxx@exxx.blob.core.windows.net:22/scanned?delay=300000&initialDelay=300000&knownHostsFile=%2Froot%2F.ssh%2Fknown_hosts&move=..%2FuploadOk&moveFailed=..%2FuploadFailed&password=xxxxxx&preMove=uploadInProgress&strictHostKeyChecking=yes cannot begin processing file: RemoteFile[1010113855.pdf] due to: Cannot rename file from: scanned/1010113855.pdf to: scanned/uploadInProgress/1010113855.pdf. Exchange[]. Caused by: [org.apache.camel.component.file.GenericFileOperationFailedException - Cannot rename file from: scanned/1010113855.pdf to: scanned/uploadInProgress/1010113855.pdf]
2024-08-29T16:33:08.422516449Z
2024-08-29T16:33:08.422520276Z org.apache.camel.component.file.GenericFileOperationFailedException: Cannot rename file from: scanned/1010113855.pdf to: scanned/uploadInProgress/1010113855.pdf
2024-08-29T16:33:08.422523301Z at org.apache.camel.component.file.remote.SftpOperations.renameFile(SftpOperations.java:553) ~[camel-ftp-4.4.1.jar:4.4.1]
2024-08-29T16:33:08.422525706Z at org.apache.camel.component.file.strategy.GenericFileProcessStrategySupport.renameFile(GenericFileProcessStrategySupport.java:146) ~[camel-file-4.4.1.jar:4.4.1]
2024-08-29T16:33:08.422528261Z at org.apache.camel.component.file.strategy.GenericFileRenameProcessStrategy.begin(GenericFileRenameProcessStrategy.java:47) ~[camel-file-4.4.1.jar:4.4.1]
2024-08-29T16:33:08.422530465Z at org.apache.camel.component.file.GenericFileConsumer.processExchange(GenericFileConsumer.java:387) [camel-file-4.4.1.jar:4.4.1]
2024-08-29T16:33:08.422532779Z at org.apache.camel.component.file.remote.RemoteFileConsumer.processExchange(RemoteFileConsumer.java:153) [camel-ftp-4.4.1.jar:4.4.1]
2024-08-29T16:33:08.422535023Z at org.apache.camel.component.file.GenericFileConsumer.processBatch(GenericFileConsumer.java:244) [camel-file-4.4.1.jar:4.4.1]
2024-08-29T16:33:08.422537298Z at org.apache.camel.component.file.GenericFileConsumer.poll(GenericFileConsumer.java:205) [camel-file-4.4.1.jar:4.4.1]
2024-08-29T16:33:08.422539502Z at org.apache.camel.support.ScheduledPollConsumer.doRun(ScheduledPollConsumer.java:204) [camel-support-4.4.1.jar:4.4.1]
2024-08-29T16:33:08.422549570Z at org.apache.camel.support.ScheduledPollConsumer.run(ScheduledPollConsumer.java:118) [camel-support-4.4.1.jar:4.4.1]
2024-08-29T16:33:08.422552245Z at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
2024-08-29T16:33:08.422554590Z at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
2024-08-29T16:33:08.422556854Z at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
2024-08-29T16:33:08.422559118Z at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
2024-08-29T16:33:08.422561472Z at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
2024-08-29T16:33:08.422563857Z at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
2024-08-29T16:33:08.422566372Z Caused by: com.jcraft.jsch.SftpException: SourcePathNotFound: The source path for a rename operation does not exist. - RequestId:19eb03a6-d01e-0091-0031-fa1172000000 Time:2024-08-29T16:33:08.4124334Z
2024-08-29T16:33:08.422568926Z at com.jcraft.jsch.ChannelSftp.throwStatusError(ChannelSftp.java:2829) ~[jsch-0.2.16.jar:0.2.16]
2024-08-29T16:33:08.422571110Z at com.jcraft.jsch.ChannelSftp.rename(ChannelSftp.java:1898) ~[jsch-0.2.16.jar:0.2.16]
2024-08-29T16:33:08.422573314Z at org.apache.camel.component.file.remote.SftpOperations.renameFile(SftpOperations.java:549) ~[camel-ftp-4.4.1.jar:4.4.1]
2024-08-29T16:33:08.422575729Z ... 14 more
@danhaywood Hmm, we've found this to be unreliable
2024-08-29T16:33:08.422486723Z 2024-08-29 16:33:08.422 WARN 16 --- [.net:22/scanned] o.a.c.c.f.r.SftpConsumer : sftp://xxx@exxx.blob.core.windows.net:22/scanned?delay=300000&initialDelay=300000&knownHostsFile=%2Froot%2F.ssh%2Fknown_hosts&move=..%2FuploadOk&moveFailed=..%2FuploadFailed&password=xxxxxx&preMove=uploadInProgress&strictHostKeyChecking=yes cannot begin processing file: RemoteFile[1010113855.pdf] due to: Cannot rename file from: scanned/1010113855.pdf to: scanned/uploadInProgress/1010113855.pdf. Exchange[]. Caused by: [org.apache.camel.component.file.GenericFileOperationFailedException - Cannot rename file from: scanned/1010113855.pdf to: scanned/uploadInProgress/1010113855.pdf] 2024-08-29T16:33:08.422516449Z 2024-08-29T16:33:08.422520276Z org.apache.camel.component.file.GenericFileOperationFailedException: Cannot rename file from: scanned/1010113855.pdf to: scanned/uploadInProgress/1010113855.pdf 2024-08-29T16:33:08.422523301Z at org.apache.camel.component.file.remote.SftpOperations.renameFile(SftpOperations.java:553) ~[camel-ftp-4.4.1.jar:4.4.1] 2024-08-29T16:33:08.422525706Z at org.apache.camel.component.file.strategy.GenericFileProcessStrategySupport.renameFile(GenericFileProcessStrategySupport.java:146) ~[camel-file-4.4.1.jar:4.4.1] 2024-08-29T16:33:08.422528261Z at org.apache.camel.component.file.strategy.GenericFileRenameProcessStrategy.begin(GenericFileRenameProcessStrategy.java:47) ~[camel-file-4.4.1.jar:4.4.1] 2024-08-29T16:33:08.422530465Z at org.apache.camel.component.file.GenericFileConsumer.processExchange(GenericFileConsumer.java:387) [camel-file-4.4.1.jar:4.4.1] 2024-08-29T16:33:08.422532779Z at org.apache.camel.component.file.remote.RemoteFileConsumer.processExchange(RemoteFileConsumer.java:153) [camel-ftp-4.4.1.jar:4.4.1] 2024-08-29T16:33:08.422535023Z at org.apache.camel.component.file.GenericFileConsumer.processBatch(GenericFileConsumer.java:244) [camel-file-4.4.1.jar:4.4.1] 2024-08-29T16:33:08.422537298Z at org.apache.camel.component.file.GenericFileConsumer.poll(GenericFileConsumer.java:205) [camel-file-4.4.1.jar:4.4.1] 2024-08-29T16:33:08.422539502Z at org.apache.camel.support.ScheduledPollConsumer.doRun(ScheduledPollConsumer.java:204) [camel-support-4.4.1.jar:4.4.1] 2024-08-29T16:33:08.422549570Z at org.apache.camel.support.ScheduledPollConsumer.run(ScheduledPollConsumer.java:118) [camel-support-4.4.1.jar:4.4.1] 2024-08-29T16:33:08.422552245Z at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?] 2024-08-29T16:33:08.422554590Z at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?] 2024-08-29T16:33:08.422556854Z at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?] 2024-08-29T16:33:08.422559118Z at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?] 2024-08-29T16:33:08.422561472Z at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?] 2024-08-29T16:33:08.422563857Z at java.base/java.lang.Thread.run(Thread.java:840) [?:?] 2024-08-29T16:33:08.422566372Z Caused by: com.jcraft.jsch.SftpException: SourcePathNotFound: The source path for a rename operation does not exist. - RequestId:19eb03a6-d01e-0091-0031-fa1172000000 Time:2024-08-29T16:33:08.4124334Z 2024-08-29T16:33:08.422568926Z at com.jcraft.jsch.ChannelSftp.throwStatusError(ChannelSftp.java:2829) ~[jsch-0.2.16.jar:0.2.16] 2024-08-29T16:33:08.422571110Z at com.jcraft.jsch.ChannelSftp.rename(ChannelSftp.java:1898) ~[jsch-0.2.16.jar:0.2.16] 2024-08-29T16:33:08.422573314Z at org.apache.camel.component.file.remote.SftpOperations.renameFile(SftpOperations.java:549) ~[camel-ftp-4.4.1.jar:4.4.1] 2024-08-29T16:33:08.422575729Z ... 14 more
We've had some challenges historically due to filesize (processing started before file was completely uploaded), threading and anti-virus locking the file. This was not with Camel though. What happens if you configure a retry in your error handler?
I don't think it'd work - we've had cases where the file has actually disappeared from blob storage completely! Our ops guy tells me that he can understand how that might occur with respect to how Azure emulates sftp on top of blob stores. My guess is that there's some eventual consistency going on, and if one sftp rename command follows on before the previous one has fully completed, then it messes something up, a race condition probably.
Luckily for us we only get a new file to process quite slowly; for now, our workaround when polling for new files is to have a healthy delay in between.
https://nickmck.net/2023/09/02/camel-sftp-azure/