Closed arjugane closed 2 months ago
Hi @arjugane
Hi @debora-ito, Thanks for responding.
1) I do not see the "Bytes transferred..." log reaching 100% ❌ Note: Today, for a 35 GB file, "Bytes transferred..." showed 25.10% and then went directly to "Transfer complete!"
2) Yes. For example, a file that was only half-downloaded at some point succeeds on a manual retry. Note: This time, I can see the "Bytes transferred..." log reaching 100% ✅
3) Yes, within the S3 bucket, these .mov files are very large and sit under a specific directory. There are also other .mp4/.mov files of varying size (MBs, GBs).
Example structure below ( sample s3 key for each file )
someDir/someDir2/ABCD/pkgName-1/ABCD_150.mp4 ( 49 MB )
someDir/someDir2/ABCD/pkgName-1/ABCD_1200.mp4 ( 384 MB )
someDir/someDir2/ABCD/pkgName-1/ABCD_3800.mp4 ( 1.2 GB )
someDir/someDir2/ABCD/pkgName-2/ABCD.mp4 ( 360 MB )
someDir/someDir2/ABCD/pkgName-3/ABCD.xml ( 1 KB )
someDir/someDir2/ABCD/pkgName-3/ABCD.json ( 5 KB )
someDir/someDir2/ABCD/pkgName-4/ABCD.mxf ( 25 GB )
someDir/someDir2/ABCD/pkgName-5/ABCD_HD.ts ( 1.3 GB )
someDir/someDir2/ABCD/pkgName-5/ABCD_SD.ts ( 860 MB )
someDir/someDir2/ABCD/pkgName-6/ABCD.mov ( 82 GB )
For the very large file case, directoryDownload only downloads the large file (usually a single file, such as pkgName-6/ABCD.mov in the example structure above).
That is, the directoryDownload for that file is not combined with other package directories and their files.
4) I will try to enable trace logs during this week. Thank you for the instructions.
Note : I've deployed following AWS SDK and AWS CRT versions in prod -
<aws-crt.version>0.30.8</aws-crt.version>
<aws-java-sdk.version>2.27.10</aws-java-sdk.version>
Furthermore, until this summer, this microservice was using AWS SDK 1.x with Java 17. If we roll back to that version, we never hit the problem described above (as it didn't use S3AsyncClient + S3TransferManager).
Recently I upgraded the service to AWS SDK 2.x with Java 21 (S3AsyncClient + S3TransferManager), using the AWS CRT builder.
Oh I see, the issue is that the transfer listener doesn't print "100%", it jumps from some percentage directly to "Transfer complete!".
I have some follow-up questions:
.downloadFileRequestTransformer(builder -> builder.addTransferListener(
    S3TransferListener.builder()
        .id(id)
        .destinationPath(dir)
        .s3URI(s3URI)
        .fileRecords(fileRecords)
        .build()))
S3TransferListener is your custom implementation? Is it possible that the issue is in this class? Just to set expectations, we won't debug custom classes.
Sure, @debora-ito,
1) Yes, currently S3TransferListener is a custom implementation of the TransferListener interface (AWS SDK 2.x).
Note: We had only enabled logs in this custom implementation for the transferInitiated, transferComplete & transferFailed interface methods.
Lately, since large files are not being completely downloaded, I added bytesTransferred to understand what's going on during directoryDownload (i.e., only one large file within that dir, as explained earlier), as there are no warnings or errors (related to the process getting interrupted, etc.).
2) Sure, of course, I could switch to calling LoggingTransferListener.create() instead of the custom S3TransferListener. I will try this during this week.
3) Never! In the failed cases, I can say the file is never completely downloaded because, after the downloadDirectory operation, I've added a simple ffmpeg check (more info at https://www.ffmpeg.org/) that verifies whether the file is complete or truncated.
Note: This ffmpeg file check is a temporary solution within our WF (so that we can manually retry the failed job if the file is not completely downloaded).
Also, it is surprising that SDK never throws any warning or error at all 😢
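As a cheaper complement to the ffmpeg check described above, the size of the downloaded file can be compared with the size the object is expected to have. The following is only a minimal sketch using plain JDK classes: `DownloadSizeCheck` and its methods are hypothetical names, and `expectedBytes` is assumed to come from elsewhere (for example, the object's content length as reported by S3). A matching size does not prove the content is intact, so it would not replace a media-level check.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class DownloadSizeCheck {

    enum Result { COMPLETE, TRUNCATED, LARGER_THAN_EXPECTED, MISSING }

    private DownloadSizeCheck() {}

    // Pure comparison of the local size against the expected object size.
    static Result compare(long localBytes, long expectedBytes) {
        if (localBytes < expectedBytes) {
            return Result.TRUNCATED;
        }
        if (localBytes > expectedBytes) {
            return Result.LARGER_THAN_EXPECTED;
        }
        return Result.COMPLETE;
    }

    // Reads the size of the file on disk and compares it with expectedBytes.
    // expectedBytes is an assumption: the caller has to supply it.
    static Result check(Path localFile, long expectedBytes) {
        if (!Files.isRegularFile(localFile)) {
            return Result.MISSING;
        }
        try {
            return compare(Files.size(localFile), expectedBytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A job could then be marked for retry whenever `check(...)` returns anything other than `COMPLETE`, before the more expensive ffmpeg pass runs.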
@arjugane let us know what you find with the CRT trace logging and if you see the same issue when using LoggingTransferListener.
This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.
Describe the bug
Using S3AsyncClient.crtBuilder(), TransferListener bytesTransferred does not reach 100% during the directoryDownload operation.
Expected Behavior
TransferListener should work as expected during downloadDirectory operation even for large files
Current Behavior
Problem statement: Using S3AsyncClient.crtBuilder(), TransferListener bytesTransferred does not reach 100% during directoryDownload, yet it reports "Transfer complete!".
Use case: Downloading very large files (e.g., 37 GB, 67 GB, 80 GB, and even more than 100 GB).
For one particular case (an 83 GB file), I saw the following logs, in sequence: 1) Transfer initiated... 2) Bytes transferred... (25.10%) 3) Bytes transferred... (50.08%) 4) Transfer complete! Note: in this case the downloaded file was incomplete.
There is neither a warning nor an error (FailedTransfers, etc.) during directoryDownload, so it has been very hard to tell where the underlying problem is.
Note that the issue has been very intermittent here.
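One way to make this symptom explicit in the logs is to compare the bytes counted so far with the expected total at the moment completion is reported. The sketch below is plain Java with hypothetical names (`CompletionCheck`, `percent`, `completedShort`, `describe`); it assumes the two numbers are available to the listener, for example from the progress snapshot passed to a TransferListener callback, and it is not the SDK's own logic.

```java
import java.util.OptionalLong;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class CompletionCheck {

    private CompletionCheck() {}

    // Percentage transferred, rounded to two decimals; 100.0 when the total is zero or negative.
    static double percent(long transferredBytes, long totalBytes) {
        if (totalBytes <= 0) {
            return 100.0;
        }
        return Math.round(transferredBytes * 10000.0 / totalBytes) / 100.0;
    }

    // True when a total is known and fewer bytes than that total were counted.
    static boolean completedShort(long transferredBytes, OptionalLong totalBytes) {
        return totalBytes.isPresent() && transferredBytes < totalBytes.getAsLong();
    }

    // Builds the log line to emit when completion is reported.
    static String describe(long transferredBytes, OptionalLong totalBytes) {
        if (!totalBytes.isPresent()) {
            return "Transfer complete, total size unknown (" + transferredBytes + " bytes counted)";
        }
        long total = totalBytes.getAsLong();
        String prefix = completedShort(transferredBytes, totalBytes)
                ? "WARN: completed short, "
                : "Transfer complete, ";
        return prefix + transferredBytes + "/" + total + " bytes (" + percent(transferredBytes, total) + "%)";
    }
}
```

Calling `describe(...)` from the listener's completion callback would turn a silent jump from roughly 50% to "Transfer complete!" into a warning line that can be searched for. Whether a short count means a truncated file or only missed progress updates still has to be confirmed against the file on disk.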
Reproduction Steps
Using following crtBuilder() builder -
Using following directoryDownload builder -
Possible Solution
No response
Additional Information/Context
No response
AWS Java SDK version used
2.26.15
JDK version used
21-latest
Operating System and version
linux/alpine