awslabs / aws-greengrass-labs-s3-file-uploader

Apache License 2.0

The last file is not uploaded until new file comes #4

Open sakh251 opened 1 year ago

sakh251 commented 1 year ago

Hey, thank you for your code.

I tested it, and I can deploy and run it on my Greengrass device. However, I noticed that the current (most recent) file is not uploaded until a new file arrives. In other words, one file always stays in the queue: when a new file arrives, the previous one is uploaded, but the new file then waits in the queue in turn.

I checked your code, but I have not found the bug yet.

023-03-24T14:28:02.747Z [WARN] (Copier) aws.greengrass.labs.s3.file.uploader: stderr. INFO:root:The current active file is : /home/FtpResults/test/ppppppppppppppppppppppp. {scriptName=services.aws.greengrass.labs.s3.file.uploader.lifecycle.Run, serviceName=aws.greengrass.labs.s3.file.uploader, currentState=RUNNING}

2023-03-24T14:28:02.747Z [WARN] (Copier) aws.greengrass.labs.s3.file.uploader: stderr. INFO:root:No new files to transfer. {scriptName=services.aws.greengrass.labs.s3.file.uploader.lifecycle.Run, serviceName=aws.greengrass.labs.s3.file.uploader, currentState=RUNNING}

Best Regards

Cyril-Lagrange commented 1 year ago

Hey,

The behavior you describe is the expected behavior, as documented in the README.md file. The difficulty is detecting when the process that produces the files has finished writing to a file. The convention I chose is that the most recent file matching the pattern you specified is the current file, and so might still be being written by the producing process. An option could be added to the component to change this behavior, but the writer would then first have to write to a temporary file whose name doesn't match the pattern and, once writing is complete, rename it. Line 120 in DirectoryUploader.py would have to be modified for this, so that it does not remove the latest file from the list of files to upload. I'll see if I can find some time to do this in the coming week.
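
To make the two halves of that idea concrete, here is a minimal sketch in Python. It is not the component's actual code: the function names `files_to_upload` and `write_then_rename`, the `skip_latest` option, and the use of modification time to decide which file is "most recent" are all assumptions made for illustration.

```python
import fnmatch
import os
import tempfile


def files_to_upload(directory, pattern, skip_latest=True):
    """Return files matching `pattern`, oldest first.

    Hypothetical helper: with skip_latest=True the newest file is held
    back because it may still be open for writing (the convention
    described above). With skip_latest=False every match is returned.
    "Newest" is decided by modification time, which is an assumption.
    """
    paths = [
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if fnmatch.fnmatch(name, pattern)
    ]
    paths = [p for p in paths if os.path.isfile(p)]
    paths.sort(key=os.path.getmtime)
    if skip_latest and paths:
        paths = paths[:-1]  # keep the presumed active file out of the list
    return paths


def write_then_rename(directory, final_name, data):
    """Producer side: write to a temporary name, then rename.

    The temporary file ends in ".tmp", so it does not match a pattern
    such as "*.csv". os.replace performs the rename in one step when
    source and destination are on the same filesystem, so the uploader
    only ever sees complete files under the final name.
    """
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before renaming
    final_path = os.path.join(directory, final_name)
    os.replace(tmp_path, final_path)
    return final_path
```

With a producer that writes this way, calling `files_to_upload(directory, "*.csv", skip_latest=False)` would be safe, since no file matching the pattern is ever partially written. Without that guarantee, holding back the latest file remains the safer default.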