justinludwig / jpgpj

Java Pretty Good Privacy Jig
MIT License

Is the Encryptor synchronous or asynchronous? #14

Closed amorillas closed 6 years ago

amorillas commented 6 years ago

I'm working on an application that encrypts files with a symmetric passphrase and AES256 and, on the next line, uploads the encrypted file to an external repository (let's say Amazon S3). I've been getting errors saying that some encrypted files (specifically big files) don't exist and therefore can't be uploaded.

I thought the encryptor was synchronous. I've searched the code and haven't found anything that suggests it is asynchronous, but I can't think of anything else that could cause the issue. I've added some control code that waits for the encrypted file to be ready, and it works fine:

    // Encryption code
    long totalTimeWaiting = 0;
    while (!fileToUpload.exists() && totalTimeWaiting <= MAX_WAIT) {
        long timeToWait = TIME_TO_WAIT;
        Thread.sleep(timeToWait);
        totalTimeWaiting += timeToWait;
    }
    // Upload code

I don't like this kind of code, but it's the best solution I've found so far. If the encryptor is meant to be synchronous, sorry, but it seems it is not. If it's asynchronous, I would like to know if there's a better approach for waiting for the file to be encrypted.

Thanks in advance!

justinludwig commented 6 years ago

That sounds like a pretty gnarly issue to debug :(. Encryption/decryption is synchronous in JPGPJ, however. My best guess would be some sort of filesystem caching or latency -- especially if the filesystem is NFS or something similar, it might take a little while for a large file to be available for reading right after it has been written.

You probably can't avoid the extra wait code if that's the case, but writing the file in larger chunks might help avoid waiting as much. The Encryptor.encrypt(File, File) method reads and writes files in 4K chunks -- if you're hitting this issue mainly with large files, it might help to increase that buffer quite a bit -- say to 512K, or maybe even larger, like several megabytes. Unfortunately, right now that 4K size is hardcoded -- but you could fairly easily duplicate the functionality of the Encryptor.encrypt(File, File) method in your own code. It just opens the files as streams (and extracts the file metadata from the input file), and then calls the Encryptor.encrypt(InputStream, OutputStream, FileMetadata) method:

https://github.com/justinludwig/jpgpj/blob/959220314f529a6940d31a41c5efbd014b5b3225/src/main/java/org/c02e/jpgpj/Encryptor.java#L257
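For illustration, here is a minimal sketch of that kind of wrapper with a larger buffer. It is not the library's code: `StreamEncryptor` and `LargeBufferFileEncryption` are hypothetical names, and the interface is a stand-in for the stream-based encrypt call so the example compiles without JPGPJ on the classpath. Whether enlarging the buffered-stream size alone changes the chunk size that matters depends on the library internals, so treat the 512K value as something to experiment with.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical stand-in for a stream-based encrypt call (not a JPGPJ type). */
interface StreamEncryptor {
    void encrypt(InputStream plaintext, OutputStream ciphertext) throws Exception;
}

/** Hypothetical helper: opens both files as buffered streams with a large buffer. */
final class LargeBufferFileEncryption {
    // 512K, the figure suggested in this thread; tune it for your filesystem.
    static final int BUFFER_SIZE = 512 * 1024;

    static void encrypt(File plaintext, File ciphertext, StreamEncryptor encryptor)
            throws Exception {
        if (plaintext.equals(ciphertext)) {
            throw new IOException("cannot encrypt " + plaintext + " over itself");
        }
        // try-with-resources flushes and closes both streams before returning,
        // so the output file is fully written by the time this method exits.
        try (InputStream in = new BufferedInputStream(
                     new FileInputStream(plaintext), BUFFER_SIZE);
             OutputStream out = new BufferedOutputStream(
                     new FileOutputStream(ciphertext), BUFFER_SIZE)) {
            encryptor.encrypt(in, out);
        }
    }
}
```

With JPGPJ, the third argument would presumably be a lambda that forwards to Encryptor.encrypt(InputStream, OutputStream, FileMetadata); building the FileMetadata from the input file is left out of this sketch.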

amorillas commented 6 years ago

Thank you @justinludwig, it seems you're right, now that I know the encryption process is indeed synchronous.

I can run some tests in my application with a bigger buffer by copying your code, but I think it would be better to include this option in jpgpj. Would you accept a Pull Request with another method where users can specify the buffer size? This would let them choose the buffer size that fits their needs. In my case I'm working with both big and small files, and maybe I can adjust the buffer size on the fly for each file to improve performance.

Going further, another implementation could take a maximum buffer size (say 512KB) and let jpgpj choose the buffer size according to the file size. For example, for files up to 512KB the buffer size would be the file size, and for files larger than 512KB it would be the specified maximum. This approach would optimize memory use and IO performance.
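The sizing rule described here can be sketched as a small helper. `BufferSizes.choose` is a hypothetical name used only for illustration, and the floor of one byte for empty files is an assumption added because Java's buffered streams reject a buffer size of zero.

```java
/** Hypothetical helper illustrating the proposed sizing rule. */
final class BufferSizes {
    /**
     * Returns the file size when the file fits within the maximum,
     * otherwise the maximum. Never returns less than 1.
     */
    static int choose(long fileSize, int maxBufferSize) {
        if (maxBufferSize < 1) {
            throw new IllegalArgumentException("maxBufferSize must be positive");
        }
        // The result is at most maxBufferSize, so the cast to int is safe.
        return (int) Math.max(1L, Math.min(fileSize, (long) maxBufferSize));
    }
}
```

For example, with a 512KB maximum, a 100-byte file would get a 100-byte buffer, while a multi-gigabyte file would get the full 512KB.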

justinludwig commented 6 years ago

I like those ideas! I think the best way to structure it would be to add a property to the encryptor and decryptor objects that allowed the max buffer size for the file input/output streams used by the Encryptor.encrypt(File, File) and Decryptor.decrypt(File, File) methods to be configured (rather than adding another version of those methods).
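As a rough sketch of the property-based shape, assuming a plain getter/setter pair: `FileBufferSettings`, the property name, and the 1MB default are all hypothetical here and are not taken from the JPGPJ source.

```java
/** Hypothetical settings holder; not JPGPJ's actual class. */
final class FileBufferSettings {
    // Assumed default for this sketch only: 1MB.
    private int maxFileBufferSize = 1024 * 1024;

    int getMaxFileBufferSize() {
        return maxFileBufferSize;
    }

    void setMaxFileBufferSize(int maxFileBufferSize) {
        if (maxFileBufferSize < 1) {
            throw new IllegalArgumentException("max buffer size must be positive");
        }
        this.maxFileBufferSize = maxFileBufferSize;
    }
}
```

Keeping the limit as a property means the existing File-based method signatures stay unchanged, and callers who never set it keep the default behavior.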

justinludwig commented 6 years ago

Thanks again for the PR! -- I hope it's helped with your use-case. I'm going to close this issue now (but feel free to re-open if you have additional thoughts or questions about it).