readium / readium-lcp-server

Repository for the Readium LCP Server
BSD 3-Clause "New" or "Revised" License
73 stars 58 forks source link

Publication encrypted file gets deleted by accident when using fs storage #322

Closed ivstiv closed 1 year ago

ivstiv commented 1 year ago

I noticed that when I try to upload an epub file through the frontend it disappears after a second. When I stopped the lcpserver the file stopped disappearing, so I dug deeper to understand whether it is a faulty configuration or a bug. Here is what I found going backwards from the issue to the cause:

Upon confirmation of a publication the frontend server pings this lcpserver endpoint and somehow the publication arrives with Storage_none value. This in terms calls cleanupTempFile() to clean up the actual file. Kind of weird given that we just saved it on the other side but ok lets run with it. https://github.com/readium/readium-lcp-server/blob/ea09f7a59d4faa3b602e86aebd734f46a2a148da/lcpserver/api/store.go#L132-L150

If we take a step backwards and find where the frontend server called the lcpserver we end up here. Nothing concerning right? We encrypt the upload and ping the other side with the publication data. https://github.com/readium/readium-lcp-server/blob/ea09f7a59d4faa3b602e86aebd734f46a2a148da/frontend/webpublication/webpublication.go#L107-L124

That ProcessEncryption looks peculiar with so many empty values. Lets actually look into what the encryption process looks like. And wooow look at that we actually set the StorageMode of the publication to Storage_none by default which also happens to be the only place in the whole project where this happens! https://github.com/readium/readium-lcp-server/blob/ea09f7a59d4faa3b602e86aebd734f46a2a148da/encrypt/process_encrypt.go#L69-L89

Ok but how come the storageRepo is empty? I am confident I set that in the yaml config files! Lets jump a step backwards and see what we passed again to ProcessEncryption. Ahh yes the 6th parameter storageRepo seems to have a hardcoded empty value which would always lead the frontend server to upload & encrypt a publication, pass that publication with StorageMode set to Storage_none only, so that the file of the publication gets deleted by the lcp server by accident. https://github.com/readium/readium-lcp-server/blob/ea09f7a59d4faa3b602e86aebd734f46a2a148da/encrypt/process_encrypt.go#L29

Is this what the FIXME comment refers to? https://github.com/readium/readium-lcp-server/blob/ea09f7a59d4faa3b602e86aebd734f46a2a148da/frontend/webpublication/webpublication.go#L110

I guess for now I can move over to S3 storage instead? If my theory is right I might be able to open a PR to fix it, just let me know.

llemeurfr commented 1 year ago

Initial remark: we developed the Frontend Test Server some years ago as a test & demo tool; it was developed quickly and I must say badly, therefore it must never be used in production. Only for tests!

The Frontend Test Server embeds the encryption process, which is also used in the lcpencrypt command line tools. This process is used by the frontend in a very specific way, which explains the number of empty values. The frontend does not specify a (final) storage repository, it requests the encryption process to put the output file in a temp folder; this is why Storage_none is set.

When the frontend notifies the License Server, it passes a path to the encrypted file (in the temp folder). The License Server takes this file and moves it to the target folder (which must be accessible from the Web so that content can be downloaded by a client app). The target folder must therefore be set in the License Server configuration file via the "storage" section described in https://github.com/readium/readium-lcp-server.

So yes, the file is deleted from the temp folder but should appear in the target folder. Isn't it the case?

Note: we'll simplify the complex process in the next major revision of the LCP Server codebase.

ivstiv commented 1 year ago

I understand now, yes so my problem was that the testfrontend and the lcpserver had the same directory specified for encrypted files. It does make sense for the behaviour to delete the file now once I understand all of the moving pieces but initially It seemed intuitive to me for the 2 services to point to the same directory for encrypted content.

You can close this issue unless you want to use it as reference in a future set of changes. Thank you for the response!

llemeurfr commented 1 year ago

Thanks for your message; you're welcome. I'll close the issue now.