The use of checksums in Neebla and SyncServerII

crspybits commented 2 years ago

In getting ready to add support for Solid to Neebla and SyncServerII, the question of how checksums are used has come up. I'm opening this issue to have a place to record the specifics of how checksums are currently used.

crspybits commented 2 years ago

Neebla: The client app itself doesn't directly use checksums.

iOSBasics: (The SyncServer API library) 1) On v0 file uploads, the server endpoint UploadFileRequest takes a checksum as input. The client computes the checksum from the file contents about to be uploaded.

2) On file downloads, the server's DownloadFileResponse provides a checksum so the client can recompute the checksum locally and compare it to the one coming back from the server. The client also receives a contentsChanged value computed on the server, based on checksums (see below).
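The two client-side uses above could be sketched roughly as follows. This is only an illustration, not iOSBasics code: the function names are hypothetical, and SHA-256 is an assumption, since the thread does not say which hash algorithm is actually used.

```python
import hashlib


def compute_checksum(data: bytes) -> str:
    """Hex digest of the file contents (SHA-256 is an assumed algorithm)."""
    return hashlib.sha256(data).hexdigest()


def download_is_intact(data: bytes, server_checksum: str) -> bool:
    """Recompute the checksum locally and compare it to the server's value."""
    return compute_checksum(data) == server_checksum.lower()


# Upload side: the checksum is computed from the contents about to be sent.
contents = b"example file contents"
upload_checksum = compute_checksum(contents)

# Download side: the same contents should reproduce the same checksum.
print(download_is_intact(contents, upload_checksum))
```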

Server: 1) On file uploads:

a) The checksum is used as part of a heuristic check, prior to uploading to cloud storage, for a duplicate v0 upload. That is, the checksum sent in the client request is compared to the lastUploadedCheckSum for a v0 file.

b) When handling a v0 upload, after uploading the file to the specific cloud storage, the checksum obtained from the specific cloud storage system upload is compared to that obtained from the client.

c) This checksum is saved in a server database table (lastUploadedCheckSum).

2) Applying mutable file changes

a) After applying changes, and uploading them, the resulting checksum from cloud storage is stored back again in lastUploadedCheckSum.

3) On file downloads:

a) A checksum is obtained from the cloud storage system as part of the download.

b) The lastUploadedCheckSum is compared with the checksum obtained from the cloud storage system download to test if the file was changed at rest (or possibly, during transmission). E.g., this can happen if a user goes in directly and modifies their file. This "contentsChanged" is passed back to the client along with the download data.
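The server-side steps 1) to 3) above could be sketched like this. It is a simplified model, not SyncServerII code: `FileRecord`, the handler names, and the `upload`/`download` callables standing in for a cloud storage client are all hypothetical, and only `lastUploadedCheckSum` and `contentsChanged` correspond to names from the description.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class FileRecord:
    """Hypothetical stand-in for the server database row of one file."""
    last_uploaded_checksum: Optional[str] = None  # lastUploadedCheckSum


class ChecksumMismatch(Exception):
    """Cloud storage reported a different checksum than the client sent."""


def handle_v0_upload(record: FileRecord, client_checksum: str, data: bytes,
                     upload: Callable[[bytes], str]) -> bool:
    """Return True if uploaded, False if treated as a duplicate v0 upload."""
    # 1a) Heuristic duplicate check before touching cloud storage.
    if record.last_uploaded_checksum == client_checksum:
        return False
    # 1b) Upload, then compare the storage-reported checksum to the client's.
    cloud_checksum = upload(data)
    if cloud_checksum != client_checksum:
        raise ChecksumMismatch(f"{cloud_checksum} != {client_checksum}")
    # 1c) Save the checksum for later comparisons.
    record.last_uploaded_checksum = cloud_checksum
    return True


def handle_change_upload(record: FileRecord, merged: bytes,
                         upload: Callable[[bytes], str]) -> None:
    # 2a) After applying changes and uploading, store the new checksum.
    record.last_uploaded_checksum = upload(merged)


def handle_download(record: FileRecord,
                    download: Callable[[], Tuple[bytes, str]]
                    ) -> Tuple[bytes, str, bool]:
    # 3a) The cloud storage download yields the data and a checksum.
    data, cloud_checksum = download()
    # 3b) A difference from the saved checksum means the file changed at rest.
    contents_changed = cloud_checksum != record.last_uploaded_checksum
    return data, cloud_checksum, contents_changed
```

Note that every step relies on the cloud storage system returning a checksum on upload and on download, which is the capability in question for Solid.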

crspybits commented 2 years ago

Kjetil Kjernsmo @kjetilk Sep 08 09:14 While RFC7232 is a requirement for Solid, it just says "An origin server SHOULD send an ETag for any selected representation for which detection of changes can be reasonably and consistently determined." (https://gitter.im/solid/specification)

crspybits commented 2 years ago

So this looks pretty much like a showstopper for checksums with Solid: there is no reliable way to get something like them from Solid servers.
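To illustrate the limitation: under RFC 7232 an ETag is an opaque validator, optionally marked weak with a `W/` prefix, so a client can compare two ETags from the same server but cannot recompute one from file contents the way it can with a checksum. A minimal sketch of parsing and strong comparison follows; the names `ETag`, `parse_etag`, and `strong_match` are hypothetical.

```python
from typing import NamedTuple


class ETag(NamedTuple):
    weak: bool
    opaque: str  # value chosen by the server; not necessarily a content hash


def parse_etag(header_value: str) -> ETag:
    """Split an ETag header value into its weak flag and opaque part."""
    value = header_value.strip()
    weak = value.startswith("W/")  # the weak prefix is case-sensitive
    if weak:
        value = value[2:]
    if len(value) < 2 or value[0] != '"' or value[-1] != '"':
        raise ValueError(f"malformed ETag: {header_value!r}")
    return ETag(weak, value[1:-1])


def strong_match(a: ETag, b: ETag) -> bool:
    """Strong comparison: neither tag is weak and the opaque parts are equal."""
    return not a.weak and not b.weak and a.opaque == b.opaque
```

A strong match between a stored ETag and a current one indicates the representation is unchanged, but only if the server actually sends strong ETags.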

kjetilk commented 2 years ago

There is another route that you can go, though. We are thinking about an extensible way to make "auxiliary resources", in which the server could write checksums. This would give protection against in-flight attacks, and some protection against some attacks on the server I suppose, but it wouldn't protect against a potentially malicious server... For that, we would need a pretty sophisticated key management system, which is difficult because the access control system is very granular.

crspybits commented 2 years ago

I'd definitely like to hear more, as this progresses, thank you! My use case might be easier, I'm not sure. I'm more focused on data integrity than attacks for the time being.

crspybits commented 2 years ago

Alexander James Phillips @AJamesPhillips 10:44 Does anyone know where the docs for etag are?

Jeff Zucker @jeff-zucker 10:49 @AJamesPhillips - https://solidproject.org/TR/protocol#writing-resources and indirectly in the RFCs mentioned at https://solidproject.org/TR/protocol#http in regard to conditional requests

Alexander James Phillips @AJamesPhillips 10:55 Thank you @jeff-zucker ! Ok so it does not look like it's officially part of the spec / clients yet.

Jeff Zucker @jeff-zucker 10:59 You should ask that of someone on the spec panel, but my understanding is that E-Tags are a MUST but that strong E-Tags are a MAY. AFAIK all Solid servers send ETags in the headers.
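For reference, the conditional requests mentioned above work roughly as sketched below, following RFC 7232: a GET with `If-None-Match` returns 304 when the stored ETag still matches, and a PUT with `If-Match` returns 412 when it no longer does. The helper names are hypothetical, and whether a particular Solid server honors these headers for a given resource is an assumption to verify.

```python
from typing import Dict


def conditional_get_headers(stored_etag: str) -> Dict[str, str]:
    """Ask the server to send the body only if the ETag has changed."""
    return {"If-None-Match": stored_etag}


def conditional_put_headers(stored_etag: str) -> Dict[str, str]:
    """Ask the server to apply the write only if the ETag is unchanged."""
    return {"If-Match": stored_etag}


def changed_since_stored(get_status: int) -> bool:
    """Interpret the status of a GET sent with If-None-Match."""
    if get_status == 304:  # Not Modified: stored ETag still matches
        return False
    if get_status == 200:  # a new representation was returned
        return True
    raise ValueError(f"unexpected status for conditional GET: {get_status}")


def write_was_applied(put_status: int) -> bool:
    """Interpret the status of a PUT sent with If-Match."""
    if put_status == 412:  # Precondition Failed: resource changed meanwhile
        return False
    if 200 <= put_status < 300:
        return True
    raise ValueError(f"unexpected status for conditional PUT: {put_status}")
```

`If-Match` uses strong comparison, so it cannot succeed against a weak ETag; that is where the distinction between ETags in general and strong ETags matters.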

AJamesPhillips commented 2 years ago

crspybits commented 2 years ago

https://forum.solidproject.org/t/etag-best-practice/2062

SyncServerII / Neebla #28