Open jvillafanez opened 3 days ago
Use read/write mutex. The
Get
method would block until thePut
method has finished.
- Note that we need to use https://pkg.go.dev/sync#RWMutex to prevent
Get
s from locking each other.- We might also want to use locks based on the target key / file instead of a global lock.
This might not work. There might be multiple thumbnails services running all using the same shared storage for the generated thumbnails.
This should work without any locking:
In the worst case (multiple clienst requesting the same thumbnail before the first completed) we would generate the thumbnail multiple times. But we would never return partial or empty content.
This might not work. There might be multiple thumbnails services running all using the same shared storage for the generated thumbnails.
Yes, you're right. I usually forget about the replicas. Maybe we should consider to use distributed locks at some point.
This should work without any locking:
Write the complete thumbnail to a tempfile then rename it to the final name. (At least for Posix compliant filesystems rename is atomic)
In the worst case (multiple clients requesting the same thumbnail before the first completed) we would generate the thumbnail multiple times. But we would never return partial or empty content.
Looks good to me :+1:
Describe the bug
The issue might happen when we're getting a thumbnails from the storage at the same time the thumbnails is being generated. Target code is around https://github.com/owncloud/ocis/blob/master/services/thumbnails/pkg/thumbnail/storage/filesystem.go#L44-L77
The sequence of actions is more or less the following:
os.Create
call at https://github.com/owncloud/ocis/blob/master/services/thumbnails/pkg/thumbnail/storage/filesystem.go#L65)Get
method is executed (https://github.com/owncloud/ocis/blob/master/services/thumbnails/pkg/thumbnail/storage/filesystem.go#L44) so we return the 0 byte file that was created in the previous step.Put
method finishes later, writing the thumbnail in the file.There might be other variations because we might not guarantee that the
file.Write
method is atomic, so there could be multiple write operations in order to save the thumbnail. This could mean that partial data could be retrieved by theGet
method.Steps to reproduce
No easy way to reproduce the issue. The race condition happens when getting the thumbnails at the same time it's being created. 1. 5. 6.
Expected behavior
Either we get an error because the thumbnail is missing (or not created yet), or we get the whole thumbnail.
Actual behavior
A race condition might happen, so we could send 0 bytes or partial data.
This might happen only while the thumbnail is created, or more exactly, when the data is being saved in the FS. The thumbnails are expected to have small size, so the time window for the race condition to happen is expected to be very small.
Setup
Please describe how you started the server and provide a list of relevant environment variables or configuration files.
```console OCIS_XXX=somevalue OCIS_YYY=somevalue PROXY_XXX=somevalue ```
Additional context
We have some options to tackle this issue:
Get
method would block until thePut
method has finished.Get
s from locking each other.