Closed babolivier closed 1 year ago
So even though the test passes, this currently doesn't work in manual testing because Twisted gets in the way, resulting in:
Traceback (most recent call last):
  File "/home/babolivier/Documents/matrix/mcs-python-final/matrix_content_scanner/servlets/__init__.py", line 78, in _async_render
    code, response = await method_handler(request)
  File "/home/babolivier/Documents/matrix/mcs-python-final/matrix_content_scanner/servlets/scan.py", line 44, in on_GET
    await self._scanner.scan_file(media_path)
  File "/home/babolivier/Documents/matrix/mcs-python-final/matrix_content_scanner/scanner/scanner.py", line 164, in scan_file
    return await self._current_scans[cache_key]
RuntimeError: await wasn't used with future
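For context, here is a minimal sketch of one way this error message can arise. It is not a confirmed diagnosis of what Twisted does here. It only shows that a pending asyncio `Future` raises this `RuntimeError` when the coroutine awaiting it is resumed by something other than an asyncio task before the future is done. The `wait_for` helper and the manual `send()` calls are illustrative assumptions and do not come from the project.

```python
import asyncio


async def wait_for(fut):
    # Hypothetical helper: simply awaits the given asyncio Future.
    return await fut


# Create a Future bound to a loop that we never actually run.
loop = asyncio.new_event_loop()
fut = loop.create_future()

coro = wait_for(fut)

# Drive the coroutine by hand, the way a driver that is not an asyncio
# task might. On the first step, the pending Future yields itself.
yielded = coro.send(None)

# Resuming the coroutine while the Future is still pending makes the
# Future raise, because nothing waited for it to complete.
message = None
try:
    coro.send(None)
except RuntimeError as exc:
    message = str(exc)

loop.close()
print(message)
```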
After discussing in #synapse-dev, we came to the conclusion that it would probably be best to just get rid of Twisted altogether and use a more asyncio-friendly web framework (since we exclusively use Twisted as such) like aiohttp instead. This PR is therefore on hold till this change has happened.
I've run some manual testing of this branch with the changes from #39, and can confirm it works now that Twisted is out of the equation!
Fixes #23
As of #19, we cache (most) scan results in memory to avoid having to download and scan the same file twice (with the caveat that we don't cache the contents of big files, but it still saves scanning them). However, it does not prevent duplicate work if multiple requests for the same file happen simultaneously, since we only update this cache when a scan completes. This is a legitimate concern, since we can easily imagine a situation where a user posts a large file into a room and multiple users try to download it at the same time.
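As of this change, when a request to scan a file comes in:
- if no scan is in progress for that file, we create a `Future` that we store into a new `_current_scans` cache, and run the scan. Once the scan completes, we resolve the `Future` with the result, and remove it from the cache.
- if a scan is already in progress for that file, we retrieve the `Future` from the `_current_scans` cache and `await` it. We then return its result.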
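The behaviour described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from this PR. Only `scan_file`, `_current_scans` and `cache_key` appear in the traceback above. The `Scanner` class shape, `_do_scan`, `scan_count` and `demo` are hypothetical stand-ins, and the cache key is assumed to be the media path.

```python
import asyncio


class Scanner:
    def __init__(self):
        # Maps a cache key to the Future of the scan currently in progress.
        self._current_scans = {}
        # Hypothetical counter, only here to show how many scans really ran.
        self.scan_count = 0

    async def _do_scan(self, media_path):
        # Hypothetical stand-in for downloading and scanning the file.
        self.scan_count += 1
        await asyncio.sleep(0.01)
        return f"clean:{media_path}"

    async def scan_file(self, media_path):
        cache_key = media_path  # assumption: the key is just the path

        # A scan is already in progress: wait for it and reuse its result.
        if cache_key in self._current_scans:
            return await self._current_scans[cache_key]

        # No scan in progress: register a Future, then run the scan.
        fut = asyncio.get_running_loop().create_future()
        self._current_scans[cache_key] = fut
        try:
            result = await self._do_scan(media_path)
        except Exception as exc:
            # Propagate the failure to any waiting requests.
            fut.set_exception(exc)
            # Mark the exception as retrieved in case nobody is waiting.
            fut.exception()
            raise
        else:
            fut.set_result(result)
            return result
        finally:
            # If the scan was cancelled, do not leave waiters hanging.
            if not fut.done():
                fut.cancel()
            del self._current_scans[cache_key]


async def demo():
    scanner = Scanner()
    # Three simultaneous requests for the same file.
    results = await asyncio.gather(*(scanner.scan_file("a") for _ in range(3)))
    return results, scanner.scan_count, len(scanner._current_scans)
```

Running `asyncio.run(demo())` issues three concurrent requests for the same path. In this sketch only the first one triggers `_do_scan`, and the other two wait on the stored `Future`.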