xwikisas / application-antivirus

Keeps your XWiki instance safe by scanning file attachments for viruses or malware.
GNU Lesser General Public License v2.1

Restart the scan job when the wiki server restarts during the scan #26

Closed AndreeaChi closed 2 years ago

AndreeaChi commented 3 years ago

At the moment, if the wiki server restarts during the scan job (which can take as long as 20 hours to complete), the scan does not restart and the email report that is normally sent at the end is never sent.

The proposal is to add a failure recovery process, such as the restart of the scan job if the wiki server restarts during the scan.

From @Enygma2002, there are the following suggestions. This would be a new feature that could be done at 2 levels of complexity:

1.1. simplistic - just restart from 0: save a flag of "scheduled scan running" inside an object of a document and unset it when the scan is done. At wiki startup, a listener can check whether such a flag is still set (i.e. it was not unset by the proper finish of a scan) and, if so, restart the scheduled scan, hoping that it will finish this time.

1.2. advanced - resume from where it was interrupted: rework the way the scan is made (wiki > document > attachment) and replace it with a fixed and sorted query that lists all attachments. Then, in addition to the simplistic flag above, the AntivirusJob would also read a wiki ID and an attachment query index (i.e. "query offset") from which to "resume" (i.e. not restart from scratch) the scanning of attachments. This approach would focus entirely on attachments and might even perform a bit better than the previous one, as it may get rid of queries on documents that have no attachments.
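The simplistic variant (1.1) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `ScanFlagStore`, `InMemoryScanFlagStore` and `RestartingScanRunner` are hypothetical names, not part of the antivirus application or of XWiki, and the in-memory store stands in for the flag that would be saved inside an object of a wiki document.

```java
/** Hypothetical stand-in for the "scheduled scan running" flag stored in a wiki document object. */
interface ScanFlagStore {
    boolean isScanRunning();
    void setScanRunning(boolean running);
}

/** In-memory store, used only to keep this sketch self-contained. */
class InMemoryScanFlagStore implements ScanFlagStore {
    private boolean running;

    public boolean isScanRunning() {
        return running;
    }

    public void setScanRunning(boolean running) {
        this.running = running;
    }
}

/** Hypothetical runner that restarts the scan from 0 if the flag was left set. */
class RestartingScanRunner {
    private final ScanFlagStore store;
    private final Runnable scan;

    RestartingScanRunner(ScanFlagStore store, Runnable scan) {
        this.store = store;
        this.scan = scan;
    }

    /** Sets the flag, runs the scan, and unsets the flag only after a proper finish. */
    void runScan() {
        store.setScanRunning(true);
        scan.run(); // if the server stops here, the flag stays set
        store.setScanRunning(false);
    }

    /** Meant to be called at startup; returns true if an interrupted scan was restarted. */
    boolean onStartup() {
        if (store.isScanRunning()) {
            runScan();
            return true;
        }
        return false;
    }
}
```

In a real deployment the flag would have to survive the restart, so the store would need to write to persistent wiki data rather than memory.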

The "focus on attachments" improvement from 1.2 above could be done even purely with the objective of improving performance. Otherwise, handling 1.2 would also ensure that the report sending step is always reached inside the AntivirusJob.

If approach 1.2 is selected (recommended), 2 extra lists would need to be stored along with the resume index/offset information: the list of detected infections (i.e. deleted attachments) so far and the list of failed deletions, both of which are needed for the report.
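A rough sketch of the state the resume approach would need to keep, again in plain Java and under assumptions: `ScanCheckpoint` and `ResumableScan` are hypothetical names, attachments are represented as strings from an already sorted list, a single wiki is assumed (so no wiki ID is stored), and the virus check and deletion are passed in as predicates instead of calling any real scanner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical checkpoint: the resume offset plus the two lists needed for the report. */
class ScanCheckpoint {
    int offset; // index of the next attachment to scan in the sorted list
    final List<String> infected = new ArrayList<>();        // detected and deleted so far
    final List<String> failedDeletions = new ArrayList<>(); // detected but not deleted
}

class ResumableScan {
    /**
     * Scans from checkpoint.offset to the end of the sorted attachment list.
     * The offset advances only after an attachment is fully handled, so an
     * interrupted run can be resumed by calling scan again with the same checkpoint.
     */
    static void scan(List<String> sortedAttachments, ScanCheckpoint checkpoint,
                     Predicate<String> isInfected, Predicate<String> delete) {
        while (checkpoint.offset < sortedAttachments.size()) {
            String attachment = sortedAttachments.get(checkpoint.offset);
            if (isInfected.test(attachment)) {
                if (delete.test(attachment)) {
                    checkpoint.infected.add(attachment);
                } else {
                    checkpoint.failedDeletions.add(attachment);
                }
            }
            checkpoint.offset++; // a real job would persist the checkpoint at this point
        }
    }
}
```

Because the lists are part of the checkpoint, the report built after a resumed scan would still cover infections found before the interruption.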