sbsdev / daisyproducer2

An integrated production management system for accessible media
GNU Affero General Public License v3.0

Race condition in `unknown/get-words` #87

Closed egli closed 3 years ago

egli commented 3 years ago

`daisyproducer2.words.unknown/get-words` first deletes every row in the `dictionary_unknownword` table, then inserts all words from the document, and finally joins the result with `dictionary_localword` and `dictionary_globalword`.

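If another process invokes `get-words` at the same time, the two requests interfere: one request's delete can wipe the words the other has just inserted, and the join can then return words from the wrong document.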
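The interleaving can be reproduced deterministically with a small simulation. This is only a sketch: it uses Python's `sqlite3` as a stand-in database, the two-column schema is a simplified assumption, and `delete_all`, `insert_words` and `select_words` are hypothetical helpers standing in for the three steps of `get-words` (the real join is much larger).

```python
import sqlite3

# Hypothetical, simplified schema; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dictionary_unknownword (untranslated TEXT, document_id INTEGER)"
)

def delete_all(conn):
    # Step 1 of get-words: empty the whole table, regardless of document.
    conn.execute("DELETE FROM dictionary_unknownword")

def insert_words(conn, document_id, words):
    # Step 2: insert every word found in the document.
    conn.executemany(
        "INSERT INTO dictionary_unknownword VALUES (?, ?)",
        [(w, document_id) for w in words],
    )

def select_words(conn):
    # Step 3: stand-in for the big join; it does not filter by document.
    rows = conn.execute(
        "SELECT untranslated FROM dictionary_unknownword ORDER BY untranslated"
    )
    return [r[0] for r in rows]

# Request A (document 1) and request B (document 2), interleaved by hand.
delete_all(conn)                   # A
insert_words(conn, 1, ["alpha"])   # A
delete_all(conn)                   # B wipes A's rows
insert_words(conn, 2, ["beta"])    # B
result_for_a = select_words(conn)  # A now reads B's words
print(result_for_a)                # ['beta']
```

Request A asked about document 1 but gets back only the word from document 2.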

Any of the following might work as a solution:

  1. run all statements inside a transaction
  2. make the delete aware of document_id
  3. use a temporary table

egli commented 3 years ago

The first solution would imply locking the entire `dictionary_unknownword` table, since we delete all records at the beginning of the request. I don't know if we want that.

The second solution would require dropping the sort-of lazy approach to emptying the `dictionary_unknownword` table: we'd have to clean up after ourselves within the request itself.
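A minimal sketch of what the second solution could look like, again using Python's `sqlite3` as a stand-in and an assumed two-column schema. The `get_words` function here is hypothetical and not the project's code: it scopes both the select and the delete to one `document_id` and removes its own rows before returning.

```python
import sqlite3

# Hypothetical, simplified schema; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dictionary_unknownword (untranslated TEXT, document_id INTEGER)"
)

def get_words(conn, document_id, words):
    # Insert only this document's words.
    conn.executemany(
        "INSERT INTO dictionary_unknownword VALUES (?, ?)",
        [(w, document_id) for w in words],
    )
    try:
        # The query must also be restricted to this document.
        rows = conn.execute(
            "SELECT untranslated FROM dictionary_unknownword "
            "WHERE document_id = ? ORDER BY untranslated",
            (document_id,),
        ).fetchall()
    finally:
        # Clean up within the request instead of lazily on the next call.
        conn.execute(
            "DELETE FROM dictionary_unknownword WHERE document_id = ?",
            (document_id,),
        )
    return [r[0] for r in rows]

# Rows left behind by another in-flight request (document 2).
conn.execute("INSERT INTO dictionary_unknownword VALUES ('beta', 2)")

words = get_words(conn, 1, ["gamma", "alpha"])
remaining = conn.execute(
    "SELECT untranslated, document_id FROM dictionary_unknownword"
).fetchall()
print(words)      # ['alpha', 'gamma']
print(remaining)  # [('beta', 2)]
```

Rows belonging to other documents are left alone. Two simultaneous requests for the same document would still collide in this sketch.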

egli commented 3 years ago

The second solution doesn't work as currently implemented: we also need to change the query that fetches the unknown words.

And that change is quite substantial: the query is huge to begin with, and the additional `AND unknown.document_id = :document-id` would have to be added in numerous places.

It seems much simpler to rely on transaction isolation and wrap the deletion and select in a simple `with-transaction`.