nirbarazida / Data-mining-project

ITC - Data mining project

Integrity check skips users #14

Open InbarShirizly opened 4 years ago

InbarShirizly commented 4 years ago

The pictures below show that users who should have been inserted into the database as usual were skipped.

image

image

InbarShirizly commented 4 years ago

The problem: in multi-processing mode, several processes query the database for the same location. Because we don't insert the location immediately, there is a high chance that two of those queries both fail to find the location and each process creates a location to add. The location table forces the country to be unique, so the insert then fails over and over.
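To make the failure concrete, here is a minimal single-process sketch of the check-then-insert race. It uses an in-memory SQLite database as a stand-in for the real one; the `location` table layout, the `location_exists` helper, and the sample country are assumptions for illustration, not the project's actual schema or code.

```python
import sqlite3

# Assumed stand-in schema: one row per country, enforced by UNIQUE.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE location (id INTEGER PRIMARY KEY, country TEXT UNIQUE)"
)


def location_exists(country):
    """Hypothetical helper: True if the country is already stored."""
    row = conn.execute(
        "SELECT 1 FROM location WHERE country = ?", (country,)
    ).fetchone()
    return row is not None


# Both "workers" check before either one has inserted anything,
# so both conclude that the location is missing.
worker_a_found = location_exists("Israel")
worker_b_found = location_exists("Israel")

# Worker A inserts first and succeeds.
conn.execute("INSERT INTO location (country) VALUES (?)", ("Israel",))

# Worker B acts on its stale check and hits the unique constraint.
try:
    conn.execute("INSERT INTO location (country) VALUES (?)", ("Israel",))
    second_insert_failed = False
except sqlite3.IntegrityError:
    second_insert_failed = True
```

In the real scraper the two checks would come from separate processes, but the ordering (check, check, insert, insert) is the same one that triggers the repeated failures.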

I have ideas about how to solve this:

  1. Commit the location immediately (this means an extra commit) and, for the worst case, allow the database to hold duplicate country records.
  2. Use a mutex (or another Python library) to lock this part of the code. This will probably solve the problem but will make the code less efficient, because processes will sometimes wait instead of all working concurrently. On the other hand, there is no chance of duplicates, and it could help manage queries to the database from many machines in the future.
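The second idea could look roughly like the sketch below: the query, the insert, and the commit all happen while one lock is held, so no other process can run its own check in between. This is only an illustration under assumptions: `get_or_create_location` is a hypothetical function name, the `location` schema is made up, and SQLite stands in for the project's database. Note that it also commits right after the insert, as in the first idea, because releasing the lock before the commit would leave the same gap open.

```python
import multiprocessing
import sqlite3


def get_or_create_location(conn, lock, country):
    """Hypothetical helper: return the id for `country`, inserting if missing."""
    # Hold the lock across query + insert + commit so that only one
    # process at a time can decide that the location is missing.
    with lock:
        row = conn.execute(
            "SELECT id FROM location WHERE country = ?", (country,)
        ).fetchone()
        if row is not None:
            return row[0]
        cursor = conn.execute(
            "INSERT INTO location (country) VALUES (?)", (country,)
        )
        conn.commit()  # make the row visible before the lock is released
        return cursor.lastrowid


# Single-process demonstration with an assumed in-memory schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE location (id INTEGER PRIMARY KEY, country TEXT UNIQUE)"
)
# In the multi-processing scraper this one lock object would have to be
# created once and passed to every worker process.
lock = multiprocessing.Lock()

first_id = get_or_create_location(conn, lock, "Israel")
second_id = get_or_create_location(conn, lock, "Israel")
row_count = conn.execute("SELECT COUNT(*) FROM location").fetchone()[0]
```

A `multiprocessing.Lock` only coordinates processes on one machine, so the "many machines" case mentioned above would need a lock that lives outside the Python process.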

image

image

A presentation that describes the problem in general: https://slideplayer.com/slide/4962647/