mikeizbicki / cmc-csci143

big data course materials

lab-transactions question #474

Closed danzhechen closed 7 months ago

danzhechen commented 8 months ago

Hi there,

I was working through the lab-transactions lab and have two questions.

First, after I bring the Docker containers down and back up, I run the command

python3 scripts/create_accounts.py postgresql://postgres:pass@localhost:<PORT>

but only 100 accounts are created, not 1000.

Second, and I am not sure whether this is related to the first question: I followed the FOR UPDATE structure, but my code runs even slower than in the earlier tasks.

I am not sure whether I am allowed to post my answer directly here. If it is appropriate, I will post it. I simply followed these steps:

  1. Comment out the LOCK statement that you added in the previous task.

  2. Modify the SELECT statements to use the FOR UPDATE clause.

  3. Wrap the function in a try/except block, and repeat the failed transfer_funds function call in the except statement.

mikeizbicki commented 8 months ago

Short answer: running git pull should fix both problems.

Long answer:

Several other students ran into the same runtime problem. The cause is that only 100 accounts are created, while the chaosmonkey_parallel.sh script runs 100 transactions in parallel. That means a randomly generated transaction will almost certainly conflict with another transaction that is already running.

I fixed this midway through the lab by changing the default number of accounts created in the create_accounts.py script to 1000 instead of 100. Now, because there are many more accounts, a randomly generated transaction is much less likely to block. The insertions therefore actually run in parallel and you get the observed speedup. (I originally used these numbers when I was designing the lab, but I somehow accidentally changed the number of accounts before the lab started, which messed things up.)
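To get a rough feel for the difference, here is a back-of-the-envelope sketch in Python. It is not part of the lab code, and it rests on simplifying assumptions: each transfer picks 2 distinct accounts uniformly at random, all 100 transfers are in flight at the same moment, and transfers are independent of each other. The function name `conflict_probability` is made up for this illustration.

```python
from math import comb


def conflict_probability(num_accounts: int, num_parallel: int) -> float:
    """Estimate the chance that one transfer shares an account with at
    least one of the other transfers running at the same time.

    Assumptions (not taken from the lab scripts): each transfer touches
    2 distinct accounts chosen uniformly at random, and all transfers
    are independent and simultaneously in flight.
    """
    # Probability that two independent transfers touch disjoint accounts:
    # the second transfer must pick both of its accounts from the
    # num_accounts - 2 accounts that the first transfer did not pick.
    disjoint = comb(num_accounts - 2, 2) / comb(num_accounts, 2)
    # The transfer must be disjoint from each of the other transfers.
    no_conflict = disjoint ** (num_parallel - 1)
    return 1.0 - no_conflict


if __name__ == "__main__":
    for n in (100, 1000):
        p = conflict_probability(n, 100)
        print(f"{n} accounts, 100 parallel transfers: conflict chance ~ {p:.2f}")
```

Under these assumptions, a transfer with 100 accounts almost always overlaps with another in-flight transfer, while with 1000 accounts it usually does not. The real numbers depend on how long each transaction holds its locks, so treat this only as an order-of-magnitude estimate.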

Running git pull will update you to the latest version of the create_accounts.py script, which uses 1000 accounts. This will likely result in a merge conflict. In that case, you can either resolve the conflict manually or just edit the create_accounts.py script yourself to use the number 1000 instead of 100.