kangasbros / django-bitcoin

bitcoin payment management for django
MIT License
179 stars 107 forks source link

receiving_address() raises DatabaseError with postgresql #23

Open pomali opened 10 years ago

pomali commented 10 years ago

I follow this tutorial: http://opensourcehacker.com/2013/10/16/accepting-and-spending-bitcoins-in-a-django-application/

Everything is ok if database is sqlite (or mysql) but when I use postgresql (psycopg2) and I try to do master_wallet.receiving_address(fresh=False) it returns

"DatabaseError: SELECT FOR UPDATE/SHARE cannot be applied to the nullable side of an outer join"

It is problem in new_bitcoin_address() (django_bitcoin/models.py) around line 118:

updated = BitcoinAddress.objects.select_for_update().filter(Q(id=bp.id) & Q(active=False) & Q(wallet__isnull=True) & \
Q(least_received__lte=0)).update(active=True)

Query it tries to run looks like this:

u'UPDATE "django_bitcoin_bitcoinaddress" SET "active" = %s WHERE "django_bitcoin_bitcoinaddress"."id" IN (SELECT U0."id" FROM "django_bitcoin_bitcoinaddress" U0 LEFT OUTER JOIN "django_bitcoin_wallet" U1 ON (U0."wallet_id" = U1."id") WHERE (U0."id" = %s  AND U0."active" = %s  AND U1."id" IS NULL AND U0."least_received" <= %s ) FOR UPDATE)'

But I'm not sure what to do with this, why query looks like this or what should it do. Maybe I am doing something wrong, maybe django ORM is translating it not the way it should, or the ORM query is not what you expected it to be.

If you know how to fix this, I would love to use django-bitcoin in my project.

readevalprint commented 10 years ago

Yeah this just hit me with postgres too. This email explains what it means: http://www.postgresql.org/message-id/21634.1160151923@sss.pgh.pa.us

a lock on a select result means that you've guaranteed that no one else can change the rows you selected. In an outer join it's impossible to guarantee that --- someone could insert a B row that matches a formerly unmatched A row. If you now re-did the SELECT you would get a different result, ie, your null-extended A row would be replaced by a normal row, even though you had lock on that A row. (This does not speak to the question of new rows showing up in the second SELECT --- that's always possible. The point is that a row you got the first time is now different despite being "locked".)

devinjames commented 10 years ago

The issue is that wallet__isnull=True is attempting to lock the null wallet rows and the other statements are trying to lock the addresses table, the thing is postgres can't guarantee that for the transaction the wallet result will stay constant, as far as postgres knows some other transaction could run in parallel and change an entry in the wallet table making it's old statement invalid

It's not a perfect solution but if you need a workaround for now you can add a distributed lock on the new_bitcoin_address to ensure that the function doesn't run in parallel any two times in the system:

@distributedlock('get_new_address') def new_bitcoin_address():

Update this line in the 'new_bitcoin_address' function: updated = BitcoinAddress.objects.select_for_update().filter(id=bp.id, active=False, least_received__lte=0, wallet__isnull=True).update(active=True) with updated = BitcoinAddress.objects.select_for_update().filter(id=bp.id).update(active=True)

The criteria are already set by bp.id, so long as nothing has utilized that address within the same timeframe of this occurrence of the function there will be no issues with duplicity in the database. That is the reason for the distributed lock, ensuring that the function cannot run in parallel within your app.

For the same reason you also have to modify the call in the Wallet.receiving_address method.

This updated = BitcoinAddress.objects.select_for_update().filter(id=addr.id, active=True, least_received__lte=0, wallet__isnull=True).update(active=True, wallet=self) becomes updated = BitcoinAddress.objects.select_for_update().filter(id=addr.id, active=True).update(wallet=self)

The addr variable at this point should direct towards an address id, there is no reason to run the rest of the criteria since it's been executed already, arguably active=True isn't necessary at this point, as it should have been set in the new_bitcoin_address call. There is also no reason for the active=True portion in the update() call at the end of the function since it was already set in the new_bitcoin_address

I hope that helps, there has to be a better way but unfortunately with the way it is built right now it'll take some design overhauls or a manual select statement that doesn't use an outer join, unfortunately I am not an SQL genius :(.