Open verschmelzen opened 3 days ago
Here is showcase
from threading import local, Thread
# isolated
print('Isolated, so we will see errors')
l = local()
l.a = 1 # is not available to threads
def print_l():
try:
print('try access to non-existent attribute', l.a)
except:
print('isolation works')
import traceback
traceback.print_exc()
l.a = 2 # is not available to "next" thread
t = Thread(target=print_l)
t.start()
t.join()
t2 = Thread(target=print_l)
t2.start()
t2.join()
print('\n')
print('\n')
# shared
print('Value shared between threads')
l = local()
class B: pass
A = l.a = B()
A.a = 1
def print_A():
print('state is shared', A.a)
A.a = 2 # new value is available to "next" thread
t = Thread(target=print_A)
t.start()
t.join()
t2 = Thread(target=print_A)
t2.start()
t2.join()
@verschmelzen Thx for finding this one - yes you are right, it currently does not decouple data across threads correctly. The issue is line 31-34, were the dict objects get bound back into to module level sharing all stuff across all threads.
A quickfix would be to reshape the data parts and put them behind a guarded access, something like this:
STORAGE = local()
def get_DATA():
try:
return STORAGE.DATA
except AttributeError:
STORAGE.DATA = {'DELETES': {}, 'M2M_REMOVE': {}, 'M2M_CLEAR': {}, 'UPDATE_OLD': {}}
return STORAGE.DATA
# access in data handlers:
# instead of
DELETES...
# it is now:
get_DATA()['DELETES']
(No clue, if the dict abstraction is good enough or if real attributes on STORAGE would be better here ...)
Do you want to write a fix for it?
Just benchmarked it - fully decoupling is actually the fastest here:
STORAGE = local()
def get_DELETES():
try:
return STORAGE.DELETES
except AttributeError:
STORAGE.DELETES = {}
return STORAGE.DELETES
def get_M2M_REMOVE():
try:
return STORAGE.M2M_REMOVE
except AttributeError:
STORAGE.M2M_REMOVE = {}
return STORAGE.M2M_REMOVE
...
Which seems plausible, as it only cares for the datapoint actually needed.
Edit: Fixed SHARED vs. STORAGE naming. :sweat_smile:
There might be a serious bug here. Afaiu
local()
is that resolves to the correct context when it's attribute is accessed from inside the running thread. Here you access the members possibly before other threads fork, thus sharing all the data between them.https://github.com/netzkolchose/django-computedfields/blob/master/computedfields/handlers.py#L31