Open beruic opened 5 years ago
This issue seems similar to this one from the original repository: https://github.com/michiya/django-pyodbc-azure/issues/143
At least the hot-fix solution is the same: split the requests into chunks of 2000 and swallow the bad performance :)
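The chunking hot-fix above can be sketched in a few lines. This is a minimal, hypothetical helper (not code from either repository): `query_exists` stands in for something like `lambda chunk: MyModel.objects.filter(my_field__in=chunk).exists()`, so each query stays under SQL Server's 2100-parameter limit.

```python
CHUNK_SIZE = 2000  # stay safely under SQL Server's 2100-parameter cap

def chunked(values, size=CHUNK_SIZE):
    """Yield successive slices of at most `size` items."""
    values = list(values)
    for i in range(0, len(values), size):
        yield values[i:i + size]

def any_exists(values, query_exists):
    """True if any chunk reports a match; short-circuits on the first hit."""
    return any(query_exists(chunk) for chunk in chunked(values))
```

Each chunk costs one round trip to the database, which is exactly the "bad performance" trade-off mentioned above.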
Here is a workaround:
```python
new_values = ','.join(new_values)
MyModel.objects.extra(where=[f"my_field IN (SELECT * FROM SPLIT_STRING('{new_values}', ','))"]).exists()
```
Basically, it sends a single comma-separated string and re-splits it server-side for use in the query.
You don't even need SPLIT_STRING:

```python
new_values = str(new_values).replace('[', '(').replace(']', ')')
MyModel.objects.extra(where=[f"my_field in {new_values}"])
```
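The bracket-replace trick works because Python's list repr happens to look like a SQL tuple, but it is fragile for values containing quotes. A slightly more careful variant (a hypothetical helper, not from this thread) inlines the values as quoted literals, doubling single quotes per SQL rules; the driver then sends zero bind parameters, sidestepping the 2100-parameter cap. It is still only advisable for trusted or validated input.

```python
def sql_in_literal(values):
    """Render values as a parenthesized list of quoted SQL string literals.

    Single quotes are escaped by doubling ('' ), per SQL syntax.
    """
    quoted = ", ".join("'" + str(v).replace("'", "''") + "'" for v in values)
    return "(" + quoted + ")"
```

Usage would look like `MyModel.objects.extra(where=[f"my_field in {sql_in_literal(new_values)}"])`.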
The issue I have with my previous solutions is that they only work on a case-by-case basis, and they will not work if we use prefetch_related with more than 2100 related objects.
To fix this, create a lookups.py next to your models.py with:
```python
from django.db.models.fields import Field
from django.db.models.lookups import In


@Field.register_lookup
class In(In):
    lookup_name = 'in'

    def as_sql(self, compiler, connection):
        max_in_list_size = 2100
        if self.rhs_is_direct_value() and max_in_list_size and len(self.rhs) > max_in_list_size:
            return self.split_parameter_list_as_sql(compiler, connection)
        return super().as_sql(compiler, connection)

    def split_parameter_list_as_sql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.batch_process_rhs(compiler, connection)
        in_clause_elements = f"({lhs} IN (SELECT * FROM SPLIT_STRING(%s)))"
        params = [','.join(map(str, rhs_params))]
        return in_clause_elements, params
```
Make sure to import it in your models.py file:

```python
from .lookups import In
```

and voilà, it works!
SPLIT_STRING was added in SQL Server 2016; for earlier versions, this function works for me: https://stackoverflow.com/questions/10914576/t-sql-split-string
@etiennepouliot I'm trying your solution, but despite importing it into models.py, those functions in the In lookup don't seem to be run when I call prefetch_related. Is this not meant to work for that?
I'm using graphene django with gql_optimizer, which results in one giant query with 25000+ parameters
A solution has been implemented on the dev branch of the Microsoft fork. See my issue on that repo for details.
It follows Microsoft's recommended approach to large parameter lists: load the values into a TEMP TABLE and join against it.
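The temp-table rewrite can be sketched as SQL generation. This is a hedged illustration only: identifiers like `#in_params` and `my_table` are made up here, and the actual dev-branch implementation may differ. The values are loaded in batches (SQL Server allows at most 1000 row constructors per `INSERT ... VALUES`), and the giant `IN (...)` list becomes a subquery, so no single statement needs anywhere near 2100 parameters.

```python
BATCH = 1000  # SQL Server's limit on rows per INSERT ... VALUES statement

def temp_table_statements(n_values):
    """Return (create, inserts, select) T-SQL strings with %s placeholders."""
    create = "CREATE TABLE #in_params (val NVARCHAR(450) PRIMARY KEY)"
    inserts = []
    remaining = n_values
    while remaining > 0:
        rows = min(remaining, BATCH)
        placeholders = ", ".join("(%s)" for _ in range(rows))
        inserts.append(f"INSERT INTO #in_params (val) VALUES {placeholders}")
        remaining -= rows
    select = ("SELECT * FROM my_table "
              "WHERE my_field IN (SELECT val FROM #in_params)")
    return create, inserts, select
```

The statements would then be executed in order on one connection (temp tables are per-session), passing the actual values as the parameters for each batched INSERT.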
> @etiennepouliot I'm trying your solution, but despite importing it into models.py, those functions in the In lookup don't seem to be run when I call prefetch_related. Is this not meant to work for that?
> I'm using graphene django with gql_optimizer, which results in one giant query with 25000+ parameters
Indeed, it's not working for prefetch_related this way. I ended up editing django/db/models/lookups.py directly in my project, as I didn't know how to override this behavior globally.
If anybody knows how, I would appreciate it. Maybe I need to dig into this deeper.
I have a product where around 300k instances are created in an import.
Before importing (which I fear may fail as well), I perform an existence check against a single field in the form
```python
MyModel.objects.filter(my_field__in=new_values).exists()
```

where `new_values` is a set of strings. Here is the stack trace of the issue: