Open candeira opened 8 years ago
Hi,
I'm not sure it is a good idea to include bulk_create in this project. Django already has a built-in bulk_create method. Why not split the objects into those to create and those to update, then call bulk_create and bulk_update explicitly?
@candeira I'm way into that! I could use this on my project, for sure.
@aykut bulk_update_or_create is different from bulk_create?
@candeira @aykut @ckcollab this would be amazingly helpful
@aykut the problem with doing bulk_create is that you need to know in advance which ones already exist, so it requires an additional query, I think.
I need this feature. Any news?
I think it's possible to add this feature. I would call it bulk_update_or_create because Django already has an update_or_create for single instances.
But even if we implement this function here, we will still need to know which instances already exist (performing an additional query). bulk_update_or_create will actually split the list of instances and call bulk_create and bulk_update separately. So each batch will perform 3 queries.
Does that seem reasonable to you? Any better approach?
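To make the three-query idea concrete, here is a minimal sketch of the split. It is not this package's API: the helper name bulk_update_or_create_sketch, the item table, and its columns are all hypothetical, and it uses the standard-library sqlite3 module instead of the Django ORM so that it runs on its own.

```python
import sqlite3


def bulk_update_or_create_sketch(conn, rows):
    """Split (id, name) rows into inserts and updates. Hypothetical helper."""
    if not rows:
        return 0, 0
    ids = [row[0] for row in rows]
    placeholders = ",".join("?" * len(ids))
    # Query 1: find which primary keys already exist.
    existing = {
        found[0]
        for found in conn.execute(
            f"SELECT id FROM item WHERE id IN ({placeholders})", ids
        )
    }
    to_create = [row for row in rows if row[0] not in existing]
    to_update = [row for row in rows if row[0] in existing]
    # Step 2: insert the new rows (stands in for bulk_create).
    conn.executemany("INSERT INTO item (id, name) VALUES (?, ?)", to_create)
    # Step 3: update the existing rows (stands in for bulk_update).
    # Note: executemany runs the statement once per row; a real bulk
    # update would try to do this in a single statement.
    conn.executemany(
        "UPDATE item SET name = ? WHERE id = ?",
        [(name, pk) for pk, name in to_update],
    )
    return len(to_create), len(to_update)


split_conn = sqlite3.connect(":memory:")
split_conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
split_conn.execute("INSERT INTO item (id, name) VALUES (1, 'old')")
created, updated = bulk_update_or_create_sketch(
    split_conn, [(1, "new"), (2, "fresh")]
)
split_rows = split_conn.execute("SELECT id, name FROM item ORDER BY id").fetchall()
```

Nothing here stops another writer from inserting or deleting rows between the SELECT and the two write steps, so the sketch is only safe when a single process writes to the table.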
It seems reasonable. Although both Postgres and MySQL now support bulk upsert: https://stackoverflow.com/questions/34514457/bulk-insert-update-if-on-conflict-bulk-upsert-on-postgres https://stackoverflow.com/questions/6286452/mysql-bulk-insert-or-update
I agree with @arnau126. Any update on this feature?
The 3-query approach has a race condition: unless you can be sure your program is the only one writing to that table, you'll have to add retry logic around the transaction (as records can get added and removed between your read and create steps).
SQL level UPSERT is the way to go for atomic single query update/create.
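As a rough illustration of a single-statement upsert, the sketch below uses the standard-library sqlite3 module as a stand-in, because SQLite 3.24+ accepts the same INSERT ... ON CONFLICT ... DO UPDATE form that Postgres 9.5+ does. The bulk_upsert_sketch helper and the item table are hypothetical, not part of this package or of Django.

```python
import sqlite3


def bulk_upsert_sketch(conn, rows):
    """Insert or update all (id, name) rows in one statement. Hypothetical helper."""
    if not rows:
        return
    # One "(?, ?)" group per row, so the whole batch is a single INSERT.
    values = ", ".join(["(?, ?)"] * len(rows))
    params = [value for row in rows for value in row]
    conn.execute(
        f"INSERT INTO item (id, name) VALUES {values} "
        # "excluded" refers to the row that failed to insert.
        "ON CONFLICT (id) DO UPDATE SET name = excluded.name",
        params,
    )


upsert_conn = sqlite3.connect(":memory:")
upsert_conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
upsert_conn.execute("INSERT INTO item (id, name) VALUES (1, 'old')")
bulk_upsert_sketch(upsert_conn, [(1, "new"), (2, "fresh")])
upsert_rows = upsert_conn.execute("SELECT id, name FROM item ORDER BY id").fetchall()
```

Because the database resolves the conflict itself, there is no separate read step for another writer to race against. Very large batches would still need chunking to stay under the driver's bound-parameter limit.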
For my current job we need bulk upsert of records, and I'm thinking of forking your package and implementing bulk_upsert myself. If/when I do that, I'd like to do it in the manner that's most likely to be accepted into your project, so as not to maintain an independent fork.
Which syntax do you prefer?
For now I'd only make my changes compatible with Postgres 9.5+, because that's what we're using and because I'm relatively new at this niche.
Any other advice/comment?