pudo / dataset

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
https://dataset.readthedocs.org/
MIT License
4.76k stars 297 forks source link

Add default replacement keys parameter for insert and upsert methods #375

Open dshepelev15 opened 3 years ago

dshepelev15 commented 3 years ago

Hi! I've implemented the ability to use default values of the keys parameter for insert* & upsert* methods. This task is related with suggestion Can you please check the code?

pudo commented 3 years ago

This is both an interesting idea and a really nicely done PR. One thing that I'm concerned about is making this less of a usability risk. The current implementation would probably just overwrite the entire table if no unique constraint is defined. That should be an exception, probably. But I kind of wonder if an even cleaner solution would be to make the unique behaviour opt-in, perhaps via some magic value? Or we could expose the unique keys as a property on table?

if len(table.unique_columns):
    table.upsert(data, table.unique_columns)

This is just a bit more explicit and might avoid users overwriting databases when they misunderstand the API

(Note: we probably want to warn on len 0 keys in any case?)

dshepelev15 commented 3 years ago

Thank you for your comments. Yes, I agree with you - "Explicit is better than implicit" :) I'll add the check on zero-length of keys and the property of the table for unique columns. But I don't understand what do you mean by some magic value? How is it supposed to work

pudo commented 3 years ago

Thanks for making these adaptations! I just meant that always passing the table.unique_columns into insert and update as keys might be a much more explicit behaviour than defaulting to it quietly. It's a little more code text, but very very readable. My main concern with this is that defaulting to unique columns is a very subtle behaviour change in the most used function of this library - something that would require basically announcing a breaking change :/

dshepelev15 commented 3 years ago

I've got your concern. Let's add the ability to pass a parameter (for instance use_unique_columns=False) into the constructor of Table class for more clarity.

I am also thinking about usage of primary keys here instead of unique keys. Whad do you think about that? I saw the ticket about multiple primary keys for Table, then It will be more useful, won't be?