sensiblecodeio / scraperwiki-python

ScraperWiki Python library for scraping and saving data
https://scraperwiki.com
BSD 2-Clause "Simplified" License

sql.save has weird functionality with existing data #92

Open Ar-Kaine opened 8 years ago

Ar-Kaine commented 8 years ago

The sql.save function is a useful, quick way of saving data. However, it behaves very oddly if you try to build up data incrementally from multiple sources.

The use case I'm thinking of is this: I have two data sets with different fields but a common key. One has already been saved to the SQLite database in ScraperWiki.

Using the common key as the unique field with the sql.save function does not behave as I would expect. If the key already exists in the table, I would expect the function to replace the data in any matching columns, create new columns (if they don't already exist) and insert the rest of the data into them, and leave the existing data unchanged where there is no matching column for it.

The actual behaviour is that it wipes all populated fields for the matching row, setting any column that was not included in the sql.save call to null.
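
For concreteness, here is a minimal sketch of the kind of sequence I mean (the table and field names are made up for illustration, and the "actual" output is the behaviour described above rather than something I've pasted from a run):

```python
import scraperwiki

# First data set: saved earlier, keyed on person_id.
scraperwiki.sql.save(
    unique_keys=["person_id"],
    data={"person_id": 1, "name": "Alice", "age": 30},
    table_name="people",
)

# Second data set: different fields, same key.
scraperwiki.sql.save(
    unique_keys=["person_id"],
    data={"person_id": 1, "city": "London"},
    table_name="people",
)

print(scraperwiki.sql.select("* from people where person_id = 1"))
# Actual (reported):  [{'person_id': 1, 'name': None, 'age': None, 'city': 'London'}]
# Expected:           [{'person_id': 1, 'name': 'Alice', 'age': 30, 'city': 'London'}]
```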

The expected behaviour would be better, because when working with multiple data sets it becomes frustrating to have to fall back to embedded SQL to manage updates.
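
At the moment, getting the expected behaviour seems to require something like the following merge-before-save step. This is only a rough sketch: it assumes the table already exists and that sql.select accepts ?-style parameter binding, and the names are illustrative.

```python
import scraperwiki

def save_merged(unique_key, row, table_name="swdata"):
    """Update matching columns and add new ones, leaving other columns alone."""
    # Fetch the existing row (if any) for this key.
    existing = scraperwiki.sql.select(
        "* from %s where %s = ?" % (table_name, unique_key),
        [row[unique_key]],
    )
    # Merge the new values over the old ones, then save the combined row.
    merged = dict(existing[0]) if existing else {}
    merged.update(row)
    scraperwiki.sql.save([unique_key], merged, table_name=table_name)

# Adds/updates the 'city' column without nulling 'name' and 'age'.
save_merged("person_id", {"person_id": 1, "city": "London"}, table_name="people")
```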