Closed zzolo closed 11 years ago
Looking into this more, it seems that scraperwiki.sqlite.save is actually wrapping around dumptruck.insert.
https://github.com/scraperwiki/dumptruck/blob/master/dumptruck/dumptruck.py#L214 https://github.com/scraperwiki/scraperwiki_local/blob/master/scraperwiki/sqlite.py#L24
Unfortunately this doesn't UPDATE an existing row when the unique keys are found; it only INSERTs. That's a bit misleading given that the method is named "save".
I am still confused about why this would happen locally and not on Cobalt or traditional ScraperWiki. I saw that Cobalt is using the same code that I am.
scraperwiki_local is currently a bit of a hack in that it mostly wraps dumptruck rather than acting exactly like scraperwiki on scraperwiki. The scraperwiki database is a bit complicated, so I didn't feel like figuring out all of its quirks when I first wrote it.
Anyway, the problem is indeed that DumpTruck's insert does a plain INSERT, while scraperwiki does an INSERT OR REPLACE, and scraperwiki_local should be adjusted to do the same in order to match scraperwiki exactly. (This isn't really obvious from the error message.)
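To illustrate the difference, here's a minimal sqlite3 sketch (not DumpTruck's or scraperwiki's actual code; the table and column names are made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE voters (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO voters (id, name) VALUES (1, 'Alice')")

# Plain INSERT (what DumpTruck's insert does): a second row with the
# same unique key raises an IntegrityError instead of updating.
try:
    conn.execute("INSERT INTO voters (id, name) VALUES (1, 'Alicia')")
except sqlite3.IntegrityError as e:
    print("plain INSERT failed:", e)

# INSERT OR REPLACE (what scraperwiki does): the existing row is
# replaced, so "save" behaves like an upsert.
conn.execute("INSERT OR REPLACE INTO voters (id, name) VALUES (1, 'Alicia')")
print(conn.execute("SELECT name FROM voters WHERE id = 1").fetchone()[0])
```

So with INSERT OR REPLACE the table ends up with a single row whose name is updated, which is the behavior the "save" name suggests.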
For a week or two, DumpTruck contained the INSERT OR REPLACE and other scraperwiki-compliance things, and then we moved some of those things to scraperwiki. That might explain the difference based on where you ran the script. I'll comment further when I figure out what was wrong or when we fix this bug.
I'm guessing this was fixed by f90bb8b. If not, please re-open this issue.
Hi. I was trying out the new, awesome Cobalt service and ran into the following issue while testing locally:
I simply ran
python scraper.py
and the failure occurred only locally; running on Cobalt or in the traditional ScraperWiki web interface, it worked fine. For reference, the scraper is: https://box.scraperwiki.com/zzolo/mn-registered-voters
I realize this may be an issue for Dumptruck, but I wanted to start here first.