sensiblecodeio / scraperwiki-python

ScraperWiki Python library for scraping and saving data
https://scraperwiki.com
BSD 2-Clause "Simplified" License

Proposal: scraperwiki.sql.variable for getting/setting variables. #35

Open fawkesley opened 11 years ago

fawkesley commented 11 years ago

Basically, like scraperwiki.sqlite.save_var, but cleaned up.

* Maybe not in JavaScript, if we can't work out how to do that.

fawkesley commented 10 years ago

As an example, the magic table scraper stores its settings in allSettings.json. Other tools / scrapers store their settings in a database table called swvariables (thanks to scraperwiki.sqlite.save_var).

This is an attempt to make a single (decent) way of doing this so we don't keep re-inventing this stuff.

zarino commented 10 years ago

Three interesting decisions I like the sound of:

  1. It's just a dictionary: you access variables by key rather than by passing the name as an argument to a function.
  2. It's the same in Python and JavaScript.
  3. It basically has nothing to do with SQLite.

Although that 3rd one does make me wonder why it's under the "scraperwiki.sql" namespace – the fact that it's based on SQL is an implementation detail. Perhaps it should be "scraperwiki.var"?

zarino commented 10 years ago

I've added an issue in the Custard repo for the JavaScript half of this: https://github.com/scraperwiki/custard/issues/400

drj11 commented 10 years ago

It's in the sql namespace because it stores its data in the sqlite store. I don't think that's an implementation detail; I think it's an advertised feature of the interface.

On the issue of it living in .sql or not, I'm persuadable.

It should be called .variable, not .var.

drj11 commented 10 years ago

@pwaller argues, persuasively IMO, that in Python it should mostly look like property access rather than dictionary access:

scraperwiki.sql.variable.my_favourite_status = "begruddled"

This is so that people are discouraged, by the syntax, from doing programmatic things to the variables, like iterating over all of them. If people want to do that, they should probably be storing things in a table.

I approve.

Obviously, in JavaScript land, it's all the same.
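
To make the property-access idea concrete, here is a minimal sketch of how it could work. This is not the scraperwiki API: the `Variables` class, the two-column `swvariables` schema, and the JSON encoding of values are all assumptions made for illustration, and it uses the standard-library `sqlite3` module directly.

```python
import json
import sqlite3


class Variables:
    """Hypothetical property-style variable store (not the scraperwiki API)."""

    def __init__(self, connection):
        # Bypass our own __setattr__ while storing the connection.
        object.__setattr__(self, "_db", connection)
        # Assumed schema: one row per variable, value stored as JSON text.
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS swvariables "
            "(name TEXT PRIMARY KEY, value TEXT)"
        )

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        row = self._db.execute(
            "SELECT value FROM swvariables WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise AttributeError(name)
        return json.loads(row[0])

    def __setattr__(self, name, value):
        # Each assignment touches only the row for this one variable.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO swvariables (name, value) "
                "VALUES (?, ?)",
                (name, json.dumps(value)),
            )


variable = Variables(sqlite3.connect(":memory:"))
variable.my_favourite_status = "begruddled"
```

Because the class defines neither `__iter__` nor `__getitem__`, `for v in variable` raises a `TypeError`, which is the kind of syntactic discouragement described above.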

frabcus commented 10 years ago

For examples of bad code that results from the lack of this facility, see the Twitter search tool: http://github.com/scraperwiki/twitter-search-tool/

One thing in particular: saving an individual variable without danger of damaging the others is very important, i.e. internally it should do an "update ... where" for just that variable.
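
A rough sketch of what a per-variable save could look like, using the standard-library `sqlite3` module. The `save_variable` helper, the simplified `swvariables` schema, and the variable names `max_id` and `query` are hypothetical and chosen only for illustration.

```python
import sqlite3


def save_variable(db, name, value):
    """Hypothetical helper: write one variable, leave the others untouched."""
    with db:  # commit on success, roll back on error
        cursor = db.execute(
            "UPDATE swvariables SET value = ? WHERE name = ?", (value, name)
        )
        if cursor.rowcount == 0:
            # No existing row for this variable, so create it.
            db.execute(
                "INSERT INTO swvariables (name, value) VALUES (?, ?)",
                (name, value),
            )


db = sqlite3.connect(":memory:")
# Assumed schema: one row per variable.
db.execute("CREATE TABLE swvariables (name TEXT PRIMARY KEY, value TEXT)")
save_variable(db, "max_id", "100")
save_variable(db, "query", "scraperwiki")
save_variable(db, "max_id", "250")  # only the max_id row changes
rows = dict(db.execute("SELECT name, value FROM swvariables"))
```

Storing one row per variable means two writers that save different variables never overwrite each other's values, unlike rewriting a single settings blob.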

There's an interesting question about when the variable gets fetched again, i.e. when it does a round trip from JavaScript. I'm not sure what the answer is, though; it needs a bit of discussion.

I've made this a Trello card: https://trello.com/c/kWLodpHG/66-variable-saving-loading-for-tools-simpler