montera34 / pageonex

PageOneX. Analyzing front pages
http://pageonex.com
GNU Affero General Public License v3.0

When the list of media is updated: rake scraping:kiosko_names_csv should update not create #106

Closed numeroteca closed 11 years ago

numeroteca commented 11 years ago

The kiosko_names_csv task in /lib/tasks/scraping.rake inserts rows into the media table every time rake scraping:kiosko_names_csv is run. It should update the existing rows instead of creating new ones. The latest changes added the url to the media list (92850a1).

Related to #105
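The requested "update, not create" behaviour can be sketched as follows. This is a minimal illustration, not the project's code: the in-memory hash stands in for the media table, and the column names (name, url), the helper upsert_media and the example URLs are all assumptions. In the Rails app the same idea would normally go through ActiveRecord, looking a record up by its key and updating it, or creating it only when the lookup finds nothing.

```ruby
require "csv"

# Hypothetical stand-in for the media table: a Hash keyed by media name.
# The column names (name, url) are assumptions, not the PageOneX schema.
def upsert_media(table, csv_text)
  CSV.parse(csv_text, headers: true).each do |row|
    record = (table[row["name"]] ||= {}) # reuse the existing row if there is one
    record["url"] = row["url"]           # update its attributes in place
  end
  table
end

csv = <<~CSV
  name,url
  El Pais,http://example.com/elpais
  Le Monde,http://example.com/lemonde
CSV

media = {}
upsert_media(media, csv)
upsert_media(media, csv) # a second run must not add rows
puts media.size
```

Because rows are looked up by name before being written, running the import twice leaves the same number of rows as running it once.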

rahulbot commented 11 years ago

You can now run rake scraping:update_media['public/kiosko_scraped.csv'] to update the database from the CSV.

To refresh the CSV itself, run rake scraping:scrape_media, which scrapes Kiosko (code in lib/scrapers.rb).

numeroteca commented 11 years ago

On a fresh install I ran rake scraping:update_media['public/kiosko_scraped.csv'] and it didn't work. It freezes and doesn't return anything.

rahulbot commented 11 years ago

Did you do a db:migrate?

Rahul


On Mar 28, 2013, at 6:55 PM, numeroteca notifications@github.com wrote:

I tried in a fresh install running rake scraping:update_media['public/kiosko_scraped.csv'], and it didn't work. It freezes and it doesn't return anything.


numeroteca commented 11 years ago

Yes, I ran db:migrate successfully.

numeroteca commented 11 years ago

Oops, sorry, I checked again: rake scraping:update_media['public/kiosko_scraped.csv'] does work and loads the media into the database table. The process is just slow and doesn't print any message while it runs.

We should add something like puts "update finished".
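One way the suggested message could look, as a rough sketch only: the helper load_with_progress and its out: and every: parameters are hypothetical and do not exist in the repository. It prints a line every few rows so a slow run doesn't look frozen, then a final summary.

```ruby
# Hypothetical helper: runs the given block for each row, prints periodic
# progress, and ends with an "update finished" line.
def load_with_progress(rows, out: $stdout, every: 100)
  count = 0
  rows.each do |row|
    yield row # this is where each row would be saved
    count += 1
    out.puts "processed #{count} rows..." if (count % every).zero?
  end
  out.puts "update finished: #{count} rows processed"
  count
end

rows = (1..250).map { |i| { "name" => "paper #{i}" } }
load_with_progress(rows) { |row| row }
```

With 250 rows and the default interval, this prints two progress lines (at 100 and 200) followed by the summary line.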

numeroteca commented 11 years ago

Every time rake scraping:update_media['public/kiosko_scraped.csv'] is run, it adds new rows to the media table in the database instead of updating the existing ones.
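A quick way to confirm this kind of duplication is to group the rows by the field that should be unique and list the values that occur more than once. The sketch below works on plain hashes; the name key and the helper duplicated_names are assumptions for illustration, not part of the PageOneX code.

```ruby
# Hypothetical check: returns the names that appear in more than one row.
def duplicated_names(rows)
  rows.group_by { |row| row["name"] }
      .select { |_name, group| group.size > 1 }
      .keys
end

# Rows as they might look after the task has been run twice.
rows = [
  { "name" => "El Pais" },
  { "name" => "Le Monde" },
  { "name" => "El Pais" },
]
p duplicated_names(rows)
```

An empty result after repeated runs would indicate that the task updates rows rather than creating them.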