nrabinowitz / pjscrape

A web-scraping framework written in Javascript, using PhantomJS and jQuery
http://nrabinowitz.github.io/pjscrape/
MIT License

Reference error _.pjs #29

Closed amalhotra closed 11 years ago

amalhotra commented 11 years ago
pjs.addSuite({
    url: 'http://en.wikipedia.org/wiki/List_of_towns_in_Vermont',
    scraper: function() {
        return $('#sortable_table_id_0 tr').slice(1).map(function() {
            var name = $('td:nth-child(2)', this).text(),
                county = $('td:nth-child(3)', this).text(),
                // convert relative URLs to absolute
                link = _pjs.toFullUrl(
                    $('td:nth-child(2) a', this).attr('href')
                );
            return {
                model: "myapp.town",
                fields: {
                    name: name,
                    county: county,
                    link: link
                }
            }
        }).toArray(); // don't forget .toArray() if you're using .map()
    }
});

An example from the tutorial no longer works: I can't reference _pjs within the map function above. I get "ReferenceError: Can't find variable: _pjs" on the command line.

UPDATE: OK, this is a non-issue and can be closed. Hope this helps somebody.

I was running my scraper as: $ phantomjs pjscrape.js myscraper.js

Instead, keep all the contents of the downloaded zip or tarball together and point phantomjs at the pjscrape.js inside that directory:

$ phantomjs pjscrape/pjscrape.js myscraper.js

Here the pjscrape directory holds all the files from the zip or tarball.
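
For anyone hitting the same ReferenceError, a small guard can make the failure easier to spot. This is only a sketch: `hasPjsClient` and `resolveLink` are hypothetical names that are not part of pjscrape, and the only pjscrape call used is `_pjs.toFullUrl` from the example above. It assumes scraper functions run in the page context, so the helpers would probably need to be defined inside the scraper function itself rather than at the top of myscraper.js.

```javascript
// Hypothetical helpers, not part of pjscrape.

// True when the _pjs client object is available in the current scope.
// `typeof` on an undeclared identifier yields 'undefined' instead of
// throwing a ReferenceError, so this check is safe either way.
function hasPjsClient() {
    return typeof _pjs !== 'undefined' &&
        typeof _pjs.toFullUrl === 'function';
}

// Convert a relative URL to an absolute one when _pjs is present;
// otherwise return the href unchanged so the scrape still completes.
function resolveLink(href) {
    return hasPjsClient() ? _pjs.toFullUrl(href) : href;
}
```

In the scraper above, `link = resolveLink($('td:nth-child(2) a', this).attr('href'))` would then yield the raw relative href rather than aborting when pjscrape.js was launched without the rest of its files. Relative links in the output are a hint to check how phantomjs was invoked.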