ruipgil / scraperjs

A complete and versatile web scraper.
MIT License
3.71k stars, 188 forks

Scraper function will not work unless it matches regex pattern #62

Open Alanz2223 opened 8 years ago

Alanz2223 commented 8 years ago

I spent a few hours tracking this one down. I hit a 'cannot read property of undefined' error when I passed a scrape function to the .scrape promise. I had followed everything step by step and typed the code out myself so that I could memorize the flow, and it didn't work. Then I pasted the exact same procedure from the examples folder and, sure enough, it ran without a problem. It turns out that the function you supply is converted to a string, which is then manipulated to extract whatever is inside the function body. This is done with a regex that matches the enclosing 'function(){..}' part, but the regex fails unless there is a space between the closing parenthesis and the opening brace.

so this

.scrape( function($) { .. .... } )

will work just fine, but this

.scrape( function($){ .. .... } )

throws the null error.
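
A minimal sketch of why the missing space matters, assuming the library matches the stringified function against a pattern that expects a literal space before the brace. The pattern named `strict` below is a simplified stand-in written for illustration, not the exact expression from scraperjs, and the variable names are hypothetical.

```javascript
// Hypothetical, simplified pattern for illustration only (not copied from scraperjs).
// It requires exactly one space between ")" and "{".
var strict = /^function\s*([a-zA-Z$][a-zA-Z$0-9]*)?\((.*?)\) \{/;

var withSpace = function($) { return 1; };
var withoutSpace = function($){ return 1; };

// Function.prototype.toString returns the function's source text,
// so the whitespace typed by the author is preserved in the string.
var matchWithSpace = strict.exec(withSpace.toString());
var matchWithoutSpace = strict.exec(withoutSpace.toString());

// exec returns null when the pattern does not match.
// Reading a capture group from that null result raises a TypeError.
var threw = false;
try {
  var params = matchWithoutSpace[2];
} catch (e) {
  threw = e instanceof TypeError;
}
```

If the library reads a capture group from the match without checking for null first, that would be consistent with the error reported here.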

A very subtle difference, but it can give you headaches. Anyway, I hope someone who is better with regular expressions can fix this.

The current expression is

var rg = /^function\s+([a-zA-Z$][a-zA-Z$0-9]*)?\((.*?)\) {/g;

within the PhantomWrapper.js file
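
One possible direction for a fix, sketched under assumptions: allow optional whitespace everywhere the pattern quoted above expects a fixed layout, and fail with a clear message when nothing matches. The names `tolerant` and `splitFunction` are hypothetical and are not part of scraperjs. This sketch only handles classic `function` expressions with simple parameter lists (no arrow functions, no default values containing parentheses).

```javascript
// Sketch of a whitespace-tolerant pattern (an assumption, not the actual scraperjs fix).
// \s* permits zero or more whitespace characters before "(" and before "{".
var tolerant = /^function\s*([a-zA-Z$][a-zA-Z$0-9]*)?\s*\(([^)]*)\)\s*\{/;

// Hypothetical helper: split a function's source into its parameter list and body.
function splitFunction(fn) {
  var src = fn.toString();
  var match = tolerant.exec(src);
  if (match === null) {
    // Fail loudly instead of letting a later property access on null throw.
    throw new Error('Unsupported function syntax: ' + src.slice(0, 40));
  }
  return {
    params: match[2].trim(),
    // The body is everything between the opening brace and the last closing brace.
    body: src.slice(match[0].length, src.lastIndexOf('}')).trim()
  };
}

var a = splitFunction(function($) { return 1; });
var b = splitFunction(function($){ return 1; });
var c = splitFunction(function named ( $ )   { return 1; });
```

With this pattern, all three spellings yield the same parameter list and body, so formatting no longer decides whether the scrape function works.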