matthewmueller / node-nom

Dead simple site scraper for Node.js

nom cli fails to fetch meta #2

Closed (hemanth closed this issue 10 years ago)

hemanth commented 11 years ago

nom url 'meta[http-equiv="refresh"]' does not fetch anything, even though the URL has several meta tags, but nom url 'head' works fine.

buschtoens commented 10 years ago

Does nom url meta work? I could imagine that the dash in http-equiv causes the bug.

buschtoens commented 10 years ago

nom url meta does not work either. But this is probably a cheerio issue.

hemanth commented 10 years ago

@silvinci This does not seem like a cheerio issue; it looks more like a CLI issue. @MatthewMueller, help!

buschtoens commented 10 years ago

Can't see a problem in the cli source. Have you tried $.html("meta");?

hemanth commented 10 years ago

@silvinci That works in the script, but how do you do the same on the CLI?

nom url head works, but neither nom url head > meta nor nom url meta works...

buschtoens commented 10 years ago

This is interesting since the selector gets directly passed.

So basically, if it doesn't work in the CLI, it shouldn't work in the script either. I just can't test it right now.

hemanth commented 10 years ago

@silvinci You are right. There seems to be no return values from html() or text() for meta.

Something as simple as:

if (selector === "meta") {
  console.log($(selector));
}

This does the trick for me, but it is just a silly hack; I need to paw at cheerio and check why html() fails for meta.
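
A likely explanation, offered as a guess rather than a confirmed diagnosis: meta is a void element with no children, so the instance method html(), which returns the inner HTML of the first match, has nothing to return, and text() is empty for the same reason. The static $.html(selector) mentioned earlier in this thread returns the outer HTML instead. A minimal sketch of a more general fallback than the selector === "meta" check might look like the following. The helper name renderSelection is hypothetical and not part of nom, and $ is assumed to be an already loaded cheerio instance.

```javascript
// Sketch only: `renderSelection` is a hypothetical helper, not part of nom.
// `$` is assumed to be a loaded cheerio instance.
function renderSelection($, selector) {
  const matches = $(selector);

  // Nothing matched at all: return an empty string.
  if (matches.length === 0) {
    return '';
  }

  // .html() gives the inner HTML of the first match. For elements without
  // children (such as <meta>) this is empty, so it is treated as "no content".
  const inner = matches.html();
  if (inner) {
    return inner;
  }

  // Fall back to the outer HTML, which includes the tag and its attributes.
  return $.html(selector);
}
```

With this approach, any element whose inner HTML is empty (not only meta) is printed as its full tag, so attributes such as http-equiv and content stay visible. Whether nom's CLI should behave this way is a design choice for the maintainer.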

buschtoens commented 10 years ago

So, it's a cheerio issue. Cross-link it here and close? :)

hemanth commented 10 years ago

@silvinci Cya there ;)