ContentMine / quickscrape

A scraping command line tool for the modern web
MIT License

How to address non-attribute content in quickscrape #57

Open petermr opened 9 years ago

petermr commented 9 years ago

If the landing page contains:

<title>Foo bar</title>

how do I extract the element value using:

"title": {
  "selector": "//title"
},

(i.e. I want an analogue to attribute)

blahah commented 9 years ago

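The attribute key takes two special values: text (the default) and html. Because text is the default, a selector with no attribute key, like yours, should already return the element's text content. See the scraperJSON README: https://github.com/ContentMine/scraperJSON/blob/master/README.md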
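
For illustration, here is a minimal sketch of a scraper definition that requests the text content explicitly, alongside a second element that asks for html instead. The url pattern and the title_html element name are hypothetical placeholders, not taken from this thread. Check the README linked above for the exact semantics of each value.

```json
{
  "url": "example\\.com",
  "elements": {
    "title": {
      "selector": "//title",
      "attribute": "text"
    },
    "title_html": {
      "selector": "//title",
      "attribute": "html"
    }
  }
}
```

With the landing page from the question, the title element would be expected to yield Foo bar. Leaving out "attribute": "text" should behave the same way, since text is the default.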