Closed aruizga7 closed 9 years ago
With the Google News RSS API, you can only get the most recent news items (at most 100 per request). However, you can filter the retrieved items afterwards, for example by date:
library(tm.plugin.webmining)
library(tm)
# searchTerm is assumed to already hold your query string (it is not defined in this snippet)
corpusGoog <- WebCorpus(GoogleNewsSource(params = list(hl = "en", q = searchTerm, ie = "utf-8", num = 10, output = "rss")))
# Keep only news items published on or after Feb 15th, 2015:
filter <- sapply(corpusGoog, function(x) meta(x, "datetimestamp") >= as.POSIXct("2015-02-15"))
corpusGoogFilter <- corpusGoog[filter]
# Sort the (unfiltered) corpus by datetimestamp, oldest first
corpusorder <- order(sapply(corpusGoog, function(x) as.POSIXct(meta(x, "datetimestamp"))))
corpusGoogSort <- corpusGoog[corpusorder]
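To keep only the items from one specific day, rather than everything from a date onwards, you can compare calendar dates instead of full timestamps. Here is a minimal base R sketch of that idea. The `stamps` vector is made-up data standing in for the `datetimestamp` values of the corpus items, and `targetDate`, `onTarget` and `newestFirst` are just illustrative names:

```r
# Hypothetical timestamps standing in for meta(x, "datetimestamp") of each news item
stamps <- as.POSIXct(c("2015-02-14 23:10:00", "2015-02-15 08:30:00",
                       "2015-02-15 19:45:00", "2015-02-16 07:00:00"),
                     tz = "UTC")

# The single day we want to keep
targetDate <- as.Date("2015-02-15")

# Compare on the calendar date only, ignoring the time of day
onTarget <- as.Date(stamps, tz = "UTC") == targetDate

# Index vector that sorts the items newest first
newestFirst <- order(stamps, decreasing = TRUE)

print(stamps[onTarget])
print(stamps[newestFirst])
```

With the real corpus, the same logical vector could presumably be built with something like `sapply(corpusGoog, function(x) as.Date(meta(x, "datetimestamp")) == targetDate)` and then used as `corpusGoog[onTarget]`. I have not checked which time zone the feed timestamps come in, so items published close to midnight may end up on the neighbouring day.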
I am running the following code:
How can I get only news of a specific date or sort by date?