kaylinland closed this 3 years ago
I'm having trouble reproducing this: https://voyant-tools.org/spyral/rss-test/ Could it be that the CBC's RSS feed was down at the time?
No, I think the issue was with using loadCorpus without the summary method. It is working now--thanks!
I'm still having issues making this function work:
loadCorpus("http://www.cbc.ca/cmlink/rss-topstories", {
  inputFormat: 'xml', // force XML (not RSS)
  xmlContentXpath: "//item/description" // grab item description for content
});
Maybe there is an issue with the XML path, but it looks okay to me.
I just had a quick look, but it seems to work if you don't specify inputFormat (not currently sure why).
loadCorpus('https://www.cbc.ca/cmlink/rss-topstories', {
  xmlContentXpath: "//item/description"
}).summary()
gives me:
This corpus (86f8e166b6650ff7c869f484c76f4a17) has 20 documents with 1,870 total words and 711 unique word forms.
That's interesting! I got it to work by adding the .summary() call. You do need the inputFormat, because the idea is to turn the feed from a corpus with 20 separate documents into one single document.
loadCorpus('https://www.cbc.ca/cmlink/rss-topstories', {
  inputFormat: 'xml', // force XML
  xmlContentXpath: "//item/description" // define XPath
}).summary()
This does the job!
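For anyone following along, here is a rough sketch of what the two options are meant to achieve, as described above. It is not Voyant or Spyral code: the sample feed, the itemDescriptions helper, and the regular expressions standing in for the //item/description XPath are all made up for illustration, and it runs in plain Node without loadCorpus.

```javascript
// Hypothetical sample feed; this is NOT the real CBC content.
const sampleFeed = `<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <description>Channel-level description (not inside an item)</description>
    <item><title>First</title><description>First story text.</description></item>
    <item><title>Second</title><description>Second story text.</description></item>
  </channel>
</rss>`;

// Rough stand-in for the XPath //item/description (an assumption for this
// sketch; a real XPath engine would be more robust than regular expressions).
function itemDescriptions(xml) {
  const items = xml.match(/<item>[\s\S]*?<\/item>/g) || [];
  return items
    .map((item) => {
      const m = item.match(/<description>([\s\S]*?)<\/description>/);
      return m ? m[1].trim() : null;
    })
    .filter((text) => text !== null);
}

// Treated as RSS: one document per item.
const perItemDocuments = itemDescriptions(sampleFeed);

// Treated as a single XML file: all item descriptions form one document.
const singleDocument = perItemDocuments.join("\n");

console.log(perItemDocuments.length);
console.log(singleDocument);
```

The channel-level description is skipped because it is not inside an item, which is the same reason the XPath starts at //item rather than //description.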
OK, I misunderstood what you wanted to do (re: single document). By the way, you shouldn't have to call summary() for loadCorpus to work. Please have a look at the newly edited version of https://voyant-tools.org/spyral/rss-test/
Spyral does not recognize an RSS URL (https://www.cbc.ca/cmlink/rss-topstories) passed to the loadCorpus function. The error reads: An error occurred during multi-threaded document expansion.