Open kaybenleroll opened 5 years ago
Thank you very much for this report! 🙌 I want to acknowledge it and let you know we are aware and looking for a replacement data source to use in the book.
Just to record it here, ideally we would want to find something that:
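- allows us to demonstrate how to `tidy()` a document-term matrix
- is an appropriate use case for the Loughran and McDonald sentiment lexicon
This may be too high an ask, though, and we need to break these apart and integrate these two bits of information separately. @dgrtwo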
Not at all Julia, happy to help! Let me know if you need any help with this - happy to help out any way I can. That book is really useful and has helped me a lot, so happy to contribute back. :)
Same issue here. After a bit of searching, it looks like the Yahoo and Google services have been deprecated, so it is probably best to remove that bit.
@dgrtwo @juliasilge Do you think it would be better/easier to have a stored Corpus/VCorpus/WebCorpus financial article dataset as part of {tidytext}, removing the dependency on other packages? This would make it possible to demonstrate both of the bullet points you raised.
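To make the idea concrete, here is a rough sketch of how a stored corpus could cover both points. It is only an illustration under assumptions: the three sentences in `articles` and the labels in `toy_lexicon` are invented placeholders, not a real dataset or the real Loughran and McDonald lexicon (which would come from `get_sentiments("loughran")` and needs the textdata package).

```r
library(tm)        # VCorpus, VectorSource, DocumentTermMatrix
library(tidytext)  # tidy() method for DocumentTermMatrix objects
library(dplyr)     # inner_join, count, tibble

# Hypothetical stand-in for a stored financial-article corpus
# (invented sentences, for illustration only).
articles <- c(
  "Acme reported a loss and warned of litigation risk",
  "Globex posted strong profit and improved guidance",
  "Initech faces uncertainty after a weak quarter"
)

corpus <- VCorpus(VectorSource(articles))
dtm <- DocumentTermMatrix(corpus)

# Point 1: tidy() the document-term matrix into
# one row per document-term pair (columns: document, term, count).
tidy_articles <- tidy(dtm)

# Toy lexicon with the same column layout as get_sentiments("loughran")
# (word, sentiment); the words and labels here are made up.
toy_lexicon <- tibble(
  word = c("loss", "litigation", "risk", "profit", "strong", "uncertainty"),
  sentiment = c("negative", "litigious", "uncertainty",
                "positive", "positive", "uncertainty")
)

# Point 2: join the tidied terms to the lexicon and
# count sentiment categories per document.
sentiment_counts <- tidy_articles %>%
  inner_join(toy_lexicon, by = c("term" = "word")) %>%
  count(document, sentiment, wt = count)

print(sentiment_counts)
```

With a real stored corpus, only the `articles` vector and the lexicon would change; the `tidy()` and join steps would stay the same.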
Thank you very much for this report! 🙌 I want to acknowledge it and let you know we are aware and looking for a replacement data source to use in the book.
Just to record it here, ideally we would want to find something that:
- allows us to demonstrate how to `tidy()` a document-term matrix
- is an appropriate use case for the Loughran and McDonald sentiment lexicon
This may be too high an ask, though, and we need to break these apart and integrate these two bits of information separately. @dgrtwo
How about companies' earnings call transcripts? I stumbled upon a site that seems to provide these for free: https://news.alphastreet.com/ (Note: I'm not affiliated with them in any way)
The code scraping in section 5.3.1 no longer works as most of the code in the package `tm.plugin.webmining` is not up-to-date.
I tried switching the `GoogleFinanceSource` to `YahooFinanceSource` but that did not work either.
I am sure there are alternatives, but I figured it is best reported here first.