RobertMyles / tidyRSS

An R package for extracting 'tidy' data frames from RSS, Atom and JSON feeds
https://robertmyles.github.io/tidyRSS/

entry_link is same for all items #55

Closed bryanwhiting closed 3 years ago

bryanwhiting commented 3 years ago

This might be an issue with the feed itself, but the entry_link is the same for every entry (despite title and content being accurate):

library(tidyRSS)
library(dplyr)

tidyfeed('http://feeds.feedburner.com/GDBcode') %>%
  count(entry_link)

1 http://developers.googleblog.com/feeds/6345491852200823648/comments/default    25

# The same is true for these other two feeds
tidyfeed('http://feeds.feedburner.com/blogspot/gJZg') %>% count(entry_link)
tidyfeed('http://feeds.feedburner.com/GoogleOpenSourceBlog') %>% count(entry_link)

Yet I have other FeedBurner feeds that don't have this problem:

tidyfeed('http://feeds.feedburner.com/ProfessorRobJHyndman') %>% count(item_link)
tidyfeed("http://feeds.feedburner.com/kdnuggets-data-mining-analytics") %>% count(item_link)

Side question: why does the Google feed use "entry*" columns rather than "item*"?

Thanks for your help!

RobertMyles commented 3 years ago

Hi Bryan, I'll have a look at that as soon as I can. As for your side question: no idea!

RobertMyles commented 3 years ago

Hi Bryan, no idea why this happens with Google feeds... it must be something in the feed itself. I don't see the raw XML of it there either, so I think this is just something with the structure of these Google feeds themselves.
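
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the Google feeds above appear to be Atom feeds, and Atom uses <entry> elements where RSS 2.0 uses <item>, which would account for the entry_* versus item_* column names. An Atom entry can also carry several <link> elements (for example rel="replies" and rel="alternate"), so which URL ends up in a link column depends on how the links are selected. The example below uses xml2 directly on a small hand-written Atom document to pull the rel="alternate" link for each entry. The example.com URLs and post titles are made up for illustration, and this is not tidyRSS's internal code.

```r
library(xml2)

# A tiny hand-written Atom document standing in for a Blogger-style feed.
# All URLs and titles here are invented for illustration.
atom <- '<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <entry>
    <title>First post</title>
    <link rel="replies" type="application/atom+xml" href="http://example.com/feeds/1/comments/default"/>
    <link rel="alternate" type="text/html" href="http://example.com/2020/01/first-post.html"/>
  </entry>
  <entry>
    <title>Second post</title>
    <link rel="replies" type="application/atom+xml" href="http://example.com/feeds/2/comments/default"/>
    <link rel="alternate" type="text/html" href="http://example.com/2020/02/second-post.html"/>
  </entry>
</feed>'

doc <- read_xml(atom)

# Atom elements live in a namespace, so bind it to a prefix for XPath.
ns <- c(a = "http://www.w3.org/2005/Atom")

# One node per <entry>.
entries <- xml_find_all(doc, "//a:entry", ns)

# Search relative to each entry ("./") so every entry gets its own link,
# and filter on rel="alternate" to skip the comments ("replies") link.
entry_links <- xml_attr(
  xml_find_first(entries, "./a:link[@rel='alternate']", ns),
  "href"
)
entry_titles <- xml_text(xml_find_first(entries, "./a:title", ns))

links_df <- data.frame(
  entry_title = entry_titles,
  entry_link = entry_links,
  stringsAsFactors = FALSE
)
print(links_df)
```

To try this against a real feed, read_xml() can be given the feed URL instead of the inline string. Comparing the result with the entry_link column from tidyfeed() may help show whether the repeated URL comes from the feed or from how the link is picked.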