RobertMyles / tidyRSS

An R package for extracting 'tidy' data frames from RSS, Atom and JSON feeds
https://robertmyles.github.io/tidyRSS/

Error for feeds with no item #63

Closed chainsawriot closed 1 year ago

chainsawriot commented 2 years ago

In this example, the RSS feed is valid but contains no items. tidyfeed() raises an uninformative error. For this case, I would expect an empty tibble.

require(tidyRSS)
#> Loading required package: tidyRSS
tidyfeed("https://www.kn-online.de/arc/outboundfeeds/rss/tags_slug/kiel-restaurants/")
#> GET request successful. Parsing...
#> Error in df[[get("listcol")]][[i]]: subscript out of bounds

Created on 2022-08-12 by the reprex package (v2.0.1)

RobertMyles commented 1 year ago

Looking at this again, @chainsawriot, I think feeds like this are either uncommon or shouldn't be the target of parsing, so I don't want to customise tidyRSS to handle something like this.

chainsawriot commented 1 year ago

@RobertMyles Fine.

For this project, we check around 1000 RSS feeds every few hours and around 10% of them produce this error. So feeds like this are actually not that uncommon, at least in our application. It's an edge case for sure, but I think it would be better to tell users that it is one. Those are valid feeds; if the "contract" in the documentation holds, it should return an empty tibble / list.

https://github.com/RobertMyles/tidyRSS/blob/1a4c5fc954ea9994f3f67a465f132e2edfa75681/R/tidyfeed.R#L18-L20

Even if no tibble is going to be returned, a more informative error message (e.g. "No parsable item in this feed") would be helpful when it raises an error.

If you want that, I can submit a PR.
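
In the meantime, a caller-side workaround could look roughly like the sketch below. It is not part of tidyRSS: `safe_tidyfeed` and its `parser` argument are hypothetical names, and it assumes the empty-feed case can be recognised by the "subscript out of bounds" message shown in the reprex above, which other failures could in principle share. It returns a base `data.frame()` to stay dependency-free; `tibble::tibble()` could be swapped in.

```r
# Hypothetical wrapper, not part of tidyRSS.
# `parser` defaults to tidyRSS::tidyfeed but can be replaced for testing.
safe_tidyfeed <- function(url, parser = tidyRSS::tidyfeed, ...) {
  tryCatch(
    parser(url, ...),
    error = function(e) {
      # Assumption: this message identifies a valid feed with no items.
      if (grepl("subscript out of bounds", conditionMessage(e), fixed = TRUE)) {
        warning("No parsable item in this feed: ", url, call. = FALSE)
        return(data.frame())  # empty result instead of an error
      }
      stop(e)  # re-raise any other error unchanged
    }
  )
}
```

Matching on the error text is fragile, so a check inside the package before the items are indexed would be the more robust fix.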

RobertMyles commented 1 year ago

Sure. I hadn't realised feeds like this were that common; it really did seem like an edge case to me, and I'm reluctant to try to adapt the package to all the messed-up feeds that can be found in the wild (I tried that before). Happy to receive a PR if you'd like to do that; I may not get around to doing this myself for a while.