Closed Arf9999 closed 1 year ago
Update: I've found the issue. It is within the atom_parse function.
If I adjust the following code (lines 32 - 47) of atom_parse.R it works correctly for my usage:
e_link <- xml_find_first(res_entry, glue("{ns_entry}:link")) %>%
  xml_attr("href")
# optional
entries <- tibble(
  entry_title = safe_run(res_entry, "all", glue("{ns_entry}:title")),
  entry_url = safe_run(res_entry, "all", glue("{ns_entry}:id")),
  entry_last_updated = safe_run(res_entry, "all", glue("{ns_entry}:updated")),
  entry_author = safe_run(res_entry, "all", glue("{ns_entry}:author")),
  entry_content = safe_run(res_entry, "all", glue("{ns_entry}:content")),
  entry_link = ifelse(!is.null(e_link), e_link, def),
  entry_summary = safe_run(res_entry, "all", glue("{ns_entry}:summary")),
  entry_category = list(NA),
  entry_published = safe_run(res_entry, "all", glue("{ns_entry}:published")),
  entry_rights = safe_run(res_entry, "all", glue("{ns_entry}:rights"))
)
to:
# optional
entries <- tibble(
  entry_title = safe_run(res_entry, "all", glue("{ns_entry}:title")),
  entry_url = safe_run(res_entry, "all", glue("{ns_entry}:id")),
  entry_last_updated = safe_run(res_entry, "all", glue("{ns_entry}:updated")),
  entry_author = safe_run(res_entry, "all", glue("{ns_entry}:author")),
  entry_content = safe_run(res_entry, "all", glue("{ns_entry}:content")),
  entry_link = xml_attr(xml_find_first(res_entry, glue("{ns_entry}:link")), "href"),
  entry_summary = safe_run(res_entry, "all", glue("{ns_entry}:summary")),
  entry_category = list(NA),
  entry_published = safe_run(res_entry, "all", glue("{ns_entry}:published")),
  entry_rights = safe_run(res_entry, "all", glue("{ns_entry}:rights"))
)
Lines 32 and 33 are deleted, and line 42 is modified.
I'm not sure if this is generalisable, or if it is simply my particular usage.
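For anyone hitting the same symptom, here is a minimal base-R sketch of one likely reason the original line gave every row the same link. It assumes `e_link` is a character vector with one href per entry; the URLs and the `def` value below are made up for illustration and are not taken from the package.

```r
# Stand-ins for the per-entry hrefs (hypothetical URLs).
e_link <- c("https://example.com/a", "https://example.com/b", "https://example.com/c")
def <- NA_character_  # assumed placeholder for the package's default value

# is.null() returns a single TRUE/FALSE, and ifelse() returns a result with
# the same length as its test, so only the first href survives.
collapsed <- ifelse(!is.null(e_link), e_link, def)

# The length-one value is then recycled across every row of the table.
entries <- data.frame(
  entry_title = c("A", "B", "C"),
  entry_link = collapsed,
  stringsAsFactors = FALSE
)
print(entries$entry_link)
```

A tibble recycles length-one columns in the same way, which would match the identical `entry_link` values reported in this issue.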
Thanks Andrew, I'll have a look at this asap.
I've removed an IF statement which I guess was there for a reason, so my mucking about may cause other issues, but it seems ok for me with my particular feed.
I've added this in. Thanks, Andrew; I'll put you down as a contributor.
Yeah... about that...
I figured out the reason for your initial 'if else'. If there are no entries for a feed, the script fails with an error. Hence the original test for is.null (it was just in the wrong place).
Sorry to do this, but could you replace my bad fix with this good one for line 52 of your revised code:
entry_link = ifelse(!is.null(xml_attr(xml_find_first(res_entry, glue("{ns_entry}:link")), "href")),
                    xml_attr(xml_find_first(res_entry, glue("{ns_entry}:link")), "href"), def),
This now includes the test for null entries that I cavalierly deleted from your original code.
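One caveat worth checking before merging: `is.null()` still yields a single logical here, so an `ifelse()` guard of this shape returns one value rather than one per entry. A possible alternative is to test the length of the result and fill missing links afterwards. The sketch below is only an illustration under stated assumptions: `entry_links()` is a hypothetical helper, not part of tidyRSS; the sample feed has no XML namespace (the real code uses `{ns_entry}:link`); and "no-link" stands in for whatever `def` holds.

```r
library(xml2)

# Hypothetical helper: one href per entry, or `default` when the feed has
# no entries or an individual entry has no <link> element.
entry_links <- function(entries, default = NA_character_) {
  href <- xml_attr(xml_find_first(entries, "link"), "href")
  if (length(href) == 0) {
    return(default)            # empty feed: fall back to the default value
  }
  href[is.na(href)] <- default # entries without a link get the default
  href
}

# Made-up, namespace-free feeds for illustration only.
feed <- read_xml('<feed>
  <entry><title>A</title><link href="https://example.com/a"/></entry>
  <entry><title>B</title><link href="https://example.com/b"/></entry>
  <entry><title>C</title></entry>
</feed>')
empty_feed <- read_xml("<feed/>")

links <- entry_links(xml_find_all(feed, "//entry"), default = "no-link")
no_links <- entry_links(xml_find_all(empty_feed, "//entry"), default = "no-link")
print(links)
print(no_links)
```

Whether a length-one default is the right return value for an empty feed depends on what the other columns in the `tibble()` call produce in that case, which this sketch does not cover.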
(I really should do a PR but honestly don't know how to)
Thanks for the package!
Hi Andrew, to do a PR, just fork the repo: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork
I've just pushed a new version of this to CRAN so this new change will have to wait for a little while, but I'd be happy to incorporate it into the package.
I'm using tidyRSS on a Google Alerts RSS feed.
As follows:
All looks good, except...
All entry_link URLs are identical. If I check against the raw Google RSS result, that isn't the case.
Is this an issue with Google, or something that I can adjust in the settings?