Closed boussouf closed 4 years ago
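Thanks for the bug report, @boussouf. That's an interesting problem, see here. I've been using anytime() to parse dates, whereas before v2 I used lubridate. As the reprex below shows, anytime() returns NA for these French-formatted dates, while lubridate parses them when given a French locale. I might go back to lubridate, although this would have failed under previous versions of tidyRSS too.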
library(xml2)
library(httr)
library(magrittr)
library(anytime)
rss <- "https://www.valdemarne.fr/rss.xml"
GET(rss) %>%
  read_xml() %>%
  xml_find_all("channel") %>%
  xml_find_all("item") %>%
  xml_find_all("pubDate") %>%
  xml_text()
#> [1] "Vendredi, 13 Mars, 2020 - 12:22" "Lundi, 24 Février, 2020 - 15:42"
#> [3] "Vendredi, 21 Février, 2020 - 14:17" "Vendredi, 21 Février, 2020 - 11:45"
#> [5] "Mardi, 18 Février, 2020 - 11:00" "Vendredi, 14 Février, 2020 - 13:22"
#> [7] "Mardi, 11 Février, 2020 - 16:24" "Mardi, 11 Février, 2020 - 16:01"
#> [9] "Mardi, 11 Février, 2020 - 14:20" "Mardi, 11 Février, 2020 - 11:40"
GET(rss) %>%
  read_xml() %>%
  xml_find_all("channel") %>%
  xml_find_all("item") %>%
  xml_find_all("pubDate") %>%
  xml_text() %>%
  anytime()
#> [1] NA NA NA NA NA NA NA NA NA NA
GET(rss) %>%
  read_xml() %>%
  xml_find_all("channel") %>%
  xml_find_all("item") %>%
  xml_find_all("pubDate") %>%
  xml_text() %>%
  lubridate::parse_date_time("dmy hm", locale = "fr_FR.UTF-8")
#> Warning: hms, hm and ms usage is deprecated, please use HMS, HM or MS instead.
#> Deprecated in version '1.5.6'.
#> [1] "2020-03-13 12:22:00 UTC" "2020-02-24 15:42:00 UTC"
#> [3] "2020-02-21 14:17:00 UTC" "2020-02-21 11:45:00 UTC"
#> [5] "2020-02-18 11:00:00 UTC" "2020-02-14 13:22:00 UTC"
#> [7] "2020-02-11 16:24:00 UTC" "2020-02-11 16:01:00 UTC"
#> [9] "2020-02-11 14:20:00 UTC" "2020-02-11 11:40:00 UTC"
Created on 2020-02-26 by the reprex package (v0.3.0)
For the moment, I don't have a quick fix for this, though I'll have something in version 2.0.1. That will be here on GH in the next few weeks but will take a while to get to CRAN as I don't want to spam them with releases.
Tracking this here: https://github.com/RobertMyles/tidyRSS/projects/3
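In the meantime, a possible stopgap is to parse these strings by hand, so that nothing depends on a French locale being installed on the machine. The following is only a minimal sketch, not the planned tidyRSS fix: `parse_french_pubdate()` is a hypothetical helper name, it uses base R only, and it assumes every value follows the exact layout seen in this feed ("Weekday, DD Month, YYYY - HH:MM"). Anything else comes back as NA.

```r
# Hypothetical helper (not part of tidyRSS): parses strings shaped like
# "Vendredi, 13 Mars, 2020 - 12:22" without needing a French system locale.
parse_french_pubdate <- function(x, tz = "UTC") {
  # French month names in lower case; the position gives the month number.
  # Accented letters are written as \u escapes so the source stays ASCII.
  months_fr <- c(
    "janvier", "f\u00e9vrier", "mars", "avril", "mai", "juin",
    "juillet", "ao\u00fbt", "septembre", "octobre", "novembre",
    "d\u00e9cembre"
  )

  # Capture day, month name, year, hour and minute; the weekday is skipped.
  pattern <- paste0(
    "^[^,]+,\\s*([0-9]{1,2})\\s+([^,[:space:]]+),",
    "\\s*([0-9]{4})\\s*-\\s*([0-9]{1,2}):([0-9]{2})\\s*$"
  )
  parts <- regmatches(x, regexec(pattern, x))

  # Rebuild each match as an unambiguous "YYYY-MM-DD HH:MM" string.
  iso <- vapply(parts, function(p) {
    if (length(p) != 6) return(NA_character_)  # no match (or NA input)
    month <- match(tolower(p[3]), months_fr)
    if (is.na(month)) return(NA_character_)    # unknown month name
    sprintf("%s-%02d-%02d %02d:%s",
            p[4], month, as.integer(p[2]), as.integer(p[5]), p[6])
  }, character(1))

  as.POSIXct(iso, format = "%Y-%m-%d %H:%M", tz = tz)
}

# Usage on two of the values returned by the feed above
parse_french_pubdate(c(
  "Vendredi, 13 Mars, 2020 - 12:22",
  "Lundi, 24 F\u00e9vrier, 2020 - 15:42"
))
```

The feed does not state a time zone, so `tz = "UTC"` is an assumption here, matching what the lubridate call above returns by default.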
Hello,
pubDate values in French format are sometimes not parsed by the tidyRSS function. Unfortunately, the returned column is NA, so we lose this information.
Example: View(tidyfeed("https://www.valdemarne.fr/rss.xml"))
Thank you