RobertMyles / tidyRSS

An R package for extracting 'tidy' data frames from RSS, Atom and JSON feeds
https://robertmyles.github.io/tidyRSS/

Strip out common unwanted characters #48

Closed by RobertMyles 4 years ago

RobertMyles commented 4 years ago

E.g. the runs of leading newlines and spaces that show up at the start of item_description: \n\n\n\n\n\n// Define the...

RobertMyles commented 4 years ago

Before:

> tidyfeed("http://abigailsee.com/feed.xml")
GET request successful. Parsing...

# A tibble: 5 x 7
  feed_link   feed_pub_date       item_title      item_link         item_description             item_pub_date       item_guid        
  <chr>       <dttm>              <chr>           <chr>             <chr>                        <dttm>              <chr>            
1 http://abi… 2019-08-21 18:54:45 What makes a g… http://abigailse… "\nThis blog post is about … 2019-08-12 23:00:00 http://abigailse…
2 http://abi… 2019-08-21 18:54:45 Deep Learning,… http://abigailse… "\n\n  \n     (and video pl… 2018-02-20 23:00:00 http://abigailse…
3 http://abi… 2019-08-21 18:54:45 Four deep lear… http://abigailse… "\n\n\n\n\n\n// Define the … 2017-08-29 23:00:00 http://abigailse…
4 http://abi… 2019-08-21 18:54:45 Four deep lear… http://abigailse… "\n\n\n\n\n\n// Define the … 2017-08-29 23:00:00 http://abigailse…
5 http://abi… 2019-08-21 18:54:45 Taming Recurre… http://abigailse… "\nThis blog post is about … 2017-04-15 23:00:00 http://abigailse…

After:

> tidyfeed("http://abigailsee.com/feed.xml")
GET request successful. Parsing...

# A tibble: 5 x 7
  feed_link   feed_pub_date       item_title      item_link         item_description             item_pub_date       item_guid        
  <chr>       <dttm>              <chr>           <chr>             <chr>                        <dttm>              <chr>            
1 http://abi… 2019-08-21 18:54:45 What makes a g… http://abigailse… "This blog post is about th… 2019-08-12 23:00:00 http://abigailse…
2 http://abi… 2019-08-21 18:54:45 Deep Learning,… http://abigailse… "(and video player) will re… 2018-02-20 23:00:00 http://abigailse…
3 http://abi… 2019-08-21 18:54:45 Four deep lear… http://abigailse… "// Define the div for the … 2017-08-29 23:00:00 http://abigailse…
4 http://abi… 2019-08-21 18:54:45 Four deep lear… http://abigailse… "// Define the div for the … 2017-08-29 23:00:00 http://abigailse…
5 http://abi… 2019-08-21 18:54:45 Taming Recurre… http://abigailse… "This blog post is about th… 2017-04-15 23:00:00 http://abigailse…
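
For anyone who wants the same cleanup on their own columns, here is a minimal sketch of the kind of stripping shown above. It is not necessarily how tidyRSS does it internally: `clean_description()` is a hypothetical helper name, and the sample strings are made up to resemble the "Before" output.

```r
# Hypothetical helper (not part of tidyRSS): remove whitespace, including
# newlines and tabs, from the start and end of each string.
clean_description <- function(x) {
  gsub("^[[:space:]]+|[[:space:]]+$", "", x)
}

# Made-up sample inputs shaped like the item_description values above
descriptions <- c(
  "\nThis blog post is about",
  "\n\n  \n     (and video player)",
  "\n\n\n\n\n\n// Define the div"
)

clean_description(descriptions)
```

Only the ends of each string are touched, so newlines inside a description are left alone. Base R's `trimws()` would do much the same for spaces, tabs and newlines.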