RobertMyles / tidyRSS

An R package for extracting 'tidy' data frames from RSS, Atom and JSON feeds
https://robertmyles.github.io/tidyRSS/

403 error #79

Closed: HugoGit39 closed this issue 4 months ago

HugoGit39 commented 4 months ago

When I run this code, I get a 403 error. Any idea how to solve this?

library(httr)   # GET()
library(purrr)  # safely()

feed <- "https://cointelegraph.com/rss"

safe_get <- function(feed, user = NULL, config = list()) {
  # Wrap GET so a failed request is captured in $error instead of thrown
  safeget <- safely(GET)
  req <- safeget(feed, user, config)

  # Connection-level failure (DNS, timeout, ...)
  if (!is.null(req$error)) {
    msg <- paste0("Attempt to fetch feed resulted in an error: ", req$error)
    stop(msg)
  }
  # Any non-200 response, including a 403, stops here
  status <- req$result$status_code
  if (status != 200L) {
    stop("Attempt to get feed was unsuccessful (non-200 response). Feed may not be available.")
  } else {
    message("GET request successful. Parsing...\n")
  }
  result <- req$result # nocov
  return(result) # nocov
}

RobertMyles commented 4 months ago

Hi Hugo,

I'm not sure what you're trying to do by running that particular piece of code. If you want the feed, it works normally with the following:

tidyRSS::tidyfeed("https://cointelegraph.com/rss")
GET request successful. Parsing...

# A tibble: 30 × 15
   feed_title            feed_link feed_description feed_language feed_pub_date       feed_last_build_date feed_category
   <chr>                 <chr>     <chr>            <chr>         <dttm>              <dttm>               <chr>        
 1 Cointelegraph.com Ne… https://… Cointelegraph c… "\n         … 2024-04-22 09:06:45 2024-04-22 09:23:20  Staking      
 2 Cointelegraph.com Ne… https://… Cointelegraph c… "\n         … 2024-04-22 09:06:45 2024-04-22 09:23:20  Staking      
 3 Cointelegraph.com Ne… https://… Cointelegraph c… "\n         … 2024-04-22 09:06:45 2024-04-22 09:23:20  Staking      
 4 Cointelegraph.com Ne… https://… Cointelegraph c… "\n         … 2024-04-22 09:06:45 2024-04-22 09:23:20  Staking      
 5 Cointelegraph.com Ne… https://… Cointelegraph c… "\n         … 2024-04-22 09:06:45 2024-04-22 09:23:20  Staking      
 6 Cointelegraph.com Ne… https://… Cointelegraph c… "\n         … 2024-04-22 09:06:45 2024-04-22 09:23:20  Staking      
 7 Cointelegraph.com Ne… https://… Cointelegraph c… "\n         … 2024-04-22 09:06:45 2024-04-22 09:23:20  Staking      
 8 Cointelegraph.com Ne… https://… Cointelegraph c… "\n         … 2024-04-22 09:06:45 2024-04-22 09:23:20  Staking      
 9 Cointelegraph.com Ne… https://… Cointelegraph c… "\n         … 2024-04-22 09:06:45 2024-04-22 09:23:20  Staking      
10 Cointelegraph.com Ne… https://… Cointelegraph c… "\n         … 2024-04-22 09:06:45 2024-04-22 09:23:20  Staking      
# ℹ 20 more rows
# ℹ 8 more variables: feed_generator <chr>, item_title <chr>, item_link <chr>, item_description <chr>,
#   item_pub_date <dttm>, item_guid <chr>, item_enclosure <list>, item_category <list>
# ℹ Use `print(n = ...)` to see more rows

Created on 2024-04-22 with reprex v2.0.2

HugoGit39 commented 4 months ago

Hi Robert

That safe_get code is what produces the error:

[image: screenshot of the error]

By the way, I run tidyRSS on a VM on DigitalOcean. Could the request be getting blocked because the RSS feed has anti-scraping protection? That seems quite far-fetched, to be honest, since other RSS feeds do work on the VM.

RobertMyles commented 4 months ago

Hi Hugo, I can only guess that it is a network issue on your side. From my local machine it's working fine. Have you tried running it elsewhere?
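
One way to narrow down a 403 like this is to request the feed directly and compare the status code across machines or user agents. The sketch below is only an illustration, not part of tidyRSS: check_feed_status, explain_status and the "my-feed-reader/0.1" agent string are hypothetical names, and it assumes the httr package is installed. A 403 means the server understood the request but refused it. Whether the cause here is the VM's IP range or the client's user agent is a guess that this check can help test, not something the thread confirms.

```r
# Requires the httr package (called via httr:: so nothing is attached).

# Hypothetical helper: request the feed and return only the HTTP status code.
# `agent` is an optional User-Agent string; NULL keeps httr's default.
check_feed_status <- function(feed, agent = NULL) {
  resp <- if (is.null(agent)) {
    httr::GET(feed)
  } else {
    httr::GET(feed, httr::user_agent(agent))
  }
  httr::status_code(resp)
}

# Hypothetical helper: map a status code to a rough troubleshooting hint.
explain_status <- function(status) {
  if (status == 200L) {
    "OK: the server returned the feed"
  } else if (status == 403L) {
    "Forbidden: the server refused this client; compare networks or user agents"
  } else {
    paste("Unexpected status:", status)
  }
}

# Example (needs network access, so left commented out):
# feed <- "https://cointelegraph.com/rss"
# explain_status(check_feed_status(feed))
# explain_status(check_feed_status(feed, agent = "my-feed-reader/0.1"))
```

If the same call returns 200 from a local machine and 403 from the VM, that points to the server treating the VM's network differently. If changing the user agent alone flips the result, the client identification is the more likely trigger.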