wiscostret / fplscrapR

This package enables those interested in Fantasy Premier League to perform detailed data analysis of the game, using the FPL's JSON API. The fplscrapR functions help R users collect and parse data from the Official Fantasy Premier League website.
Creative Commons Zero v1.0 Universal

Obtain specific page for get_league_entries() #28

Open thomaszwagerman opened 2 years ago

thomaszwagerman commented 2 years ago

Thank you for this great package!

Apologies in advance for a long issue, but I have a bit of a niche request for get_league_entries() and hopefully it is helpful!

Basically, I'd like to obtain the number of points needed for a given overall rank. Because each standings page holds 50 entries, it's easy to work out which page a rank falls on: rank 100,000 is on page 2000 of the overall standings (100000 / 50).

# Get pages 1 to 2000 (rank 100,000 is on page 2000)
df_ranks <- get_league_entries(leagueid = 314, pages = 2000)
# Takes a long time

df_rank_100000 <- df_ranks %>%
  dplyr::filter(rank_sort == 100000) %>%
  dplyr::select(rank = rank_sort, total)

Because get_league_entries() loops for (i in 1:pages), every page from 1 up to pages is always fetched, even though we're only interested in page 2000. This can take a long time to run, especially if I'm interested in rank 1 million, for example.
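As a rough sketch of the arithmetic above: the helpers below work out which page a rank sits on and build the URL for that single page. page_for_rank() and standings_url() are hypothetical names, not part of fplscrapR, and the page size of 50 is an assumption taken from the observation above. The URL pattern is the one get_league_entries() already uses.

```r
# Hypothetical helpers for illustration; not part of fplscrapR.
# Assumes 50 entries per standings page.
page_for_rank <- function(rank, page_size = 50) {
  ceiling(rank / page_size)
}

# Build the standings URL for one page, following the URL pattern
# used by get_league_entries().
standings_url <- function(leagueid, page, leaguetype = "classic") {
  paste0(
    "https://fantasy.premierleague.com/api/leagues-", leaguetype,
    "/", leagueid,
    "/standings/?page_standings=",
    as.integer(page)  # integer, so large pages are never pasted as "1e+05"
  )
}

page <- page_for_rank(100000)
url <- standings_url(314, page)
```

Passing url to jsonlite::fromJSON() would then request only that one page rather than pages 1 to 2000.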

So, I'm proposing to add some functionality to get_league_entries(), by being able to specify a page. Some limitations I tried to stick to:

I think the best approach is to add a logical argument, specificpage, which defaults to FALSE. When set to TRUE by the user it will obtain that page only. As far as I can tell, this works without breaking current usage:

get_league_entries <- function(leagueid = NULL,
                               leaguetype = "classic",
                               pages = 1,
                               specificpage = FALSE){
  if(is.null(leagueid)) stop("You'll need to input a league ID, mate.")
  if(length(leagueid) != 1) stop("One league at a time, please.")
  # Check specificpage first, since the pages checks depend on it
  if(!is.logical(specificpage)) stop("specificpage can only be TRUE/FALSE")
  if(is.list(pages) && isFALSE(specificpage)) stop("Can only supply a list if specificpage == TRUE")
  # unlist() so the whole-number check also covers a list of pages
  if(any(unlist(pages) %% 1 != 0)) stop(
    "The number of pages needs to be a whole number, or a list of numbers when specificpage == TRUE."
  )

  {
    entries <- data.frame()
    if(specificpage == FALSE) {
      for (i in 1:pages){

        standings <- jsonlite::fromJSON(
          paste(
            "https://fantasy.premierleague.com/api/leagues-",
            leaguetype,
            "/",
            leagueid,
            "/standings/?page_standings=",
            i,
            sep = "")
        )

        entries <- rbind(entries, standings$standings$results)

      }
    } else if(specificpage == TRUE) {
      for(i in pages) {
        standings <- jsonlite::fromJSON(
          paste(
            "https://fantasy.premierleague.com/api/leagues-",
            leaguetype,
            "/",
            leagueid,
            "/standings/?page_standings=",
            i,
            sep = "")
        )

        entries <- rbind(entries,standings$standings$results)
      }
    }
    return(entries)
  }
}
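The two branches above differ only in which page numbers they loop over, so one possible simplification is to work out that vector first and keep a single loop. A minimal sketch, assuming a hypothetical helper resolve_pages() that is not part of fplscrapR:

```r
# Hypothetical helper for illustration; not part of fplscrapR.
# Turns the `pages` argument into the vector of page numbers to request.
resolve_pages <- function(pages, specificpage = FALSE) {
  if (isTRUE(specificpage)) {
    # Only the requested pages; unlist() accepts a list or a plain vector
    as.integer(unlist(pages))
  } else {
    # Current behaviour: every page from 1 up to `pages`
    seq_len(pages)
  }
}

all_pages <- resolve_pages(3)
some_pages <- resolve_pages(list(1, 2, 20), specificpage = TRUE)
```

The body of get_league_entries() could then use one for (i in resolve_pages(pages, specificpage)) loop around the existing jsonlite::fromJSON() call.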

So as an example, if I want to know the points required for ranks 1, 100, 1000, 10000 and 100000:

ranks_of_interest <- c(1, 100, 1000, 10000, 100000)

list_of_pages <- ceiling(ranks_of_interest/50) %>%
  as.list()

ranks <- get_league_entries(leagueid = 314,
                            pages = list_of_pages,
                            specificpage = TRUE)

ranks <- ranks %>%
  dplyr::filter(rank_sort %in% ranks_of_interest) %>%
  dplyr::select(rank = rank_sort, total)

# In Gameweek 6:
> ranks
    rank total
1      1   492
2    100   459
3   1000   444
4  10000   428
5 100000   408

Which leads us nicely into a new function:

get_points_for_rank <- function(ranks_of_interest) {
  list_of_pages <- ceiling(ranks_of_interest/50) %>%
    as.list()

  ranks <- get_league_entries(leagueid = 314,
                              pages = list_of_pages,
                              specificpage = TRUE)

  # Return the filtered table visibly (ending on an assignment returns it invisibly)
  ranks %>%
    dplyr::filter(rank_sort %in% ranks_of_interest) %>%
    dplyr::select(rank = rank_sort, total)
}
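One thing to watch: if two ranks of interest fall on the same page (say ranks 1 and 40), that page appears twice in list_of_pages, so it would be fetched twice and its rows would be duplicated by rbind(). A small sketch of one way around that, assuming a hypothetical helper pages_for_ranks() (not part of fplscrapR) and 50 entries per page:

```r
# Hypothetical helper for illustration; not part of fplscrapR.
# Assumes 50 entries per standings page.
pages_for_ranks <- function(ranks_of_interest, page_size = 50) {
  # unique() drops repeated page numbers so no page is requested twice
  as.list(unique(ceiling(ranks_of_interest / page_size)))
}

# Ranks 1 and 40 share page 1; rank 100 is on page 2
list_of_pages <- pages_for_ranks(c(1, 40, 100))
```

The result could be passed as pages with specificpage = TRUE in place of the list_of_pages built inside get_points_for_rank().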

I'll open a PR for you to review. Let me know whether this is functionality you'd like to add to get_league_entries(), or whether it should be a new function altogether.

I don't think this breaks anything, but you will know much better than I do. Happy to make changes and go with what you think is best.