rladies / meetupr

R interface to the meetup.com API
https://rladies.github.io/meetupr
MIT License

Bug in internals.R? #55

Closed benubah closed 3 years ago

benubah commented 5 years ago

While using this package to explore R User Groups with a modified version of find_groups() that retrieves additional fields (provided in this pull request: https://github.com/rladies/meetupr/pull/48 ), I ran into unexpected behavior.

Consider the code below:

all_ruser_groups <- find_groups(text = "r-project-for-statistical-computing", fields = "past_event_count, upcoming_event_count, last_event, topics", api_key = meetup_api_key)

This query returns 481 records. For records 0 - 199, everything is fine. But for records 200 - 481, there are no entries for upcoming_event_count, last_event, or topics; only the first additional field, past_event_count, is returned.

My guess is that the problem comes from the following portion of the .fetch_results() function in internals.R:

# If you have not yet retrieved all records, calculate the # of remaining calls required
  extra_calls <- ifelse(
    (length(records) < total_records) & !is.null(res$headers$link),
    floor(total_records/length(records)),
    0)
  if (extra_calls > 0) {
    all_records <- list(records)
    for (i in seq(extra_calls)) {
      # Keep making API requests with an increasing offset value until you get all the records
      # TO DO: clean this strsplit up or replace with regex
      next_url <- strsplit(strsplit(res$headers$link, split = "<")[[1]][2], split = ">")[[1]][1]
      res <- .quick_fetch(next_url, api_key, event_status)
      all_records[[i + 1]] <- res$result
    }
    records <- unlist(all_records, recursive = FALSE)
  }

I rewrote this portion and got a working solution for my work. I could send a pull request soon.
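
For reference, the two fragile spots in the snippet above (the nested strsplit() and the extra-call count) can be sketched in isolation. This is only an illustration under assumptions, not the package's code: parse_next_link() and extra_calls_needed() are hypothetical helper names, and the sample header string is made up in the general `<url>; rel="next"` shape of an HTTP Link header.

```r
# Hypothetical helper: pull the rel="next" URL out of an HTTP Link header.
parse_next_link <- function(link_header) {
  if (is.null(link_header)) return(NULL)
  # Capture the URL between "<" and ">" that is tagged rel="next".
  pattern <- '<([^>]+)>;[[:space:]]*rel="next"'
  m <- regmatches(link_header, regexec(pattern, link_header))[[1]]
  if (length(m) < 2) NULL else m[2]
}

# Hypothetical helper: number of requests still needed after the first page.
extra_calls_needed <- function(total_records, page_size) {
  max(ceiling(total_records / page_size) - 1, 0)
}

# Made-up header value, for illustration only.
link <- '<https://api.example.com/find/groups?offset=1>; rel="next"'
parse_next_link(link)

extra_calls_needed(481, 200)  # 2 extra calls
extra_calls_needed(400, 200)  # 1 extra call; floor(400 / 200) would give 2
```

The count differs from floor(total_records/length(records)) only when the total is an exact multiple of the page size, in which case floor() asks for one call too many.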

@ledell @LucyMcGowan Any thoughts?

ledell commented 5 years ago

@benubah Can you describe what the issue with the code is and how you solved it? (It's hard to tell without looking at a git diff.)

Also want to make sure you are aware of this issue and re-factoring that's happening (or will happen): https://github.com/rladies/meetupr/issues/51

benubah commented 5 years ago

@ledell My guess is that the code does not add the offset parameter to the follow-up API calls once there are more than 200 records. Records beyond the first 200 are returned on further pages, and to access those pages one has to set offset to a page number. In the current code, I don't see offset being included directly when the extra calls are made.

For example: there are more than 450 R user groups on meetup.com, so I have to query the API 3 times with a different offset each time to see all of those records.

# first call, offset is automatically set to 0 by the API. Returns 200 records
https://api.meetup.com/find/groups?radius=global&text=r-project-for-statistical-computing&topic_id=98380&key=xxxxxxx

# second call returns another 200 records
https://api.meetup.com/find/groups?radius=global&text=r-project-for-statistical-computing&topic_id=98380&key=xxxxxxx&offset=1

# third call returns the remaining records
https://api.meetup.com/find/groups?radius=global&text=r-project-for-statistical-computing&topic_id=98380&key=xxxxxxx&offset=2

Here is how I include an offset that increments with each extra call:

# Total number of pages, including the first one that was already fetched
offsetn <- ceiling(total_records/length(records))
# seq_len() skips the loop entirely when there is only one page
for (i in seq_len(offsetn - 1)) {
  res <- .quick_fetch(api_url = api_url,
                      api_key = api_key,
                      event_status = event_status,
                      offset = i,
                      ...)
  all_records[[i + 1]] <- res$result
}
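
The loop above can be tried without network access. A minimal sketch, assuming a page size of 200 and a zero-based page offset as in the URLs above; fake_fetch() is a made-up stand-in for .quick_fetch() and fetch_all() is a hypothetical wrapper, so neither is part of meetupr.

```r
# Made-up stand-in for a single API call: returns one page of a fake result set.
fake_fetch <- function(offset = 0, page_size = 200, total = 481) {
  start <- offset * page_size + 1
  if (start > total) return(list(result = list(), total = total))
  end <- min(start + page_size - 1, total)
  list(result = as.list(start:end), total = total)
}

# Hypothetical wrapper: fetch page 0, then request each remaining page by offset.
fetch_all <- function(page_size = 200, total = 481) {
  res <- fake_fetch(offset = 0, page_size = page_size, total = total)
  all_records <- list(res$result)
  n_pages <- ceiling(res$total / page_size)
  # seq_len() yields an empty sequence when only one page exists,
  # whereas 1:(n_pages - 1) would count down from 1 to 0.
  for (i in seq_len(max(n_pages - 1, 0))) {
    res <- fake_fetch(offset = i, page_size = page_size, total = total)
    all_records[[i + 1]] <- res$result
  }
  unlist(all_records, recursive = FALSE)
}

records <- fetch_all()
length(records)  # 481
```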

I submitted a pull request with my changes: https://github.com/rladies/meetupr/pull/57

Yes, I am aware of the ongoing re-factoring based on OAUTH authentication. I have been watching to see the direction things will take as my work depends largely on this package.

Thank you.

ledell commented 5 years ago

@benubah The .quick_fetch() function is designed to do a single fetch, whereas the .fetch_results() function is meant to do the subsequent calls (if needed). I thought this code already worked for multiple calls? Are you saying that .fetch_results() does not work as intended? https://github.com/rladies/meetupr/blob/master/R/internals.R#L55

benubah commented 5 years ago

@ledell Yes, .fetch_results() works fine for multiple subsequent calls (if needed), but it shows an unexpected behavior when called from the newly proposed find_groups() in https://github.com/rladies/meetupr/pull/48

This unexpected behavior occurs when more than one additional field is requested. To be clearer, please see the following code:

# this is the modified find_groups() in my pull request that allows requesting additional fields
# please note that I am requesting four additional data attributes using fields
all_ruser_groups <- find_groups(text = "r-project-for-statistical-computing", fields = "past_event_count, upcoming_event_count,  last_event, topics", api_key = meetup_api_key)

# for 1 - 200, record parameters are fully returned - default and additional
all_ruser_groups$resource[[1]]$last_event
all_ruser_groups$resource[[100]]$topics
all_ruser_groups$resource[[200]]$last_event
# returns the correct values

# For 201 - 482, additional parameters in `fields` are not fully returned, but default parameters are returned.
all_ruser_groups$resource[[201]]$last_event
all_ruser_groups$resource[[482]]$topics
all_ruser_groups$resource[[482]]$upcoming_event_count
# they all return NULL, but they have values at meetup

This meant to me that .fetch_results() did not work as intended in this case. I therefore modified https://github.com/rladies/meetupr/blob/master/R/internals.R#L55 to https://github.com/rladies/meetupr/pull/57/commits/ec0943b3753431d5de986ab9f4f14b8e55ddf470#diff-1b62d1c1d07f967b6ebe9bbd20df1348R57

This returned all additional fields completely.
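
One way to see exactly where the additional fields drop out is to scan every record instead of spot-checking a few indices. A minimal sketch on made-up data: fake_groups mimics the shape of all_ruser_groups$resource as used above (a list of per-group lists), and missing_field() is a hypothetical helper, not a meetupr function.

```r
# Made-up records: the first two carry the extra field, the last two do not.
fake_groups <- list(
  list(name = "a", past_event_count = 3, last_event = list(time = 1)),
  list(name = "b", past_event_count = 0, last_event = list(time = 2)),
  list(name = "c", past_event_count = 5),
  list(name = "d", past_event_count = 1)
)

# Hypothetical helper: indices of records where `field` is NULL.
missing_field <- function(records, field) {
  which(vapply(records, function(x) is.null(x[[field]]), logical(1)))
}

missing_field(fake_groups, "last_event")        # 3 4
missing_field(fake_groups, "past_event_count")  # integer(0)
```

Applied to the real result, range(missing_field(all_ruser_groups$resource, "last_event")) would show whether the gap begins at record 201, though a group that has never held an event may legitimately have no last_event.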

The data obtained via additional fields in find_groups() is vital to my work, because I am trying to dig a little deeper into R User Groups and R-Ladies Groups.

benubah commented 5 years ago

@ledell Just to show how useful the additional fields are to my project, please see the screenshot below. We can see the Last Event date, calculate Months Inactive, and show Past and Upcoming Event Counts right in the pop-ups on the map. We can apply different colors to Active, Inactive and Unbegun groups. This map was possible with the currently available .fetch_results() because there are fewer than 200 R-Ladies Groups. For the 482 R User Groups, however, I can only retrieve these extra parameters for the first 200. This problem appears to be resolved by the modification in https://github.com/rladies/meetupr/pull/57

image

Thank you

ledell commented 5 years ago

@benubah See my comments here: https://github.com/rladies/meetupr/pull/57#issuecomment-508526538

ledell commented 4 years ago

Hi @benubah Do you know if this is still an issue? I know there were some internal changes after the OAuth upgrade, so I am wondering if this has been resolved.

benubah commented 4 years ago

Hi @ledell, I believe it will only be an issue when https://github.com/rladies/meetupr/pull/48 is merged into the codebase. This means it is not an issue if find_topics() remains the way it is at the moment.

benubah commented 4 years ago

@ledell I have added the required modifications to take care of this issue in this PR https://github.com/rladies/meetupr/pull/48

maelle commented 3 years ago

#48 has been merged, so should this issue be closed? I am working on OAuth stuff and looking at open issues.

maelle commented 3 years ago

@ledell @benubah is this issue still valid? or should it be closed?

benubah commented 3 years ago

Feel free to close.

maelle commented 3 years ago

thank you!