pablobarbera / Rfacebook

Dev version of Rfacebook package: Access to Facebook API via R
http://cran.r-project.org/web/packages/Rfacebook

`getPage()` with `feed = TRUE` not returning posts that `feed = FALSE` does #105

Closed: clente closed this issue 7 years ago

clente commented 7 years ago

I've been using this package for a couple of days and it is truly amazing, but I encountered a small problem when running `getPage()`: not every post returned with `feed = FALSE` is also returned with `feed = TRUE`.

I was getting the posts from multiple public pages and found that this happened with one of them. This is the smallest reproducible example I could come up with.

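library(Rfacebook)
library(dplyr)  # provides %>%, select(), and setdiff() for data frames

# `token` is assumed to be a valid access token created beforehand

# Get posts from page (only from page admin)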
posts <- getPage(
  "panvelfarmacias", token, 1000,
  '2017/01/01', '2017/01/31'
  ) %>%
  select(id, from_id, from_name)

# Get posts from page (everyone)
posts_feed <- getPage(
  "panvelfarmacias", token, 1000,
  '2017/01/01', '2017/01/31',
  feed = TRUE) %>%
  select(id, from_id, from_name)

# There shouldn't be any post in 'posts' that is not in 'posts_feed'
setdiff(posts, posts_feed)

#>                               id      from_id        from_name
#> 1 171378594202_10155054433869203 171378594202 Panvel Farmácias
#> 2 171378594202_10155052442979203 171378594202 Panvel Farmácias
#> 3 171378594202_10155049212349203 171378594202 Panvel Farmácias
#> 4 171378594202_10155042263259203 171378594202 Panvel Farmácias
#> 5 171378594202_10155042261309203 171378594202 Panvel Farmácias
#> 6 171378594202_10155033718669203 171378594202 Panvel Farmácias
#> 7 171378594202_10155033689299203 171378594202 Panvel Farmácias
#> 8 171378594202_10155003038309203 171378594202 Panvel Farmácias

pablobarbera commented 7 years ago

This seems to be an issue with paging in the API. When I recreate the API calls made by these functions, I get the same results through the API explorer as through Rfacebook, for both the page posts and the page feed. But when I set the limit to 100 instead of 25, the feed returns posts that were not there before.
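To pin down which posts drop out when the limit changes, it may help to compare two result sets by post id in both directions. The sketch below uses only base R; `compare_ids` is a hypothetical helper (not part of Rfacebook), and it assumes each input is a data frame with an `id` column, as in the example above.

```r
# Hypothetical helper, not part of Rfacebook: compare two result sets by post id.
# Assumes `a` and `b` are data frames that each have an `id` column.
compare_ids <- function(a, b) {
  ids_a <- as.character(a$id)
  ids_b <- as.character(b$id)
  list(
    only_in_a = base::setdiff(ids_a, ids_b),   # ids returned only by the first request
    only_in_b = base::setdiff(ids_b, ids_a),   # ids returned only by the second request
    in_both   = base::intersect(ids_a, ids_b)  # ids returned by both requests
  )
}
```

For instance, `compare_ids(posts, posts_feed)$only_in_a` would list the ids that the page-posts request returned but the feed request did not.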

I'm not sure how to deal with this. I tried to identify whether particular posts are affected, but there doesn't seem to be anything unique about the posts that the API does not return. I don't want to change the limit option in the API query, because a limit above 25 often leads to "too much data" errors. And a quick Google search for this issue with the general API did not return anything relevant.
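One possible workaround, offered only as an untested idea and not confirmed to resolve this issue, is to keep the limit as it is, query shorter date windows, and de-duplicate the combined results by post id. In the sketch below, `fetch_by_window` and its `fetch_fun` argument are hypothetical names: `fetch_fun` stands for any function of `(since, until)` that returns a data frame with an `id` column.

```r
# Sketch only: fetch a date range in smaller windows and de-duplicate by id.
# `fetch_fun` is a hypothetical argument: any function(since, until) that
# returns a data frame with an `id` column.
fetch_by_window <- function(fetch_fun, since, until, days = 7) {
  since <- as.Date(since)
  until <- as.Date(until)

  # Window starts every `days` days; each window ends where the next begins,
  # so adjacent windows overlap by one day and boundary posts are not lost.
  starts <- seq(since, until, by = days)
  ends   <- pmin(starts + days, until)

  # Pass dates as "YYYY/MM/DD" strings, the format used in the example above.
  since_chr <- format(starts, "%Y/%m/%d")
  until_chr <- format(ends, "%Y/%m/%d")

  chunks <- unname(Map(fetch_fun, since_chr, until_chr))
  out <- do.call(rbind, chunks)

  # Posts that fall on a shared boundary day can appear twice; keep the first.
  out <- out[!duplicated(out$id), , drop = FALSE]
  rownames(out) <- NULL
  out
}
```

With Rfacebook, `fetch_fun` could be a thin wrapper such as `function(since, until) getPage("panvelfarmacias", token, n = 1000, since = since, until = until, feed = TRUE)`. Whether smaller windows actually avoid the missing posts is an assumption that would need checking against the API.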

So.... I'll leave this open for now while I think about it. Thanks for reporting this!