code-google-com / python-twitter

Automatically exported from code.google.com/p/python-twitter
Apache License 2.0

paging not working getfollowers(page=?) perhaps next_cursor is required now #113

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
api.GetFollowers(page=1)

produces same result as 

api.GetFollowers(page=2)

I read in the api wiki that page is going to be deprecated in favor of
next_cursor. I think this may have happened already because I cannot get
paging to work. Thoughts anyone?

The version I am using of python-twitter was copied today via mercurial.

Original issue reported on code.google.com by merrick@gmail.com on 18 Dec 2009 at 8:32

GoogleCodeExporter commented 9 years ago
I've attached a patch for this.

The problem lies in the way Twitter has implemented the change. If you provide
no cursor parameter, the API returns a list of users, just as it does
currently. To use cursors to iterate through the list, you must make your
initial call with cursor=-1. The JSON response then becomes a dictionary like:
{"next_cursor":<number>, "previous_cursor":<number>, "users":[list of users]}.

A short example to fetch all of a user's followers.

    cursor = -1
    followers = []

    while cursor != 0:
        ret = twitter_api.GetFollowers(cursor=cursor)
        followers += ret["users"]
        cursor = ret["next_cursor"]

Original comment by eoghan.gaffney on 18 Dec 2009 at 12:35

GoogleCodeExporter commented 9 years ago
This is a better version of the patch that uses iterators for GetFriends() and
GetFollowers().

    followers = twitter_api.GetFollowers()
    print([f.id for f in followers])

Original comment by sfl...@gmail.com on 5 Jan 2010 at 11:05

GoogleCodeExporter commented 9 years ago
Issue 120 has been merged into this issue.

Original comment by bear42 on 9 Feb 2010 at 6:07

GoogleCodeExporter commented 9 years ago
Issue 143 has been merged into this issue.

Original comment by bear42 on 13 Jun 2010 at 11:42

GoogleCodeExporter commented 9 years ago
For a first pass I'm changing the parameter to "cursor" so that this will now 
function the same but with the new param.  I'm hesitant to make the change to 
an iterator because that is a functional change to the methods' behaviour.

After all this OAuth-induced panic passes I'll move on to working this into a 
proper cursor patch.

Original comment by bear42 on 13 Jun 2010 at 11:58

GoogleCodeExporter commented 9 years ago
I have a few suggestions here. Could you clarify what solution you suggest when 
you hint at a "proper cursor patch"? Just changing the page parameter to 
cursor, or another, more complex change?

On a more general point, I think that it would make more sense to convert the 
methods to generators. It doesn't make much sense to have our users handle that 
"cursor" abstraction. 

If, however, this is not a possibility, even as part of a major version 
change, then I would advocate for two methods, twitter.iterCursor and 
twitter.iterPage, that would wrap cursor-based and page-based methods 
respectively and turn them into generators. Those methods could take handy 
parameters such as itemlimit= and pagelimit=, or you-name-it: one limit for the 
number of requests, another for the number of items returned, with iteration 
stopping at whichever limit is reached first.
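
The wrapper proposed above could be sketched roughly as follows. This is only an illustration under assumptions: `iter_cursor`, `fetch_page`, `item_limit` and `page_limit` are hypothetical names (not part of python-twitter), and `fetch_page(cursor)` is assumed to return the dictionary shape described earlier in this thread, with "users" and "next_cursor" keys.

```python
def iter_cursor(fetch_page, item_limit=None, page_limit=None):
    """Yield items from a cursor-paged call until a limit or the last page.

    fetch_page is a hypothetical callable taking a cursor and returning
    {"users": [...], "next_cursor": <int>}.
    """
    cursor = -1  # -1 requests the first page
    pages = 0
    items = 0
    while cursor != 0:  # a next_cursor of 0 marks the end of the list
        if page_limit is not None and pages >= page_limit:
            return
        if item_limit is not None and items >= item_limit:
            return
        page = fetch_page(cursor)
        pages += 1
        for user in page["users"]:
            if item_limit is not None and items >= item_limit:
                return
            yield user
            items += 1
        cursor = page["next_cursor"]


def fake_fetch(cursor):
    """Stand-in for a patched api.GetFollowers(cursor=...) call."""
    canned = {
        -1: {"users": ["a", "b"], "next_cursor": 7},
        7: {"users": ["c", "d"], "next_cursor": 0},
    }
    return canned[cursor]
```

With this shape, `list(iter_cursor(fake_fetch, item_limit=3))` stops partway through the second page, and `page_limit=1` stops after the first request, whichever limit is hit first.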

While implementing this for page-based methods does not sound difficult, 
implementing it for cursors is not straightforward, because you need to tweak 
the existing methods so that they include request metadata in the returned 
results. For example, instead of returning 
[User.NewFromJsonDict(x) for x in data], one would return 
CursorResult(User.NewFromJsonDict(x) for x in data, cursor), where 
CursorResult is a transparent list-like object containing a _cursor attribute.
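
A minimal sketch of such a list-like result, assuming nothing beyond what is described above: `CursorResult` here is a plain `list` subclass, and `wrap_page` and its `make_item` parameter are hypothetical helpers standing in for the `User.NewFromJsonDict` conversion.

```python
class CursorResult(list):
    """A list that also carries the cursor needed to fetch the next page."""

    def __init__(self, items=(), cursor=0):
        super().__init__(items)
        self._cursor = cursor  # next_cursor reported by the API; 0 means done


def wrap_page(data, make_item=lambda x: x):
    """Build a CursorResult from a parsed JSON page (hypothetical helper).

    data is assumed to look like {"users": [...], "next_cursor": <int>}.
    """
    return CursorResult((make_item(x) for x in data["users"]),
                        data["next_cursor"])
```

Existing callers that only iterate over or index the result would keep working, since the object is still a list; only callers that want to page would look at `_cursor`.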

I know. When I think about the complexity of the changes required if we want to 
ensure backwards-compatibility, my Python zen screams "make it simple, turn the 
method into a generator".

Original comment by nicd...@gmail.com on 14 Jun 2010 at 12:29

GoogleCodeExporter commented 9 years ago
Yea, I guess that was very vague - but I was trying to avoid opening a 
discussion about what "proper" would be here in the Issues and wanted it to be 
on the discussion list.

My main goal for this is to fix some low-hanging bugs, like the 4 changes you 
posted and some others, and then on top of that add OAuth.

*Then* we can work on what would be better methods for handling the new 
complexities that using cursors requires :)

Original comment by bear42 on 14 Jun 2010 at 12:32

GoogleCodeExporter commented 9 years ago
I'm trying to apply this patch but keep getting errors.  Is there a preferred 
patch I should use to obtain the next_cursor in order to pass it back to the 
api.GetFollowers() function?

Thanks.

Original comment by dpao...@gmail.com on 26 Jul 2010 at 9:32

GoogleCodeExporter commented 9 years ago
I patched it manually, I wasn't sure which branch to add the patch to.  *gah*

Original comment by dpao...@gmail.com on 26 Jul 2010 at 10:24

GoogleCodeExporter commented 9 years ago
I am still unable to get the new cursor pagination system to work.

I have the following, as per above:

        cursor = -1
        friends = []
        while cursor != 0:
            ret = api.GetFriends(cursor=cursor)
            friends += ret["users"]
            cursor = ret["next_cursor"]

And when I run it, I get:
TypeError: list indices must be integers, not str

I would love a simple example of how to get a list of all friends/followers 

[using 0.9-devel]

Original comment by gordonbonnar on 2 Nov 2010 at 7:52
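
One plausible reading of this error, assuming the installed GetFriends() is unpatched and still returns a plain list of twitter.User objects: `ret["users"]` then indexes a list with a string, which raises a TypeError of this kind. The sketch below reproduces that and shows a hypothetical helper, `split_page` (not part of python-twitter), that accepts either return shape.

```python
def split_page(ret):
    """Return (users, next_cursor) for either response shape.

    Hypothetical helper, not part of python-twitter.
    """
    if isinstance(ret, dict):
        # Patched method: raw JSON dict that carries cursor metadata.
        return ret["users"], ret["next_cursor"]
    # Unpatched method: a bare list with no cursor, so treat it as the last page.
    return list(ret), 0


def reproduce_error():
    """Index a list with a string, as ret["users"] does on an unpatched method."""
    ret = ["user1", "user2"]  # stand-in for a list of twitter.User objects
    try:
        ret["users"]
    except TypeError as exc:
        return str(exc)
    return None
```

If `split_page` returns a cursor of 0 on the very first call, that is a hint that the method in use does not expose cursor information at all and needs one of the patches discussed in this thread.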

GoogleCodeExporter commented 9 years ago
"I would love a simple example of how to get a list of all friends/followers"

- That's what I do (in my code):

    followers = []
    cursor = -1
    while cursor != 0:
        ret = api.GetFollowers(cursor=cursor)
        #followers += ret["users"]
        followers.extend(twitter.User.NewFromJsonDict(x) for x in ret["users"])
        cursor = ret["next_cursor"]

- And these are the changes done in twitter.py (after doing an hg clone, so I'm 
using version 0.9-devel). The "patches" I've applied are:

line 2666:  def GetFollowers(self, page=None, cursor=-1): #ocelma: added param cursor
line 2685:     parameters['cursor'] = cursor #ocelma
line 2690:     return data #ocelma: just return the JSON, do not create a list of User
Original comment by groovif...@gmail.com on 14 Nov 2010 at 5:32

GoogleCodeExporter commented 9 years ago
Not working?

Sample with my user following 177 people:

    users = api.GetFriends()
    len(users) # Returns 100
    users = api.GetFriends(cursor=0)
    len(users) # Returns 0

I don't see any next_cursor in ret, just a list of twitter.User.

Thanks

Original comment by sebastia...@gmail.com on 7 Jan 2011 at 3:57

GoogleCodeExporter commented 9 years ago
Does anyone have a working patch for this? It's shocking to me how long this 
issue has been around and not resolved.

Original comment by cho...@gmail.com on 24 Feb 2011 at 4:52

GoogleCodeExporter commented 9 years ago
The patches above work, but haven't been integrated. I'm not sure why.

If someone passes in cursor then surely they know enough to want back the 
properly formatted response, including next/previous cursor.

Original comment by andyh...@gmail.com on 9 Apr 2011 at 2:07

GoogleCodeExporter commented 9 years ago
I applied sfl...@gmail.com's patch by hand to the GetFriends() and 
GetFollowers() methods, and both work excellently. I'm surprised the code-base 
hasn't been updated with this fix - it's straight-forward.

Someone with some familiarity with the code should go in and patch each method 
that uses the broken technique (GetFriends(), GetFriendsIds(), GetFollowers(), 
etc.).

Original comment by dwee...@gmail.com on 12 Jan 2012 at 8:37

GoogleCodeExporter commented 9 years ago
I wholeheartedly agree. I've been thinking about how to do this for a while. 
I'm new to python, but I'd love to help out and I'd really like this to get 
fixed because I, in particular, would like to use it. :) I know it would 
probably take someone longer to mentor/coach me on how to actually fix this, 
but I'd be grateful.

In particular, I'm trying to get followers for a celebrity (authenticated) with 
250K followers.

Question about the code, given above:
    followers = []
    cursor = -1
    while cursor != 0:
        ret = api.GetFollowers(cursor=cursor)
        #followers += ret["users"]
        followers.extend(twitter.User.NewFromJsonDict(x) for x in ret["users"])
        cursor = ret["next_cursor"]

Will this code rest appropriately so that I don't run out of API calls? 250,000 
followers means that I'll need to make 50 calls. This shouldn't be an issue, 
and yet I get notified that I've exceeded my rate limit.

Am I better off writing to disk during each iteration of the while loop so as 
to not store this all in memory?

Original comment by gabe.gas...@gmail.com on 31 Jan 2012 at 6:42

GoogleCodeExporter commented 9 years ago
I don't think the code will rest automatically. You can always time.sleep(x) 
between iterations, although writing it to disk is probably smart so a time-out 
won't revert your progress.

I'm getting closer to patching this code myself, although it will be my first 
time contributing to code on Google Code.

Original comment by dwee...@gmail.com on 4 Feb 2012 at 12:15
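
The two suggestions above (sleeping between iterations and writing each page to disk) could be combined along these lines. This is a sketch under assumptions: `fetch_all`, `pages_needed`, `fetch_page` and `pause_seconds` are hypothetical names, the pause has to be chosen from whatever rate limit actually applies, and `fetch_page(cursor)` is assumed to return the raw dictionary with "users" and "next_cursor" keys. On the call count: the 50-call estimate corresponds to 5,000 results per call, while an earlier comment in this thread reports 100 users per GetFriends() call; if GetFollowers() pages are the same size, 250,000 followers would take 2,500 calls.

```python
import json
import math
import time


def pages_needed(total_users, page_size):
    """Number of requests needed to page through total_users."""
    return math.ceil(total_users / page_size)


def fetch_all(fetch_page, out_path, pause_seconds, sleep=time.sleep):
    """Follow cursors to the end, appending each page to out_path as JSON lines.

    Returns the number of requests made. Progress already written to disk
    survives a later failure, because each page is flushed as it arrives.
    """
    cursor = -1  # -1 requests the first page
    calls = 0
    with open(out_path, "a") as out:
        while cursor != 0:  # a next_cursor of 0 marks the end
            if calls:
                sleep(pause_seconds)  # space requests out to respect the limit
            page = fetch_page(cursor)
            calls += 1
            record = {"cursor": cursor, "users": page["users"]}
            out.write(json.dumps(record) + "\n")
            out.flush()
            cursor = page["next_cursor"]
    return calls
```

Because each line records the cursor it was fetched with, an interrupted run could in principle be resumed from the last cursor on disk rather than from -1.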

GoogleCodeExporter commented 9 years ago
I made minor changes in "GetFriends" and "GetFollowers" in twitter.py and the 
example code, below, is working properly now: 

    cursor = -1
    followers = []
    while cursor != 0:
        ret = twitter_api.GetFollowers(cursor=cursor)
        followers += ret["users"]
        cursor = ret["next_cursor"]

Changes:

    def GetFriends(self, user=None, cursor=-1):
        ...
        # last line: return the parsed JSON instead of a list of User
        # return [User.NewFromJsonDict(x) for x in data['users']]
        return data

GetFollowers was changed completely, as below:

  def GetFollowers(self, cursor=-1):
    '''Fetch one page of followers as parsed JSON.

    The twitter.Api instance must be authenticated.

    Args:
      cursor:
        The cursor of the page to retrieve; -1 requests the first page.
        [Optional]

    Returns:
      The parsed JSON response: a dict with "users", "next_cursor" and
      "previous_cursor" keys.
    '''
    if not self._oauth_consumer:
      raise TwitterError("twitter.Api instance must be authenticated")
    url = '%s/statuses/followers.json' % self.base_url

    parameters = {}
    parameters['cursor'] = cursor
    json = self._FetchUrl(url, parameters=parameters)
    data = self._ParseAndCheckTwitter(json)
    return data

It would be worth updating the code-base with this, though!

Original comment by mehdy....@gmail.com on 20 Nov 2012 at 9:18

GoogleCodeExporter commented 9 years ago
Hi python-twitter community,

api.GetFollowers() works without an error (rate limiting, code 88), whereas 
api.GetFriends() throws an error.

I am using the example code from the thread.

    cursor = -1
    followers = []
    while cursor != 0:
        ret = api.GetFriends(screen_name = 'username', cursor=cursor)
        followers += ret["users"]
        cursor = ret["next_cursor"]

Could anyone please confirm that I am not doing anything wrong?

Original comment by viv...@gmail.com on 1 Sep 2014 at 6:14