Closed dpiponi closed 13 years ago
Thanks for noticing this. It looks like you are indeed correct. I think this slipped through during a final pass of refactoring. Fortunately, it didn't make it into print (twitter__util.py isn't in print). I'll make a note to fix this in the next day or so. In the meantime, feel free to send me a pull request if you've already fixed it and made any other improvements. Thanks again.
Just getting around to cleaning house. Sorry for the delay. After looking at the issue more closely, I don't think there is actually a bug in lines 104-107 of twitter__util.py after all. Lines 104-107 follow for convenience:
if sample < 1.0:
    for lst in [screen_names, user_ids]:
        shuffle(lst)
        lst = lst[:int(len(lst) * sample)]
What's happening is that lst here is a reference to screen_names and then to user_ids, so the shuffle() and assignment operations performed on lst in the body of the loop are passed through to screen_names and user_ids. You can check this for yourself in the interpreter with a similar test case and see that the changes to lst pass through via the references:
>>> a = []
>>> b = []
>>> for lst in [a,b]:
...     lst.append('foo')
...     lst = lst[:]
...
>>> a
['foo']
>>> b
['foo']
>>>
In twitter__util.py, line 107, the code
    lst = lst[:int(len(lst) * sample)]
fails to trim the lists as intended because it's assigning the trimmed list to lst, not screen_names or user_ids.
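For anyone who wants to see the difference, here is a minimal sketch of one possible fix. It is not the code in twitter__util.py: the helper name sample_in_place and the sample data are made up for illustration. It relies on slice assignment (lst[:] = ...), which mutates the existing list object instead of rebinding the loop variable.

```python
from random import shuffle


def sample_in_place(lists, sample):
    """Shuffle each list and trim it in place to the given fraction.

    Hypothetical helper for illustration; not part of twitter__util.py.
    """
    if sample < 1.0:
        for lst in lists:
            shuffle(lst)  # shuffle() mutates the list object itself
            # Slice assignment also mutates the same object, so every
            # name bound to it (screen_names, user_ids) sees the trim.
            lst[:] = lst[:int(len(lst) * sample)]


# Made-up sample data
screen_names = ["a", "b", "c", "d"]
user_ids = [1, 2, 3, 4]

# Plain assignment only rebinds the loop variable to a new list;
# the original lists keep their full length.
for lst in [screen_names, user_ids]:
    lst = lst[:int(len(lst) * 0.5)]
print(len(screen_names), len(user_ids))  # 4 4

# The in-place version trims the original lists.
sample_in_place([screen_names, user_ids], 0.5)
print(len(screen_names), len(user_ids))  # 2 2
```

An alternative with the same effect would be del lst[int(len(lst) * sample):], which also modifies the list in place.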