DocNow / twarc

A command line tool (and Python library) for archiving Twitter JSON
https://twarc-project.readthedocs.io
MIT License

list_members #258

Closed avtaylor closed 6 years ago

avtaylor commented 6 years ago

I have the following issues: 1) When I click to download the twarc zip, I get a different set of files than what I see in the GitHub directory. 2) I wanted to test list_members, but it is not in the version that I downloaded. 3) Also, in the README I see that I can use "twarc listmembers https://twitter.com/edsu/lists/bots", but this does not work when I install twarc, either using pip or by a straight download of the archive followed by setup.

Please can you make it easier for newcomers to join you? This is a mad situation. Sorry to say this :)

avtaylor commented 6 years ago

OK, so I found an issue. Again, this is linked to the two versions, Python 3 and Python 2. I created another environment, this time one that runs Python 3.5, and installed twarc using pip. And now list_members works!

So this indicates to me that there are differences between the two versions, but point 1) from my previous comment above is still very puzzling.
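When two environments behave differently like this, it can help to check which copy of the package each interpreter actually imports and whether the method is present in it. The following is only a minimal sketch, not part of twarc: `describe_install` is a hypothetical helper, and it assumes the client class is exposed as `twarc.Twarc`.

```python
import importlib
import sys


def describe_install(module_name, class_name, method_name):
    """Report where a module is imported from and whether a class method exists.

    Hypothetical helper for illustration; it only inspects the import,
    it does not call the Twitter API.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return {"installed": False, "path": None, "has_method": False}
    cls = getattr(module, class_name, None)
    return {
        "installed": True,
        # Path of the imported package, useful for spotting a stale copy.
        "path": getattr(module, "__file__", None),
        "has_method": cls is not None and hasattr(cls, method_name),
    }


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    # Assumption: the client class is available as twarc.Twarc.
    print(describe_install("twarc", "Twarc", "list_members"))
```

Running this once in each environment shows whether both interpreters see the same installation.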

hugovk commented 6 years ago

Where exactly did you download the zip from?

avtaylor commented 6 years ago

From the download button on the GitHub Code tab.

hugovk commented 6 years ago

The git cloned version and the one downloaded via "Download ZIP" from https://github.com/DocNow/twarc are identical, except the cloned one has a .git directory:

[hugo:/tmp] % git clone https://github.com/DocNow/twarc
Cloning into 'twarc'...
remote: Enumerating objects: 3063, done.
remote: Total 3063 (delta 0), reused 0 (delta 0), pack-reused 3063
Receiving objects: 100% (3063/3063), 785.79 KiB | 73.00 KiB/s, done.
Resolving deltas: 100% (1930/1930), done.
Checking out files: 100% (59/59), done.
⌂61% [hugo:/tmp] 1m6s % wget https://github.com/DocNow/twarc/archive/master.zip # same as "Download ZIP" from https://github.com/DocNow/twarc
--2018-11-03 13:22:49--  https://github.com/DocNow/twarc/archive/master.zip
Resolving github.com (github.com)... 192.30.253.112, 192.30.253.113
Connecting to github.com (github.com)|192.30.253.112|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/DocNow/twarc/zip/master [following]
--2018-11-03 13:22:52--  https://codeload.github.com/DocNow/twarc/zip/master
Resolving codeload.github.com (codeload.github.com)... 192.30.253.121, 192.30.253.120
Connecting to codeload.github.com (codeload.github.com)|192.30.253.121|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/zip]
Saving to: ‘master.zip’

master.zip                         [         <=>                                         ]  86.55K  30.0KB/s    in 2.9s

2018-11-03 13:22:57 (30.0 KB/s) - ‘master.zip’ saved [88630]

[hugo:/tmp] 18s % unzip master.zip
Archive:  master.zip
05c70ebfe9849d6bd0d0ceb6e594a52b5b260e48
   creating: twarc-master/
  inflating: twarc-master/.gitignore
  inflating: twarc-master/.travis.yml
  inflating: twarc-master/LICENSE
  inflating: twarc-master/MANIFEST.in
  inflating: twarc-master/README.md
  inflating: twarc-master/README_es_mx.md
  inflating: twarc-master/README_pt_br.md
  inflating: twarc-master/README_sv_se.md
  inflating: twarc-master/README_sw_ke.md
   creating: twarc-master/requirements/
 extracting: twarc-master/requirements/python2.txt
 extracting: twarc-master/requirements/python3.txt
  inflating: twarc-master/setup.cfg
  inflating: twarc-master/setup.py
  inflating: twarc-master/test_twarc.py
   creating: twarc-master/twarc/
  inflating: twarc-master/twarc/__init__.py
  inflating: twarc-master/twarc/client.py
  inflating: twarc-master/twarc/command.py
  inflating: twarc-master/twarc/decorators.py
  inflating: twarc-master/twarc/json2csv.py
   creating: twarc-master/utils/
  inflating: twarc-master/utils/deduplicate.py
  inflating: twarc-master/utils/deletes.py
  inflating: twarc-master/utils/discover_ids.py
  inflating: twarc-master/utils/embeds.py
  inflating: twarc-master/utils/emoji.py
  inflating: twarc-master/utils/extractor.py
  inflating: twarc-master/utils/filter_date.py
  inflating: twarc-master/utils/filter_users.py
  inflating: twarc-master/utils/foaf.py
  inflating: twarc-master/utils/gender.py
  inflating: twarc-master/utils/geo.py
  inflating: twarc-master/utils/geofilter.py
  inflating: twarc-master/utils/geojson.py
  inflating: twarc-master/utils/json2csv.py
  inflating: twarc-master/utils/media2warc.py
  inflating: twarc-master/utils/media_urls.py
  inflating: twarc-master/utils/network.py
  inflating: twarc-master/utils/noretweets.py
  inflating: twarc-master/utils/remove_limit.py
  inflating: twarc-master/utils/retweets.py
  inflating: twarc-master/utils/search.py
  inflating: twarc-master/utils/sensitive.py
  inflating: twarc-master/utils/sort_by_id.py
  inflating: twarc-master/utils/source.py
  inflating: twarc-master/utils/tags.py
  inflating: twarc-master/utils/times.py
  inflating: twarc-master/utils/twarc-archive.py
  inflating: twarc-master/utils/tweet.py
  inflating: twarc-master/utils/tweet_compliance.py
  inflating: twarc-master/utils/tweet_text.py
  inflating: twarc-master/utils/tweet_urls.py
  inflating: twarc-master/utils/tweetometer.py
  inflating: twarc-master/utils/tweets.py
  inflating: twarc-master/utils/unshrtn.py
  inflating: twarc-master/utils/urls.py
  inflating: twarc-master/utils/users.py
  inflating: twarc-master/utils/validate.py
  inflating: twarc-master/utils/wall.py
  inflating: twarc-master/utils/webarchives.py
  inflating: twarc-master/utils/wordcloud.py
[hugo:/tmp] 12s % diff -r twarc twarc-master
Only in twarc: .git
⌂78% [hugo:/tmp] 7s 1 % grep list_members -r twarc
twarc/test_twarc.py:    members = list(T.list_members(slug=slug, owner_screen_name=screen_name))
twarc/test_twarc.py:    members = list(T.list_members(slug=slug, owner_id=owner_id))
twarc/test_twarc.py:    members = list(T.list_members(list_id='197880909'))
twarc/twarc/client.py:    def list_members(self, list_id=None, slug=None, owner_screen_name=None, owner_id=None):
twarc/twarc/command.py:        things = t.list_members(slug=list_parts.group(2),
[hugo:/tmp] 5s % grep list_members -r twarc-master
twarc-master/test_twarc.py:    members = list(T.list_members(slug=slug, owner_screen_name=screen_name))
twarc-master/test_twarc.py:    members = list(T.list_members(slug=slug, owner_id=owner_id))
twarc-master/test_twarc.py:    members = list(T.list_members(list_id='197880909'))
twarc-master/twarc/client.py:    def list_members(self, list_id=None, slug=None, owner_screen_name=None, owner_id=None):
twarc-master/twarc/command.py:        things = t.list_members(slug=list_parts.group(2),
[hugo:/tmp] 2s %
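For anyone without `diff` available (on Windows, for example), the same recursive comparison can be approximated with Python's standard `filecmp` module. This is a rough sketch rather than an exact equivalent of `diff -r`: `tree_differences` is a hypothetical helper name, and `filecmp.dircmp` compares files shallowly (by type, size and modification time) before falling back to contents.

```python
import filecmp
import os


def tree_differences(left, right):
    """Recursively list entries found on one side only, or files that differ."""
    differences = []

    def walk(cmp, prefix):
        for name in cmp.left_only:
            differences.append(("only in left", os.path.join(prefix, name)))
        for name in cmp.right_only:
            differences.append(("only in right", os.path.join(prefix, name)))
        for name in cmp.diff_files:
            differences.append(("differs", os.path.join(prefix, name)))
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    # ignore=[] because dircmp skips ".git" and a few other names by default.
    walk(filecmp.dircmp(left, right, ignore=[]), "")
    return sorted(differences)


if __name__ == "__main__":
    # Directory names taken from the session above.
    if os.path.isdir("twarc") and os.path.isdir("twarc-master"):
        for kind, path in tree_differences("twarc", "twarc-master"):
            print(kind, path)
```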

hugovk commented 6 years ago

The easiest way for newcomers, and in fact for everyone, to use twarc is with pip install twarc

Or to make sure you have the latest version, pip install --upgrade twarc

https://github.com/DocNow/twarc#install
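To confirm which version pip actually installed, the version can be read from the package metadata. A small sketch, assuming Python 3.8 or newer (where `importlib.metadata` is in the standard library); `installed_version` is a hypothetical helper name, not something twarc provides.

```python
from importlib import metadata  # standard library from Python 3.8 onwards


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints None if twarc is not installed in the current environment.
    print("twarc:", installed_version("twarc"))
```

Comparing the printed version with the latest release on PyPI shows whether `pip install --upgrade twarc` is needed.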

edsu commented 6 years ago

@avtaylor it does seem like you somehow downloaded an old version of twarc. If you can retrace your steps and let us know how this happened, so we can prevent it in the future, that would be super helpful. Thanks @hugovk for investigating the problem.

avtaylor commented 6 years ago

Hi @edsu, OK, I think I solved the mystery :) So sorry for the mad approach... I just felt frustrated as I could not see what was happening.

I have now retraced my steps: I downloaded my first copy of twarc by clicking the download link on https://github.com/edsu/twarc, which appears in some online tutorials such as https://shawngraham.gitbooks.io/dh-workbook/content/docs/supporting%20materials/twarc.html

Of course, after that I somehow went to this more current repository and did not even notice that I was looking at different versions, as I only used the older GitHub page once.

Again apologies for the wild statements in the issue I raised...

hugovk commented 6 years ago

Thanks for sharing the link to the tutorial! I can now see what happened. The link there points to the old, main repo at https://github.com/edsu/twarc

But now the main repo has moved here to https://github.com/DocNow/twarc and the old one is out of date.

One thing we could do right now is for @edsu to update his fork to match this main repo. That's probably a bit boring and won't be much fun to do all the time.

Alternatively, perhaps the master branch at edsu/twarc could be a redirect or notice of some sort pointing to the new home. That way he can still use his fork for feature branches and so on, and people going to the old home won't get 404s.

I've created a PR in what I think is the right place to fix the links in that tutorial: https://github.com/shawngraham/hist3907o/pull/5

If you know any others still pointing to the old one, please ask the maintainers to update their links, or let us know here.

We could possibly go hunting for old links as well, or @edsu can check https://github.com/edsu/twarc/graphs/traffic to see which sites are linking to the old home.
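Hunting for stale links in a tutorial or README could be done with a short script along these lines. This is just a sketch under stated assumptions: `find_old_links` and `rewrite_old_links` are hypothetical helpers, and the pattern only targets the old `edsu/twarc` address mentioned above.

```python
import re

# Matches the old repository URL, but not longer repo names that merely
# start with "twarc" (the lookahead rejects a following word char or hyphen).
OLD_REPO = re.compile(r"https?://github\.com/edsu/twarc(?![\w-])", re.IGNORECASE)
NEW_REPO = "https://github.com/DocNow/twarc"


def find_old_links(text):
    """Return (line_number, line) pairs for lines that mention the old repo."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), 1)
        if OLD_REPO.search(line)
    ]


def rewrite_old_links(text):
    """Return the text with old repo URLs pointed at the new home."""
    return OLD_REPO.sub(NEW_REPO, text)


if __name__ == "__main__":
    sample = "Download twarc from https://github.com/edsu/twarc and unzip it.\n"
    print(find_old_links(sample))
    print(rewrite_old_links(sample), end="")
```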

Thanks!

edsu commented 6 years ago

I went ahead and deleted edsu/twarc and (thankfully) GitHub redirects requests for it to docnow/twarc.

@avtaylor thanks for your patience in figuring out what went wrong -- and @hugovk as always, your help maintaining this project is greatly appreciated. Some day I need to buy you a :beer: or a :coffee:.