ChrisStevens / garc

Python library and command line tool for collecting JSON data from Gab.ai. Scrape posts, users and comments from "free-speech" social media platform Gab.
MIT License
35 stars, 15 forks

Index Error: while using simple call "garc search maga" #9

Closed: alpesh12345 closed 2 years ago

alpesh12345 commented 3 years ago

It was working last week.

Traceback (most recent call last):
/venv/bin/garc", line 33, in <module>
    sys.exit(load_entry_point('garc==2.0', 'console_scripts', 'garc')())
/venv/lib/python3.6/site-packages/garc/command.py", line 118, in main
    for thing in things:
/venv/lib/python3.6/site-packages/garc/client.py", line 66, in search
    resp = self.get(url)
/venv/lib/python3.6/site-packages/garc/client.py", line 261, in get
    self.login()
/venv/lib/python3.6/site-packages/garc/client.py", line 216, in login
    token = page_info.select('meta[name=csrf-token]')[0]['content']
IndexError: list index out of range

paulfurber commented 3 years ago

Looks like Gab is now returning a page without the csrf token when called by a script. It still works when you do it manually with a browser. I've fixed this by using the fake_useragent package but now API calls are failing for some other reason.
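
The IndexError in the traceback comes from taking element [0] of an empty selector result when the page has no csrf-token meta tag. A minimal sketch of a more defensive lookup is below. It uses only the standard library's html.parser rather than the BeautifulSoup call in garc, and the names CsrfTokenParser and extract_csrf_token are hypothetical, not part of garc.

```python
from html.parser import HTMLParser


class CsrfTokenParser(HTMLParser):
    """Hypothetical helper: remembers the content of <meta name="csrf-token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name") == "csrf-token":
            # Keep only the first token found on the page.
            if self.token is None:
                self.token = attributes.get("content")


def extract_csrf_token(html):
    """Return the CSRF token, or raise a descriptive error instead of IndexError."""
    parser = CsrfTokenParser()
    parser.feed(html)
    if parser.token is None:
        raise RuntimeError(
            "No csrf-token meta tag in the response; "
            "the site may be rejecting requests without a user agent."
        )
    return parser.token
```

With a check like this, a missing token would surface as a readable message rather than "list index out of range".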

ChrisStevens commented 3 years ago

@paulfurber is right: they have stopped returning results for requests with no user agent.

I've now added a default user agent of 'garc' to every call, which should get all of your API calls working again.
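
As a rough illustration of the idea (not garc's actual code, which goes through requests), this standard-library sketch attaches a default User-Agent of 'garc' to a request unless the caller supplies one. The function name build_request and the example URL are assumptions made for illustration only.

```python
import urllib.request

# Default taken from the comment above; everything else here is illustrative.
DEFAULT_USER_AGENT = "garc"


def build_request(url, user_agent=None):
    """Build a request that always carries a User-Agent header."""
    headers = {"User-Agent": user_agent or DEFAULT_USER_AGENT}
    return urllib.request.Request(url, headers=headers)


# No network call is made here; the request object is only constructed.
request = build_request("https://example.com/")
# urllib stores header names capitalized, hence "User-agent".
print(request.get_header("User-agent"))
```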

I also added the ability to supply your own user agent, if you wish, by running "garc user_agent" and following the prompts it gives.

This is not uploaded to PyPI yet, so it will only work if installed from the repo here.

Please let me know if this isn't working for you or you're running into any other issue.

paulfurber commented 3 years ago

Thank you Chris - it's working now. It looks like the usercomments and userposts API calls also need the headers to be supplied. I've created a pull request with these changes.

ChrisStevens commented 3 years ago

@paulfurber are you having issues with those calls? They seem to be working for me. They shouldn't need the headers added where you did: they use the Garc.get function, which itself grabs the header and supplies it when it makes the requests.get call, so it doesn't need to be given any headers as args. Does that make sense?
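
The pattern described here can be sketched roughly as follows. SketchClient and fake_transport are hypothetical stand-ins, not garc's real class or requests itself. The point is only that a single get wrapper attaches the headers, so methods built on it need no headers argument.

```python
class SketchClient:
    """Hypothetical stand-in for the wrapper pattern; not garc's real class."""

    def __init__(self, transport, user_agent="garc"):
        # transport is any callable taking (url, headers=...) and returning a response.
        self.transport = transport
        self.headers = {"User-Agent": user_agent}

    def get(self, url):
        # The single place where headers are attached to outgoing requests.
        return self.transport(url, headers=self.headers)

    def userposts(self, account_url):
        # No headers argument needed: self.get supplies them.
        return self.get(account_url)


seen = []


def fake_transport(url, headers):
    """Records what it was called with instead of touching the network."""
    seen.append((url, headers))
    return {"id": 1}


client = SketchClient(fake_transport)
client.userposts("https://example.com/account")
```

After the call, seen holds the URL together with the User-Agent header, even though userposts never mentioned headers.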

paulfurber commented 3 years ago

Ah, I see it's self.get and not requests.get. This is probably a version issue on my side - I've been fiddling with this all day and I think I was running old code from a virtualenv directory.
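
One way to check for this kind of stale-install mix-up is to ask Python which file it would import a package from. This is a small sketch using the standard library's importlib; the helper name module_origin is hypothetical.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if unknown."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# Prints the path of the garc copy this interpreter would use, or None if not installed.
print(module_origin("garc"))
```

If the printed path points into an old virtualenv directory, the running code is probably not the latest commit.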

MaxSoar commented 3 years ago

Hi! I'm getting the same issue as alpesh12345 above when I try to use the "userposts" and "search" commands, even after updating to the latest commit. Any advice would be much appreciated, but I acknowledge I'm not the most experienced using such tools.

Traceback (most recent call last):
  File "C:\Dev\Python39\Scripts\garc-script.py", line 33, in <module>
    sys.exit(load_entry_point('garc==2.0', 'console_scripts', 'garc')())
  File "c:\dev\python39\lib\site-packages\garc\command.py", line 118, in main
    for thing in things:
  File "c:\dev\python39\lib\site-packages\garc\client.py", line 160, in userposts
    account_id = self.get(account_url).json()['id']
  File "c:\dev\python39\lib\site-packages\garc\client.py", line 261, in get
    self.login()
  File "c:\dev\python39\lib\site-packages\garc\client.py", line 216, in login
    token = page_info.select('meta[name=csrf-token]')[0]['content']
IndexError: list index out of range

alpesh12345 commented 3 years ago

Are you using the newest release, installed like this: pip install git+git://github.com/ChrisStevens/garc.git? It's working for me this way.

MaxSoar commented 3 years ago

I believed I was. But it seems after a good night's sleep and a complete reinstall it's back up and running. Thanks for your help.

alpesh12345 commented 2 years ago

closed