shaikhsajid1111 / twitter-scraper-selenium

Python's package to scrap Twitter's front-end easily
https://pypi.org/project/twitter-scraper-selenium
MIT License

Install help: selenium.common.exceptions.SessionNotCreatedException #67

Closed by liminalpepe 1 year ago

liminalpepe commented 1 year ago

Hi there team, thanks for this amazing lib. I've used it a few times already with no problems, but today I tried to run it again and got stuck on a geckodriver error.

...
selenium.common.exceptions.SessionNotCreatedException: Message: Expected browser binary location, but unable to find binary in default location, no 'moz:firefoxOptions.binary' capability provided, and no binary flag set on the command line
...

I created a Docker image to get a fresh install and make sure it's not my setup. I normally run it on macOS and get the same error there.

Here are the files and how to reproduce

Dockerfile

# Use the official Python 3.9.16 image as the base image
FROM python:3.9.16

# Set the working directory to /app
WORKDIR /app

# Install the necessary dependencies
RUN pip install twitter-scraper-selenium

# Set up the shared volume
VOLUME ["/app"]

# Set the default command to run your script
CMD [ "python", "scrapper.py" ]

scrapper.py

from twitter_scraper_selenium import scrape_profile

microsoft = scrape_profile(twitter_username="microsoft",output_format="json",browser="firefox",tweets_count=10)
print(microsoft)

To run, with Docker installed and set up:

docker build -t twitter-scraper .
docker run -v $(pwd):/app twitter-scraper

The error below happens on docker run, and I couldn't find anything useful online to help me fix it. Can you help me understand what is happening? It seems to be a geckodriver problem rather than a twitter-scraper-selenium one, but I'm not sure where else to look.

Full logs below:

[WDM] - There is no [linux64] geckodriver for browser  in cache
[WDM] - Getting latest mozilla release info for v0.33.0
[WDM] - Trying to download new driver from https://github.com/mozilla/geckodriver/releases/download/v0.33.0/geckodriver-v0.33.0-linux64.tar.gz
[WDM] - Driver has been saved in cache [/root/.wdm/drivers/geckodriver/linux64/v0.33.0]
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/twitter_scraper_selenium/profile.py", line 118, in scrap
    self.__start_driver()
  File "/usr/local/lib/python3.9/site-packages/twitter_scraper_selenium/profile.py", line 39, in __start_driver
    self.__driver = Initializer(
  File "/usr/local/lib/python3.9/site-packages/twitter_scraper_selenium/driver_initialization.py", line 104, in init
    driver = self.set_driver_for_browser(self.browser_name)
  File "/usr/local/lib/python3.9/site-packages/twitter_scraper_selenium/driver_initialization.py", line 97, in set_driver_for_browser
    return webdriver.Firefox(service=FirefoxService(executable_path=GeckoDriverManager().install()), options=self.set_properties(browser_option))
  File "/usr/local/lib/python3.9/site-packages/seleniumwire/webdriver.py", line 179, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/firefox/webdriver.py", line 197, in __init__
    super().__init__(command_executor=executor, options=options, keep_alive=True)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 288, in __init__
    self.start_session(capabilities, browser_profile)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 381, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 444, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 249, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: Expected browser binary location, but unable to find binary in default location, no 'moz:firefoxOptions.binary' capability provided, and no binary flag set on the command line

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/scrapper.py", line 21, in <module>
    microsoft = scrape_profile(twitter_username="microsoft",output_format="json",browser="firefox",tweets_count=10)
  File "/usr/local/lib/python3.9/site-packages/twitter_scraper_selenium/profile.py", line 197, in scrape_profile
    data = profile_bot.scrap()
  File "/usr/local/lib/python3.9/site-packages/twitter_scraper_selenium/profile.py", line 128, in scrap
    self.__close_driver()
  File "/usr/local/lib/python3.9/site-packages/twitter_scraper_selenium/profile.py", line 43, in __close_driver
    self.__driver.close()
AttributeError: 'str' object has no attribute 'close'
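
The log above contains two chained tracebacks: the driver fails to start, and the cleanup that runs while handling that failure then calls close() on what is still a plain string. The sketch below reproduces that pattern in isolation so the root cause is easier to spot. ToyScraper, its methods, and root_cause are hypothetical names made up for illustration; this is not the library's actual code, only a guess at the shape implied by the traceback.

```python
class ToyScraper:
    """Hypothetical stand-in that mimics the failure pattern in the log."""

    def __init__(self):
        # Assumption: the driver attribute holds a placeholder string
        # until a real WebDriver has been created.
        self.driver = ""

    def start_driver(self):
        # Stands in for the SessionNotCreatedException raised by Selenium.
        raise RuntimeError("browser binary not found")

    def close_driver(self):
        # Fails with AttributeError while self.driver is still a str.
        self.driver.close()

    def scrap(self):
        try:
            self.start_driver()
        except Exception:
            # Cleanup runs while the first error is being handled, so a
            # second exception raised here is chained onto the first.
            self.close_driver()


def root_cause(exc):
    """Follow implicit exception chaining back to the first error."""
    while exc.__context__ is not None:
        exc = exc.__context__
    return exc


if __name__ == "__main__":
    try:
        ToyScraper().scrap()
    except AttributeError as err:
        print("last error: ", repr(err))
        print("root cause: ", repr(root_cause(err)))
```

When reading a chained traceback like this one, the first exception (here the SessionNotCreatedException) is usually the one worth debugging. The final AttributeError is only a side effect of the cleanup path.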

Appreciate any help

liminalpepe commented 1 year ago

Never mind, I forgot to install Chrome or Firefox (the default) LOL

Here is the updated Dockerfile for Chrome in case anybody else needs it:

# Use the official Python 3.9.16 image as the base image
FROM python:3.9.16

# Install Chrome and necessary dependencies
RUN apt-get update && apt-get install -y wget gnupg2
RUN wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | apt-key add -
RUN echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list
RUN apt-get update && apt-get install -y google-chrome-stable

# Install the necessary Python libraries
RUN pip install twitter-scraper-selenium

# Set the working directory and the shared volume
WORKDIR /app
VOLUME ["/app"]

# Set the default command to run your script
CMD [ "python", "scrapper.py" ]
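
Note that the Dockerfile above installs Chrome, while the scrapper.py from the first comment passes browser="firefox". A minimal sketch of a pre-flight check that picks whichever browser is actually present before calling scrape_profile could look like this. The helper names pick_browser and scrape, and the executable names in CANDIDATES, are assumptions chosen for illustration (they reflect common Linux package layouts and are not something the library documents). Only scrape_profile and its arguments come from the script above.

```python
import shutil

# Executable names to look for on PATH. These are assumptions about
# common Linux packages, not values taken from the library.
CANDIDATES = {
    "chrome": ("google-chrome", "google-chrome-stable"),
    "firefox": ("firefox", "firefox-esr"),
}


def pick_browser(which=shutil.which):
    """Return "chrome" or "firefox", whichever has a binary on PATH."""
    for browser, names in CANDIDATES.items():
        if any(which(name) for name in names):
            return browser
    raise RuntimeError(
        "No Chrome or Firefox binary found on PATH; "
        "install one in the image first."
    )


def scrape(username, tweets_count=10):
    """Scrape a profile with whichever supported browser is installed."""
    # Imported lazily so pick_browser() can be used without the package.
    from twitter_scraper_selenium import scrape_profile

    return scrape_profile(
        twitter_username=username,
        output_format="json",
        browser=pick_browser(),
        tweets_count=tweets_count,
    )
```

Calling scrape("microsoft") inside the container would then fail early with a readable message if no browser is installed, instead of surfacing the SessionNotCreatedException from deep inside Selenium.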

Thanks!