alecxe / scrapy-fake-useragent

Random User-Agent middleware based on fake-useragent
MIT License

FakerProvider not working #36

Open axiangcoding opened 2 years ago

axiangcoding commented 2 years ago

Similar to #30, but I am using the latest version of scrapy-fake-useragent, 1.4.4.

Here is my settings.py:

DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    'scrapy.downloadermiddlewares.retry.RetryMiddleware': None,
    'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
    'scrapy_fake_useragent.middleware.RetryUserAgentMiddleware': 401,
}

# When I comment FAKEUSERAGENT_PROVIDERS out, everything seems to work fine
FAKEUSERAGENT_PROVIDERS = [
    "scrapy_fake_useragent.providers.FakerProvider",
    "scrapy_fake_useragent.providers.FakeUserAgentProvider",
    "scrapy_fake_useragent.providers.FixedUserAgentProvider",
]

Using: Python 3.10.0, Scrapy 2.5.1, faker 11.1.0, scrapy-fake-useragent 1.4.4

axiangcoding commented 2 years ago

Here is my console log:

2022-01-05 09:55:45 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.feedexport.FeedExporter',
 'scrapy.extensions.logstats.LogStats']
2022-01-05 09:55:45 [scrapy_fake_useragent.middleware] INFO: Error loading User-Agent provider: ['scrapy_fake_useragent.providers.FakerProvider', 'scrapy_fake_useragent.providers.FakeUserAgentProvider', 'scrapy_fake_useragent.providers.FixedUserAgentProvider']
2022-01-05 09:55:45 [scrapy_fake_useragent.middleware] INFO: Unable to load any of the User-Agent providers
2022-01-05 09:55:45 [faker.factory] DEBUG: Not in REPL -> leaving logger event level as is.
2022-01-05 09:55:45 [scrapy_fake_useragent.middleware] INFO: Using '<class 'scrapy_fake_useragent.providers.FixedUserAgentProvider'>' as the User-Agent provider
2022-01-05 09:55:45 [scrapy_fake_useragent.middleware] INFO: Error loading User-Agent provider: ['scrapy_fake_useragent.providers.FakerProvider', 'scrapy_fake_useragent.providers.FakeUserAgentProvider', 'scrapy_fake_useragent.providers.FixedUserAgentProvider']
2022-01-05 09:55:45 [scrapy_fake_useragent.middleware] INFO: Unable to load any of the User-Agent providers
2022-01-05 09:55:45 [scrapy_fake_useragent.middleware] INFO: Using '<class 'scrapy_fake_useragent.providers.FixedUserAgentProvider'>' as the User-Agent provider
2022-01-05 09:55:45 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware',
 'scrapy_fake_useragent.middleware.RetryUserAgentMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-01-05 09:55:45 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-01-05 09:55:45 [scrapy.middleware] INFO: Enabled item pipelines:
['crawler.pipelines.CrawlerPipeline']
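
The "Error loading User-Agent provider" lines in the log do not include the underlying exception. One way to surface it is to import each provider path by hand, outside Scrapy. This is only a rough sketch: `try_load` and `PROVIDERS` are hypothetical names for illustration, not part of scrapy-fake-useragent, and the check covers importing the class only, so an error raised while constructing the provider would not show up here.

```python
import importlib


def try_load(dotted_path):
    """Import 'package.module.Name'; return (object, None) or (None, error)."""
    module_path, _, name = dotted_path.rpartition(".")
    try:
        module = importlib.import_module(module_path)
        return getattr(module, name), None
    except Exception as exc:  # keep the real error instead of hiding it
        return None, exc


# Provider paths copied from the settings in this issue.
PROVIDERS = [
    "scrapy_fake_useragent.providers.FakerProvider",
    "scrapy_fake_useragent.providers.FakeUserAgentProvider",
    "scrapy_fake_useragent.providers.FixedUserAgentProvider",
]

if __name__ == "__main__":
    for path in PROVIDERS:
        obj, err = try_load(path)
        print(path, "->", "OK" if err is None else repr(err))
```

Run it with the same interpreter and virtual environment the crawler uses, otherwise the result may not reflect what Scrapy sees.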

devfox-se commented 1 year ago

I was struggling with this as well. Read the docs: https://docs.scrapy.org/en/latest/topics/settings.html#downloader-middlewares-base

scrapy-fake-useragent has an outdated order for how you should set up your DOWNLOADER_MIDDLEWARES.

scrapy.downloadermiddlewares.useragent.UserAgentMiddleware is now ordered at 500, so you should update your order to use that slot rather than 400 as you have it now.

DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    "scrapy.downloadermiddlewares.retry.RetryMiddleware": None,
    "scrapy_fake_useragent.middleware.RandomUserAgentMiddleware": 500,
    "scrapy_fake_useragent.middleware.RetryUserAgentMiddleware": 501,
}
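
To make the effect of the order values concrete, here is a minimal sketch that merges a base mapping with the custom one, drops entries set to None, and sorts by order value. It does not import Scrapy: `BASE` is a hand-copied subset of DOWNLOADER_MIDDLEWARES_BASE from the settings docs linked above (an assumption to verify against your Scrapy version), and `effective_order` is a hypothetical helper, not a Scrapy function.

```python
# Subset of Scrapy's DOWNLOADER_MIDDLEWARES_BASE, copied by hand from the
# settings docs; treat the numbers as an assumption and check your version.
BASE = {
    "scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware": 400,
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": 500,
    "scrapy.downloadermiddlewares.retry.RetryMiddleware": 550,
    "scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware": 580,
}

# The project-level setting suggested in this comment.
CUSTOM = {
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    "scrapy.downloadermiddlewares.retry.RetryMiddleware": None,
    "scrapy_fake_useragent.middleware.RandomUserAgentMiddleware": 500,
    "scrapy_fake_useragent.middleware.RetryUserAgentMiddleware": 501,
}


def effective_order(base, custom):
    """Merge both mappings, drop disabled (None) entries, sort by order."""
    merged = {**base, **custom}
    enabled = {path: order for path, order in merged.items() if order is not None}
    return sorted(enabled, key=enabled.get)


ORDER = effective_order(BASE, CUSTOM)
for path in ORDER:
    print(path)
```

With these values the random User-Agent middleware takes the slot the built-in UserAgentMiddleware would have had, after DefaultHeadersMiddleware and before MetaRefreshMiddleware.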

JoshuaSeth commented 2 months ago

DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    "scrapy.downloadermiddlewares.retry.RetryMiddleware": None,
    "scrapy_fake_useragent.middleware.RandomUserAgentMiddleware": 500,
    "scrapy_fake_useragent.middleware.RetryUserAgentMiddleware": 501,
}

That did not solve it for me sadly. I have this problem with:

The problem persists when I use the newest versions of these packages and Python 3.11.

Am I missing something?