lkabuci / Botflix

🎥 Stream your favorite movie from the terminal!
MIT License
427 stars 26 forks source link

No results displayed, TopMoviesSpider.parse method is never called. #19

Closed aquib-sh closed 2 years ago

aquib-sh commented 2 years ago
2022-05-05 14:10:28 [scrapy.utils.log] INFO: Scrapy 2.6.1 started (bot: scrapybot)
2022-05-05 14:10:28 [scrapy.utils.log] INFO: Versions: lxml 4.8.0.0, libxml2 2.9.13, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 22.4.0, Python 3.10.4 (main, Mar 24 2022, 13:07:27) [GCC 11.2.0], pyOpenSSL 22.0.0 (OpenSSL 3.0.2 15 Mar 2022), cryptography 37.0.1, Platform Linux-5.17.0-1-amd64-x86_64-with-glibc2.33
2022-05-05 14:10:28 [scrapy.crawler] INFO: Overridden settings:
{}
2022-05-05 14:10:28 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2022-05-05 14:10:28 [scrapy.extensions.telnet] INFO: Telnet Password: 91c143582ecd5828
2022-05-05 14:10:28 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats']
2022-05-05 14:10:28 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-05-05 14:10:28 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-05-05 14:10:28 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-05-05 14:10:28 [scrapy.core.engine] INFO: Spider opened
2022-05-05 14:10:28 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-05-05 14:10:28 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-05-05 14:10:29 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://torrentgalaxy.to/> (failed 1 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', '', 'unexpected eof while reading')]>]
2022-05-05 14:10:29 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://torrentgalaxy.to/> (failed 2 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', '', 'unexpected eof while reading')]>]
2022-05-05 14:10:29 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://torrentgalaxy.to/> (failed 3 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', '', 'unexpected eof while reading')]>]
2022-05-05 14:10:29 [scrapy.core.scraper] ERROR: Error downloading <GET https://torrentgalaxy.to/>
Traceback (most recent call last):
  File "/home/aquib/.local/lib/python3.10/site-packages/scrapy/core/downloader/middleware.py", line 49, in process_request
    return (yield download_func(request=request, spider=spider))
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', '', 'unexpected eof while reading')]>]
2022-05-05 14:10:29 [scrapy.core.engine] INFO: Closing spider (finished)
2022-05-05 14:10:29 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 3,
 'downloader/exception_type_count/twisted.web._newclient.ResponseNeverReceived': 3,
 'downloader/request_bytes': 648,
 'downloader/request_count': 3,
 'downloader/request_method_count/GET': 3,
 'elapsed_time_seconds': 0.881984,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2022, 5, 5, 8, 40, 29, 711889),
 'log_count/DEBUG': 3,
 'log_count/ERROR': 2,
 'log_count/INFO': 10,
 'memusage/max': 62857216,
 'memusage/startup': 62857216,
 'retry/count': 2,
 'retry/max_reached': 1,
 'retry/reason_count/twisted.web._newclient.ResponseNeverReceived': 2,
 'scheduler/dequeued': 3,
 'scheduler/dequeued/memory': 3,
 'scheduler/enqueued': 3,
 'scheduler/enqueued/memory': 3,
 'start_time': datetime.datetime(2022, 5, 5, 8, 40, 28, 829905)}
2022-05-05 14:10:29 [scrapy.core.engine] INFO: Spider closed (finished)
lkabuci commented 2 years ago

We should probably add a proxy to bypass countries' restrictions.

aquib-sh commented 2 years ago

In addition to above error, it also gives the following after the latest update of using python3 main.py top

Traceback (most recent call last):
  File "/usr/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/usr/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/usr/lib/python3.10/http/client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/usr/lib/python3.10/ssl.py", line 512, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3.10/ssl.py", line 1070, in _create
    self.do_handshake()
  File "/usr/lib/python3.10/ssl.py", line 1341, in do_handshake
    self._sslobj.do_handshake()
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/aquib/src/stream-cli/main.py", line 52, in <module>
    app()
  File "/home/aquib/.local/lib/python3.10/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/home/aquib/.local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/aquib/.local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/aquib/.local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/aquib/.local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/aquib/.local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/aquib/.local/lib/python3.10/site-packages/typer/main.py", line 500, in wrapper
    return callback(**use_params)  # type: ignore
  File "/home/aquib/src/stream-cli/main.py", line 19, in top
    apprun(TopMoviesSpider, True)
  File "/home/aquib/src/stream-cli/src/runner.py", line 36, in apprun
    movies = start_scrawling(scraping_class)
  File "/home/aquib/src/stream-cli/src/runner.py", line 26, in start_scrawling
    response = urllib.request.urlopen("https://www.torrentgalaxy.to/").getcode()
  File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/usr/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 104] Connection reset by peer>
aquib-sh commented 2 years ago

I understand this is something related to ISP blocking the requests from reaching

aquib-sh commented 2 years ago

would have to integrate a Tor connectivity so anyone in the world can use

lkabuci commented 2 years ago

would have to integrate a Tor connectivity so anyone in the world can use

I tried but it's not working, Tor request some sort of protocol?

$❯ curl http://galaxy3yrfbwlwo72q3v2wlyjinqr2vejgpkxb22ll5pcpuaxlnqjiid.onion/
curl: (6) Could not resolve host: galaxy3yrfbwlwo72q3v2wlyjinqr2vejgpkxb22ll5pcpuaxlnqjiid.onion
lkabuci commented 2 years ago

if tgx is not supported in your country you can use another torrent host website

aquib-sh commented 2 years ago

yes you need to be connected to onion protocol to access those url

aquib-sh commented 2 years ago

I will open up a new feature request (for Tor), will send a pull request if finished. Count me in to add the feature

lkabuci commented 2 years ago

I will open up a new feature request (for Tor), will send a pull request if finished. Count me in to add the feature

That will be a huge improvement!!