rachmadaniHaryono / we-get

:icecream: Command-line tool for searching torrents.
MIT License

urlopen error [Errno -2] Name or service not known #19

Closed L04DB4L4NC3R closed 4 years ago

L04DB4L4NC3R commented 4 years ago

Describe the bug

The torrents fail to fetch.

To Reproduce

  1. Install the package:

    $ sudo pip3 install git+https://github.com/rachmadaniHaryono/we-get

  2. Run the following command:

    $ we-get --search "Arch Linux" --target the_pirate_bay

Expected behavior

Expected to see a list of torrents.

Screenshots

┌─[root@l04db4l4nc3r] - [~] - [Sat Apr 18, 14:42]
└─[$] <> we-get --search "Arch Linux" --target the_pirate_bay
Traceback (most recent call last):
  File "/usr/lib/python3.8/urllib/request.py", line 1319, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/lib/python3.8/http/client.py", line 1230, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1276, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1225, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1004, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 944, in send
    self.connect()
  File "/usr/lib/python3.8/http/client.py", line 1392, in connect
    super().connect()
  File "/usr/lib/python3.8/http/client.py", line 915, in connect
    self.sock = self._create_connection(
  File "/usr/lib/python3.8/socket.py", line 787, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "/usr/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/we-get", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/we_get/__init__.py", line 14, in main
    we_get.start()
  File "/usr/local/lib/python3.8/dist-packages/we_get/core/we_get.py", line 237, in start
    sel.run()
  File "/usr/local/lib/python3.8/dist-packages/we_get/core/we_get.py", line 158, in run
    items = run.main(self.pargs)
  File "/usr/local/lib/python3.8/dist-packages/we_get/modules/the_pirate_bay.py", line 78, in main
    return run.search()
  File "/usr/local/lib/python3.8/dist-packages/we_get/modules/the_pirate_bay.py", line 62, in search
    data = self.module.http_get_request(url)
  File "/usr/local/lib/python3.8/dist-packages/we_get/core/module.py", line 24, in http_get_request
    return opener.open(url).read().decode()
  File "/usr/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/usr/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 1362, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib/python3.8/urllib/request.py", line 1322, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>
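
The traceback above ends in an uncaught `urllib.error.URLError` whose reason is a `socket.gaierror`, meaning the host name could not be resolved. As a minimal sketch of how a caller might turn that into a readable message instead of a stack trace, assuming a helper shaped like the `opener.open(url).read().decode()` call in the traceback: the names `describe_url_error` and `http_get` are hypothetical and are not part of we-get.

```python
import socket
import urllib.error
import urllib.request


def describe_url_error(err):
    """Turn a URLError into a short, user-facing message."""
    reason = err.reason
    if isinstance(reason, socket.gaierror):
        # Name resolution failed, as in "[Errno -2] Name or service not known".
        return "DNS lookup failed ({}); check the site URL or your network.".format(reason)
    return "Request failed: {}".format(reason)


def http_get(url, opener=None):
    """Fetch url and return (text, error_message); exactly one of them is None.

    Hypothetical helper, not we-get's actual http_get_request.
    """
    opener = opener or urllib.request.build_opener()
    try:
        return opener.open(url).read().decode(), None
    except urllib.error.URLError as err:
        return None, describe_url_error(err)
```

A caller could then print the message and exit cleanly when the second element of the returned tuple is not `None`.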


Additional context

Note that all of the dependencies and prerequisites for this utility have already been installed:

┌─[root@l04db4l4nc3r] - [~] - [Sat Apr 18, 14:46]
└─[$] <> pip3 install prompt_toolkit docopt colorama         
Requirement already satisfied: prompt_toolkit in /usr/lib/python3/dist-packages (2.0.10)
Requirement already satisfied: docopt in /usr/local/lib/python3.8/dist-packages (0.6.2)
Requirement already satisfied: colorama in /usr/lib/python3/dist-packages (0.4.3)

rachmadaniHaryono commented 4 years ago

The original Pirate Bay site has changed, and its pages now load their content with JavaScript. That means the program needs a different way to download the page (something like Selenium), plus a new parser and URL format.

The quick fix I can think of is to use a Pirate Bay mirror that still has the old layout. I'm preparing a PR here: https://github.com/rachmadaniHaryono/we-get/tree/feature/print-url.

If you can test it and it is good enough, I will merge it into master.

In the long run, there should be a config file so users can change the URL format when something goes wrong with the website.
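
To illustrate the config-file idea, here is a rough sketch using the standard-library `configparser`. Everything specific in it is an assumption made for illustration: the section name, the `search_url` key, the `{query}` placeholder, the example URL, and the function `load_search_url` do not come from we-get.

```python
import configparser

# Hypothetical config contents; the URL is a placeholder, not a real mirror.
EXAMPLE_CONFIG = """
[the_pirate_bay]
search_url = https://example.invalid/search/{query}/0/99/0
"""


def load_search_url(config_text, module, query):
    """Build a search URL from a per-module template stored in the config."""
    # interpolation=None keeps any '%' in a URL from being treated as
    # configparser interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    template = parser.get(module, "search_url")
    return template.format(query=query)


print(load_search_url(EXAMPLE_CONFIG, "the_pirate_bay", "Arch-Linux"))
```

With a layout like this, switching to a different mirror would only require editing the template in the config file rather than the module code.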

Hendrikto commented 4 years ago

requests-html is probably suitable for this.

rachmadaniHaryono commented 4 years ago

I will consider that. It would mean changing a lot of the program, including the parser, so maybe it can be added in the next big release.

Edit: TIL that requests-html only supports Python 3.6. If it is added to the program, I will also have to consider users' Python versions, for example the OP, who has Python 3.8.

L04DB4L4NC3R commented 4 years ago


It works with the TPB mirror in your PR, although the same issue persists with 1337x. Also, special characters aren't handled properly, so the following line throws an error with 1337x:

image

Initially I thought it might be due to bad URL encoding, but it works fine with the Pirate Bay proxy. If a user consciously avoids special characters (a space in this case), the error can be avoided, but that is not good UX.

image

rachmadaniHaryono commented 4 years ago

I added a fix for the 1337x module.

A short note on why the results differ:

In the TPB module, spaces are converted to the - character. In the 1337x module, on the same branch as above, I assume the search term should be encoded with Python 3's quote_plus: https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote_plus

Neither has been tested with unusual characters, and there may still be similar bugs in other modules.
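
The difference between the two encodings can be seen in a small sketch. `quote_plus` is the real standard-library function linked above; `hyphenate` is a hypothetical stand-in for the space-to-hyphen conversion described for the TPB module and may not match the actual implementation.

```python
from urllib.parse import quote_plus


def hyphenate(term):
    """Replace runs of whitespace with a single '-' (assumed TPB-style)."""
    return "-".join(term.split())


for term in ["Arch Linux", "C++ & Rust"]:
    print(repr(term), "->", hyphenate(term), "|", quote_plus(term))
```

For "Arch Linux" the two differ only in the separator ("Arch-Linux" versus "Arch+Linux"), but for a term such as "C++ & Rust" only `quote_plus` escapes the reserved characters, giving "C%2B%2B+%26+Rust".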

rachmadaniHaryono commented 4 years ago

The branch related to this issue has been merged into master.

If you find another bug, please create another issue.

Thank you.