DedSecInside / TorBot

Dark Web OSINT Tool

Add HTML feature #314

Open KingAkeem opened 8 months ago

KingAkeem commented 8 months ago

The HTML feature hasn't been implemented yet.

The feature will operate on a single URL. The user should be able to pass a flag (--html) to output the HTML of that specific webpage. If they also pass the --save flag, the HTML should be saved to a .html file.
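A minimal sketch of the behaviour described above, not TorBot's actual implementation. It assumes --save is a plain on/off switch, and the names build_parser, handle_html, the injected fetch callable, and the page.html file name are all hypothetical. The fetch step is passed in so the flag logic can be tried without Tor or a network connection.

```python
import argparse
from pathlib import Path
from typing import Callable, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags named in the issue (assumed shapes)."""
    parser = argparse.ArgumentParser(prog="torbot")
    parser.add_argument("-u", "--url", required=True, help="single URL to fetch")
    parser.add_argument("--html", action="store_true", help="output the page's HTML")
    parser.add_argument("--save", action="store_true", help="save output to a file")
    return parser


def handle_html(
    args: argparse.Namespace,
    fetch: Callable[[str], str],
    out_dir: Path = Path("."),
) -> Optional[Path]:
    """Print the HTML for args.url, or write it to a .html file if --save is set.

    Returns the path written, or None when nothing was saved.
    """
    if not args.html:
        return None
    html = fetch(args.url)  # hypothetical fetcher, e.g. a wrapper around an HTTP client
    if args.save:
        path = out_dir / "page.html"  # hypothetical file name
        path.write_text(html, encoding="utf-8")
        return path
    print(html)
    return None
```

With a real client, fetch could be something like a function that issues a GET request and returns the response text; that part is left out here on purpose.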

Soham-Thodge commented 8 months ago

I would like to try to solve this issue @KingAkeem

KingAkeem commented 8 months ago

Just assigned it to you @kronos2003, it's all yours. Let me know if you need any help.

Soham-Thodge commented 8 months ago

I'm a bit new to this. I've made some changes to the file; can you let me know how I can run the tool to make sure the function works as intended?

KingAkeem commented 8 months ago

We use poetry to manage dependencies so you'll need to install it first. https://python-poetry.org/docs/#installing-with-the-official-installer

If you're already familiar with Python virtual environments, the requirements.txt file is also up to date, so you could create a virtual env and then install the dependencies using pip install -r requirements.txt. Either option will work.

Once that's done, follow these examples: https://github.com/DedSecInside/TorBot?tab=readme-ov-file#installation

The main file is named __main__.py, meaning the application can be run using the directory without needing to include the file name as well.
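To illustrate the point about __main__.py, here is a small self-contained sketch. It does not touch TorBot itself; it creates a throwaway directory called demo_app (a made-up name) containing only a __main__.py, then asks the Python interpreter to run the directory.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_directory_as_app() -> str:
    """Create a directory with a __main__.py and run the directory itself."""
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = Path(tmp) / "demo_app"  # hypothetical application directory
        app_dir.mkdir()
        (app_dir / "__main__.py").write_text(
            'print("hello from __main__")\n', encoding="utf-8"
        )
        # Equivalent to typing `python demo_app` in a shell: the interpreter
        # looks for __main__.py inside the directory and executes it.
        result = subprocess.run(
            [sys.executable, str(app_dir)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_directory_as_app())
```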

pavankalyan224847 commented 8 months ago

I would like to take up this issue. Since it is my first open source contribution, please guide me.

PSNAppz commented 8 months ago

@pavankalyan224847 This is currently assigned to @kronos2003. Can you take a look at the other open issues?

pavankalyan224847 commented 8 months ago

Yeah, no problem, I'll look into other issues.

Soham-Thodge commented 8 months ago

@KingAkeem I managed to get the dependencies installed in a venv and tried to run the updated main.py, but it's showing a getaddrinfo() error.

KingAkeem commented 8 months ago

Which version of Python are you using? Can you post the error and the command you were running? Try to give as much detail as possible.

Soham-Thodge commented 8 months ago

I'll post the error in a few hours as I've just logged off. As for the Python version, I'm using 3.11.4.

KingAkeem commented 8 months ago

OK, I won't be able to do much until you post the error, since it sounds like a configuration issue on your machine. Some things you can check yourself:

  1. Do you have Tor running? If not you can use the --disable-socks5 flag to run without it.
  2. Do you have it configured correctly? Check the .env for the expected host and port.
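The first check in the list can be approximated with a quick reachability test. This is only a sketch: it assumes Tor's usual SOCKS listener at 127.0.0.1:9050, which may differ from what your .env specifies, and the function name socks_port_is_open is made up for this example. A successful TCP connection shows that something is listening, not that it is definitely Tor.

```python
import socket


def socks_port_is_open(host: str = "127.0.0.1", port: int = 9050, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if socks_port_is_open():
        print("Something is listening on the assumed SOCKS port.")
    else:
        print("Nothing reachable there; start Tor or use --disable-socks5.")
```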

Soham-Thodge commented 8 months ago

(HTML) C:\Users\pc\TorBot\torbot>python main.py -u http://torlinksge6enmcyyuxjpjkoouw4oorgdgeo7ftnq3zodj7g2zxi3kyd.onion/
Traceback (most recent call last):
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_urlparse.py", line 348, in normalize_port
    port_as_int = int(port)
                  ^^^^^^^^^
ValueError: invalid literal for int() with base 10: 'None'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\pc\TorBot\torbot\__main__.py", line 158, in <module>
    run(arg_parser, version)
  File "C:\Users\pc\TorBot\torbot\__main__.py", line 89, in run
    with httpx.Client(timeout=60, proxies=socks5_proxy if not args.disable_socks5 else None) as client:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_client.py", line 670, in __init__
    proxy_map = self._get_proxy_map(proxies, allow_env_proxies)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_client.py", line 228, in _get_proxy_map
    proxy = Proxy(url=proxies) if isinstance(proxies, (str, URL)) else proxies
            ^^^^^^^^^^^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_config.py", line 333, in __init__
    url = URL(url)
          ^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_urls.py", line 113, in __init__
    self._uri_reference = urlparse(url, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_urlparse.py", line 246, in urlparse
    parsed_port: typing.Optional[int] = normalize_port(port, scheme)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\pc\TorBot\HTML\Lib\site-packages\httpx\_urlparse.py", line 350, in normalize_port
    raise InvalidURL(f"Invalid port: {port!r}")
httpx.InvalidURL: Invalid port: 'None'

KingAkeem commented 8 months ago

Are you using the latest version of dev? There is no main.py currently.

Soham-Thodge commented 8 months ago

I started working on the latest build, having forked it just 3 days ago, so I'm not sure why main.py wouldn't exist.

KingAkeem commented 8 months ago

Are you running the command from the root directory or from within the torbot directory? It's possible that the .env file cannot be found.

Try running the program from the root directory based on the example in the README.

Soham-Thodge commented 8 months ago

I'm running the program from the torbot directory where the main.py file is located

KingAkeem commented 8 months ago

Try running the program from the root directory using torbot/main.py. The address information cannot be read from the .env, which is in the root directory. There's a ticket to switch the environment variables to CLI flags, which will resolve this issue. For now, you'll need to run it from the root directory.
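The failure mode described here can be reproduced in a few lines. This is a sketch under assumptions: the variable names SOCKS5_HOST and SOCKS5_PORT and both function names are illustrative and may not match what TorBot reads from its .env. When the values are missing, formatting them straight into a URL yields the literal text 'None', which matches the "Invalid port: 'None'" message in the traceback above.

```python
from typing import Mapping


def naive_socks5_url(env: Mapping[str, str]) -> str:
    """Format the proxy URL without checking for missing values."""
    # env.get returns None for a missing key, and the f-string turns it
    # into the text "None".
    return f"socks5://{env.get('SOCKS5_HOST')}:{env.get('SOCKS5_PORT')}"


def checked_socks5_url(env: Mapping[str, str]) -> str:
    """Format the proxy URL, failing early with a readable message."""
    host = env.get("SOCKS5_HOST")
    port = env.get("SOCKS5_PORT")
    if not host or not port:
        raise RuntimeError(
            "SOCKS5_HOST/SOCKS5_PORT not set; was the .env file found "
            "from the current working directory?"
        )
    if not port.isdigit():
        raise RuntimeError(f"SOCKS5_PORT must be a number, got {port!r}")
    return f"socks5://{host}:{port}"


if __name__ == "__main__":
    print(naive_socks5_url({}))  # socks5://None:None
    print(checked_socks5_url({"SOCKS5_HOST": "127.0.0.1", "SOCKS5_PORT": "9050"}))
```

Failing early like this turns a deep library traceback into a one-line hint about the working directory.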

KingAkeem commented 8 months ago

Any updates?

Soham-Thodge commented 8 months ago

I've been working on it, but it will take some time to find and root out the exact error in the updated files.

KingAkeem commented 8 months ago

I've updated the program to use CLI flags instead of the .env file for the SOCKS configuration. If you update your branch with the latest version, it may resolve your issue.
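For readers wondering what moving the SOCKS configuration to CLI flags might look like, here is a rough sketch, not the code from the actual change. The flag names --host and --port, their defaults, and the function names are assumptions; only --disable-socks5 is taken from this thread.

```python
import argparse
from typing import Optional, Sequence


def parse_socks_args(argv: Sequence[str]) -> argparse.Namespace:
    """Parse SOCKS settings from the command line instead of a .env file."""
    parser = argparse.ArgumentParser(prog="torbot")
    parser.add_argument("--host", default="127.0.0.1", help="SOCKS5 proxy host (assumed flag)")
    parser.add_argument("--port", type=int, default=9050, help="SOCKS5 proxy port (assumed flag)")
    parser.add_argument("--disable-socks5", action="store_true", help="run without the proxy")
    return parser.parse_args(argv)


def proxy_url(args: argparse.Namespace) -> Optional[str]:
    """Return the proxy URL, or None when the proxy is disabled."""
    if args.disable_socks5:
        return None
    return f"socks5://{args.host}:{args.port}"


if __name__ == "__main__":
    print(proxy_url(parse_socks_args(["--port", "9150"])))
```

Because the flags carry defaults and --port is parsed as an integer, the result no longer depends on which directory the program is started from, and a non-numeric port is rejected by the parser before any request is made.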