mikwielgus / forum-dl

Scrape posts, threads from forums, news aggregators, mail archives, export to JSONL, mailbox, WARC
MIT License

The term 'forum-dl' is not recognized #15

Closed abhiiously closed 12 months ago

abhiiously commented 12 months ago

Hello! Any time I run forum-dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/"

in PowerShell, it gives me this error:

The term 'forum-dl' is not recognized as the name of a cmdlet, function, script file, or operable program.
Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1
+ forum-dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also- ...
+ ~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (forum-dl:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

Any idea what I am doing wrong? For reference, I installed this via "pip install forum-dl" which worked, but when I run the forum-dl command from that folder it gives me the error.

abhiiously commented 12 months ago

If I run the command using forumdl.py directly, this is the error I get:

PS C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl> python .\forumdl.py "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/"

Traceback (most recent call last):
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\forumdl.py", line 7, in <module>
    from . import extractors
ImportError: attempted relative import with no known parent package

mikwielgus commented 12 months ago

It may be a PowerShell-specific problem. Does invoking

python3 -m forum_dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/"

instead make any difference?
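
For anyone hitting the same ImportError: it is not specific to PowerShell. It appears whenever a file that uses relative imports is run directly as a script instead of as part of its package. The sketch below reproduces both cases in a throwaway directory. The package name demo_pkg and its files are hypothetical stand-ins, not forum-dl's actual layout.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways() -> tuple[int, int, str]:
    """Run one module as a script and via -m; return both exit codes and the script's stderr."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package, for illustration only
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "helper.py").write_text("VALUE = 42\n")
        # Same pattern as the failing line in the traceback: a relative import.
        (pkg / "tool.py").write_text("from . import helper\nprint(helper.VALUE)\n")

        # Running the file directly: Python does not know the parent package.
        as_script = subprocess.run(
            [sys.executable, str(pkg / "tool.py")],
            capture_output=True, text=True, cwd=tmp,
        )
        # Running with -m from the directory that contains the package.
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_pkg.tool"],
            capture_output=True, text=True, cwd=tmp,
        )
        return as_script.returncode, as_module.returncode, as_script.stderr


if __name__ == "__main__":
    script_rc, module_rc, script_err = run_both_ways()
    print("script exit code:", script_rc)
    print("module exit code:", module_rc)
    print(script_err.strip().splitlines()[-1])
```

The direct run fails with the relative-import error, while the -m run succeeds, which is why python -m forum_dl is the invocation to try.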

abhiiously commented 12 months ago

I just tried, and it said python3 was not found.

I then tried python instead of python3, and it gave this error:

PS C:\Users\Abhi\Desktop\forum-dl-develop> python -m forum_dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/"
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\__main__.py", line 5, in <module>
    sys.exit(forum_dl.main())
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\__init__.py", line 34, in main
    forumdl.download(
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\forumdl.py", line 24, in download
    self.download_url(
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\forumdl.py", line 40, in download_url
    extractor = extractors.find(url, session_options, extractor_options)
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\extractors\__init__.py", line 32, in find
    for cls in list_classes():
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\extractors\__init__.py", line 44, in list_classes
    module = __import__(module_name, globals_, None, (), 1)
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\extractors\hyperkitty.py", line 7, in <module>
    import dateparser
ModuleNotFoundError: No module named 'dateparser'

mikwielgus commented 12 months ago

How did you install forum-dl? If you used pip install forum-dl, as in the instructions, dateparser should have been installed along with it.

mikwielgus commented 12 months ago

Aw, sorry, scratch my last. Above you wrote you installed it via pip install forum-dl, I missed that.

So just to be sure, I'd run the python -m forum_dl (...) command in a different directory. The problem may be due to forum_dl pointing to your local development version instead of the one installed via pip.
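
A quick way to see which copy Python would pick up is to ask the import system where it resolves the module. This is a minimal sketch using only the standard library. The helper name module_origin is made up for illustration, and the answer depends on the current sys.path: with python -m or python -c the working directory comes first, while for a script file it is the script's own directory.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # A path inside a local checkout means the pip-installed copy is being shadowed;
    # a path under site-packages means the installed copy is used.
    print(module_origin("forum_dl"))
```

If the printed path is inside the forum-dl-develop folder, the local checkout is being used, and its dependencies have to be installed separately.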

abhiiously commented 12 months ago

I just moved the folder to my Documents folder to try this, and it gave the same error.

mikwielgus commented 12 months ago

OK, this is weird. I suggest that you install dateparser manually (pip install dateparser) and any other missing modules in case of further ModuleNotFoundErrors and see what happens then.

abhiiously commented 12 months ago

pip install dateparser

I got this warning when installing this. I wonder if this may be causing any issues?

 WARNING: The script dateparser-download.exe is installed in 'C:\Users\Abhi\AppData\Roaming\Python\Python310\Scripts' which is not on PATH.

mikwielgus commented 12 months ago

I don't know, but you're welcome to investigate. I don't have any Windows installation to investigate this at the moment.
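
One way to investigate the warning: pip placed the script in the per-user Scripts directory, and a forum-dl launcher installed the same way would probably land there too. If that directory is not on PATH, the shell cannot find the command, which would match the original "not recognized" error. The sketch below checks this with the standard library. The helper dir_on_path is hypothetical, and the f"{os.name}_user" scheme name assumes a regular Windows or POSIX Python build.

```python
import os
import shutil
import sysconfig


def dir_on_path(directory: str, path_value: str) -> bool:
    """True if `directory` is one of the entries in a PATH-style string."""
    target = os.path.normcase(os.path.normpath(directory))
    entries = [e for e in path_value.split(os.pathsep) if e]
    return any(os.path.normcase(os.path.normpath(e)) == target for e in entries)


if __name__ == "__main__":
    # Scripts directory used by per-user installs (pip install --user, or pip's
    # automatic fallback when the system site-packages is not writable).
    user_scripts = sysconfig.get_path("scripts", f"{os.name}_user")
    print("user scripts dir:", user_scripts)
    print("on PATH:", dir_on_path(user_scripts, os.environ.get("PATH", "")))
    # None here means the shell would not find the command either.
    print("forum-dl resolves to:", shutil.which("forum-dl"))
```

After adding the directory to PATH, a new terminal session is needed before the change is visible.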

abhiiously commented 12 months ago

I added that location to PATH and tried running python -m forum_dl again but I got this same error

 python -m forum_dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/"
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058 {} {}
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/ {} {}
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/viewforum.php {} {}
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\__main__.py", line 5, in <module>
    sys.exit(forum_dl.main())
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\__init__.py", line 34, in main
    forumdl.download(
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\forumdl.py", line 24, in download
    self.download_url(
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\forumdl.py", line 48, in download_url
    writer.write(url)
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\writers\common.py", line 78, in write
    self.write_board(base_node)
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\writers\common.py", line 103, in write_board
    self._write_board_object(board)
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\writers\common.py", line 235, in _write_board_object
    sys.stdout.write(f"{self._serialize_entry(entry)}\n")
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\writers\jsonl.py", line 10, in _serialize_entry
    return entry.json(models_as_dict=False)
  File "C:\Users\Abhi\AppData\Roaming\Python\Python310\site-packages\typing_extensions.py", line 2562, in wrapper
    return __arg(*args, **kwargs)
  File "C:\Users\Abhi\AppData\Roaming\Python\Python310\site-packages\pydantic\main.py", line 958, in json
    raise TypeError('The `models_as_dict` argument is no longer supported; use a model serializer instead.')
TypeError: The `models_as_dict` argument is no longer supported; use a model serializer instead.

mikwielgus commented 12 months ago

This is a different error. It turns out Pydantic V2 was released recently and removed the models_as_dict argument. For now, you should be able to work around this by downgrading Pydantic: pip install pydantic==1.10.12.
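
To confirm which Pydantic major version is actually installed before and after the downgrade, a small standard-library check is enough. This is only a sketch, and installed_major is a made-up helper name.

```python
from importlib import metadata
from typing import Optional


def installed_major(package: str) -> Optional[int]:
    """Return the installed major version of `package`, or None if it is not installed."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return int(version.split(".")[0])


if __name__ == "__main__":
    major = installed_major("pydantic")
    if major is None:
        print("pydantic is not installed")
    elif major >= 2:
        print("Pydantic v2 or newer found; the models_as_dict call in the traceback needs v1")
    else:
        print("Pydantic v1 found")
```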

abhiiously commented 12 months ago

I really hope I am not bothering you. Thank you for all the help. It's giving this error now:

PS C:\Users\Abhi\Documents\forum-dl-develop> python -m forum_dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/" --no-files
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058 {} {}
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/ {} {}
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/viewforum.php {} {}
{"generator": "forum-dl", "version": "0.3.0", "extractor": "phpbb", "download_time": "2023-09-16T23:21:24.933444+00:00", "type": "board", "item": {"path": [], "url": "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/", "origin": "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/", "data": {}, "title": ""}}
WARNING:root:AttributeSearchError(<img alt="" class="signature-preview signature-image bbImage lazyload" data-disable-lightbox="1" data-src="https://media.giphy.com/media/mt5W8R362OzyE/giphy.gif" data-url="https://media.giphy.com/media/mt5W8R362OzyE/giphy.gif" data-zoom-target="1" style=""/>, 'src')
WARNING:root:Traceback (most recent call last):
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\extractors\common.py", line 341, in _fetch_board_threads
    self.board_state = yield from self._fetch_board_page_threads(
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\extractors\common.py", line 421, in _fetch_board_page_threads
    yield from self._extract_file_objects((), (), soup, response)
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\extractors\common.py", line 533, in _extract_file_objects
    url = urljoin(response.url, embed.get("src"))
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\soup.py", line 144, in get
    raise AttributeSearchError(self.tag, key)
forum_dl.exceptions.AttributeSearchError: (<img alt="" class="signature-preview signature-image bbImage lazyload" data-disable-lightbox="1" data-src="https://media.giphy.com/media/mt5W8R362OzyE/giphy.gif" data-url="https://media.giphy.com/media/mt5W8R362OzyE/giphy.gif" data-zoom-target="1" style=""/>, 'src')

mikwielgus commented 12 months ago

I really hope I am not bothering you. Thank you for all the help.

Multi-platform packaging and dependency issues are always a mess. Your error reports are welcome.

Its giving this error now

This is a bug with XenForo forum detection. I'll try to deal with this in a few days, shouldn't be difficult to fix.
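
Separately from the detection bug, the warning itself is triggered by a lazy-loaded <img> that carries its URL in data-src and has no src attribute. A tolerant lookup could fall back to data-src, as sketched below. This uses the standard library's html.parser instead of forum-dl's own soup wrapper, the class name ImageUrlCollector is hypothetical, and it is not presented as the fix that went into the project.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageUrlCollector(HTMLParser):
    """Collect absolute image URLs, falling back to data-src when src is missing."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Lazy-loading scripts often keep the real URL in data-src.
        src = attr_map.get("src") or attr_map.get("data-src")
        if src:
            self.urls.append(urljoin(self.base_url, src))


if __name__ == "__main__":
    collector = ImageUrlCollector("https://www.kanyetothe.com/")
    collector.feed('<img class="lazyload" data-src="https://media.giphy.com/media/mt5W8R362OzyE/giphy.gif" style=""/>')
    print(collector.urls)
```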

abhiiously commented 12 months ago

Awesome! Thank you! I'll keep an eye on the project :)

mikwielgus commented 12 months ago

Commit 7bc41e306dd967eadf64046b15972da7311f3d26 should fix this. Let me know if there are any other problems.