mikwielgus / forum-dl

Scrape posts, threads from forums, news aggregators, mail archives, export to JSONL, mailbox, WARC
MIT License

The term 'forum-dl' is not recognized #15

Closed: abhiiously closed this issue 1 year ago

abhiiously commented 1 year ago

Hello! Any time I run forum-dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/"

in PowerShell, it gives me this error:

The term 'forum-dl' is not recognized as the name of a cmdlet, function, script file, or operable program.
Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1
+ forum-dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also- ...
+ ~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (forum-dl:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

Any idea what I am doing wrong? For reference, I installed this via "pip install forum-dl" which worked, but when I run the forum-dl command from that folder it gives me the error.

abhiiously commented 1 year ago

If I run the command using forumdl.py this is the error I get

PS C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl> python .\forumdl.py "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/"

Traceback (most recent call last):
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\forumdl.py", line 7, in <module>
    from . import extractors
ImportError: attempted relative import with no known parent package

mikwielgus commented 1 year ago

It may be a PowerShell-specific problem. Does invoking

python3 -m forum_dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/"

instead make any difference?
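For context, here is a minimal sketch of why running a file inside a package directly can fail with "attempted relative import with no known parent package", while running the package with -m works. The package name demo_pkg and its files are hypothetical and created only for this demonstration; they are not part of forum-dl. Using sys.executable also avoids depending on whether the interpreter is called python or python3.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways() -> tuple[int, int]:
    """Build a throwaway package, run it as a script, then as a module.

    Returns the two exit codes as (script_exit_code, module_exit_code).
    """
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package, not forum-dl
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "helper.py").write_text("VALUE = 1\n")
        # The relative import only resolves when demo_pkg is the parent package.
        (pkg / "__main__.py").write_text(
            "from . import helper\nprint(helper.VALUE)\n"
        )

        # Running the file directly: there is no parent package, so the
        # relative import raises ImportError.
        as_script = subprocess.run(
            [sys.executable, str(pkg / "__main__.py")],
            cwd=tmp,
            capture_output=True,
        )
        # Running with -m: the package is imported first, so the relative
        # import resolves.
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            cwd=tmp,
            capture_output=True,
        )
        return as_script.returncode, as_module.returncode


if __name__ == "__main__":
    print(run_both_ways())
```

A non-zero first value and a zero second value would match the behaviour seen with forumdl.py above.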

abhiiously commented 1 year ago

I just tried it, and it said python3 was not found.

I then tried python instead of python3, and it gave this error:

PS C:\Users\Abhi\Desktop\forum-dl-develop> python -m forum_dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/"
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\__main__.py", line 5, in <module>
    sys.exit(forum_dl.main())
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\__init__.py", line 34, in main
    forumdl.download(
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\forumdl.py", line 24, in download
    self.download_url(
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\forumdl.py", line 40, in download_url
    extractor = extractors.find(url, session_options, extractor_options)
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\extractors\__init__.py", line 32, in find
    for cls in list_classes():
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\extractors\__init__.py", line 44, in list_classes
    module = __import__(module_name, globals_, None, (), 1)
  File "C:\Users\Abhi\Desktop\forum-dl-develop\forum_dl\extractors\hyperkitty.py", line 7, in <module>
    import dateparser
ModuleNotFoundError: No module named 'dateparser'

mikwielgus commented 1 year ago

How did you install forum-dl? If you used pip install forum-dl, as in the instructions, dateparser should have been installed along with it.

mikwielgus commented 1 year ago

Aw, sorry, scratch my last. Above you wrote you installed it via pip install forum-dl, I missed that.

So just to be sure, I'd run the python -m forum_dl (...) command in a different directory. The problem may be due to forum_dl resolving to your local development version instead of the one installed via pip.
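One way to check which copy Python would pick up, as a minimal sketch. The helper name module_origin is made up for illustration; importlib.util.find_spec is standard library.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # A path inside a checkout such as forum-dl-develop means the local copy
    # shadows the pip-installed one; a site-packages path means pip's copy.
    print(module_origin("forum_dl"))
```

Running this from the checkout directory and again from an unrelated directory should show whether the result changes with the working directory.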

abhiiously commented 1 year ago

I just moved the folder to my Documents folder to try this, and it gave the same error.

mikwielgus commented 1 year ago

OK, this is weird. I suggest that you install dateparser manually (pip install dateparser), along with any other modules reported missing by further ModuleNotFoundErrors, and see what happens then.
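To avoid finding missing modules one traceback at a time, a small sketch like the following could list them up front. The helper name missing_modules is made up, and the module names passed in are only the ones mentioned in this thread, not forum-dl's full dependency list.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Only the modules named in this thread; extend the list as needed.
    print(missing_modules(["dateparser", "pydantic"]))
```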

abhiiously commented 1 year ago

pip install dateparser

I got this warning when installing this. I wonder if this may be causing any issues?

 WARNING: The script dateparser-download.exe is installed in 'C:\Users\Abhi\AppData\Roaming\Python\Python310\Scripts' which is not on PATH.

mikwielgus commented 1 year ago

I don't know, but you're welcome to investigate. I don't have a Windows installation to test this on at the moment.
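If someone wants to investigate, a rough sketch for checking whether the per-user scripts directory is on PATH. This assumes the packages were installed into the per-user location, as the path in the warning suggests. If so, a forum-dl launcher would likely sit in the same directory, which could explain the original "not recognized" error, though that is untested. The function names are made up for illustration.

```python
import os
import sysconfig


def user_scripts_dir() -> str:
    """Directory where per-user pip installs place console scripts."""
    if hasattr(sysconfig, "get_preferred_scheme"):  # Python 3.10+
        scheme = sysconfig.get_preferred_scheme("user")
    else:
        scheme = f"{os.name}_user"  # e.g. "nt_user" or "posix_user"
    return sysconfig.get_path("scripts", scheme)


def is_on_path(directory: str, path_value: str) -> bool:
    """Check whether a directory appears in a PATH-style string."""

    def norm(path: str) -> str:
        return os.path.normcase(os.path.normpath(path))

    entries = {norm(p) for p in path_value.split(os.pathsep) if p}
    return norm(directory) in entries


if __name__ == "__main__":
    scripts = user_scripts_dir()
    print(scripts, is_on_path(scripts, os.environ.get("PATH", "")))
```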

abhiiously commented 1 year ago

I added that location to PATH and tried running python -m forum_dl again but I got this same error

 python -m forum_dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/"
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058 {} {}
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/ {} {}
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/viewforum.php {} {}
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\__main__.py", line 5, in <module>
    sys.exit(forum_dl.main())
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\__init__.py", line 34, in main
    forumdl.download(
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\forumdl.py", line 24, in download
    self.download_url(
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\forumdl.py", line 48, in download_url
    writer.write(url)
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\writers\common.py", line 78, in write
    self.write_board(base_node)
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\writers\common.py", line 103, in write_board
    self._write_board_object(board)
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\writers\common.py", line 235, in _write_board_object
    sys.stdout.write(f"{self._serialize_entry(entry)}\n")
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\writers\jsonl.py", line 10, in _serialize_entry
    return entry.json(models_as_dict=False)
  File "C:\Users\Abhi\AppData\Roaming\Python\Python310\site-packages\typing_extensions.py", line 2562, in wrapper
    return __arg(*args, **kwargs)
  File "C:\Users\Abhi\AppData\Roaming\Python\Python310\site-packages\pydantic\main.py", line 958, in json
    raise TypeError('The `models_as_dict` argument is no longer supported; use a model serializer instead.')
TypeError: The `models_as_dict` argument is no longer supported; use a model serializer instead.

mikwielgus commented 1 year ago

This is a different error. Turns out Pydantic V2 was released recently and removed the models_as_dict argument. For now, you should be able to work around this by downgrading Pydantic: pip install pydantic==1.10.12.
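To confirm which Pydantic version is actually installed before and after the downgrade, something like this sketch should work. The helper names are made up; importlib.metadata is standard library.

```python
from importlib import metadata
from typing import Optional


def uses_pydantic_v2(version: str) -> bool:
    """True if a version string such as '2.3.0' has major version 2 or later."""
    return int(version.split(".", 1)[0]) >= 2


def installed_pydantic_version() -> Optional[str]:
    """Return the installed Pydantic version, or None if it is not installed."""
    try:
        return metadata.version("pydantic")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_pydantic_version()
    if version is None:
        print("pydantic is not installed")
    else:
        print(version, "V2 or later" if uses_pydantic_v2(version) else "V1")
```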

abhiiously commented 1 year ago

I really hope I am not bothering you. Thank you for all the help. It's giving this error now:

PS C:\Users\Abhi\Documents\forum-dl-develop> python -m forum_dl "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/" --no-files
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058 {} {}
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/ {} {}
INFO:root:GET https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/viewforum.php {} {}
{"generator": "forum-dl", "version": "0.3.0", "extractor": "phpbb", "download_time": "2023-09-16T23:21:24.933444+00:00", "type": "board", "item": {"path": [], "url": "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/", "origin": "https://www.kanyetothe.com/threads/whats-on-your-mind-also-on-ktt2.4044058/", "data": {}, "title": ""}}
WARNING:root:AttributeSearchError(<img alt="" class="signature-preview signature-image bbImage lazyload" data-disable-lightbox="1" data-src="https://media.giphy.com/media/mt5W8R362OzyE/giphy.gif" data-url="https://media.giphy.com/media/mt5W8R362OzyE/giphy.gif" data-zoom-target="1" style=""/>, 'src')
WARNING:root:Traceback (most recent call last):
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\extractors\common.py", line 341, in _fetch_board_threads
    self.board_state = yield from self._fetch_board_page_threads(
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\extractors\common.py", line 421, in _fetch_board_page_threads
    yield from self._extract_file_objects((), (), soup, response)
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\extractors\common.py", line 533, in _extract_file_objects
    url = urljoin(response.url, embed.get("src"))
  File "C:\Users\Abhi\Documents\forum-dl-develop\forum_dl\soup.py", line 144, in get
    raise AttributeSearchError(self.tag, key)
forum_dl.exceptions.AttributeSearchError: (<img alt="" class="signature-preview signature-image bbImage lazyload" data-disable-lightbox="1" data-src="https://media.giphy.com/media/mt5W8R362OzyE/giphy.gif" data-url="https://media.giphy.com/media/mt5W8R362OzyE/giphy.gif" data-zoom-target="1" style=""/>, 'src')

mikwielgus commented 1 year ago

I really hope I am not bothering you. Thank you for all the help.

Multi-platform packaging and dependency issues are always a mess. Your error reports are welcome.

Its giving this error now

This is a bug with XenForo forum detection. I'll try to deal with this in a few days, shouldn't be difficult to fix.
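Separately from the detection issue, the warning shows an img tag that carries its URL in data-src rather than src (a lazy-loaded image). As an illustration only, and not necessarily how the fix will look, a tolerant lookup could fall back from one attribute to the other. The function name image_url and the plain-dict attribute interface are assumptions, not forum-dl's API.

```python
from typing import Mapping, Optional
from urllib.parse import urljoin


def image_url(base_url: str, attrs: Mapping[str, str]) -> Optional[str]:
    """Resolve an image URL, preferring 'src' and falling back to 'data-src'."""
    for key in ("src", "data-src"):
        value = attrs.get(key)
        if value:
            return urljoin(base_url, value)
    return None  # No usable attribute; the caller can skip this tag.


if __name__ == "__main__":
    lazy_attrs = {
        "data-src": "https://media.giphy.com/media/mt5W8R362OzyE/giphy.gif"
    }
    print(image_url("https://www.kanyetothe.com/", lazy_attrs))
```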

abhiiously commented 1 year ago

Awesome! Thank you! I'll keep an eye on the project :)

mikwielgus commented 1 year ago

Commit 7bc41e306dd967eadf64046b15972da7311f3d26 should fix this. Let me know if there are any other problems.