Ovear / kemono-dl

A simple kemono.party downloader using python.

Getting parse errors with the most up-to-date BeautifulSoup #13

Closed Deses closed 11 months ago

Deses commented 11 months ago

Version

Version: 2022.04.28

Description of bug

I used pip to update the requirements and now I'm getting parsing errors.

First, I updated the requirements and got these versions, which are the latest as of today.

$ pip install -U beautifulsoup4
$ pip install -U Pillow
$ pip install -U requests
$ pip install -U yt_dlp

$ pip list
Package                      Version
---------------------------- --------------------
beautifulsoup4               4.12.2
Pillow                       10.0.0
requests                     2.31.0
yt-dlp                       2023.7.6
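
When the downloader runs from a different environment than the interactive shell (a cron job, for example), it can help to read the same four versions from inside the interpreter that actually runs the script. A minimal sketch using only the standard library; the helper name `installed_versions` is made up for illustration, and the distribution names are taken from the `pip list` output above:

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as shown by `pip list` in this report.
PACKAGES = ["beautifulsoup4", "Pillow", "requests", "yt-dlp"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in the environment this interpreter is using.
            found[name] = None
    return found


versions = installed_versions(PACKAGES)
for name, ver in versions.items():
    print(f"{name:<28} {ver or 'not installed'}")
```

Running this with the same `python` that launches kemono-dl.py shows which beautifulsoup4 release is really being imported.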

At this point I started getting the parse errors, so I installed the suggested versions listed in requirements.txt:

$ pip install --force-reinstall beautifulsoup4==4.11.1
$ pip install --force-reinstall Pillow==9.1.0
$ pip install --force-reinstall requests==2.27.1
$ pip install --force-reinstall yt_dlp==2022.4.8

$ pip list
Package                      Version
---------------------------- --------------------
beautifulsoup4               4.11.1
Pillow                       9.1.0
requests                     2.27.1
yt-dlp                       2022.4.8

But unfortunately I still got the errors, so I reinstalled the version I had previously:

$ pip install --force-reinstall beautifulsoup4==4.8.2

And I'm no longer getting the error described below. It appears there's some incompatibility between newer versions of beautifulsoup4 and kemono-dl.

How To Reproduce

Update the requirements to the latest version and run any download command.

Error messages and tracebacks


main.py:337: MarkupResemblesLocatorWarning: The input looks more like a filename than markup. You may want to open this file and pass the filehandle into Beautiful Soup.
  content_soup = BeautifulSoup(post['content'], 'html.parser')

Additional comments

For what it's worth, the program appears to work fine if I update only the other three requirements (Pillow, requests, yt_dlp) and keep beautifulsoup4 at 4.8.2:

$ pip list
Package                      Version
---------------------------- --------------------
beautifulsoup4               4.8.2
Pillow                       10.0.0
requests                     2.31.0
yt-dlp                       2023.7.6

Ovear commented 11 months ago

Hi,

This is most likely just a warning triggered by the post content passed to BeautifulSoup in the line you quoted, and it won't affect the downloader's functionality.

Please provide a full command line that includes which post you want to download.

Deses commented 11 months ago

I didn't think the command was relevant because I'm getting the warning with every download.

I'm using this command:

python kemono-dl.py --cookies coomer.cookie --quiet --filename-pattern "{title} - {index}.{ext}" --dirname-pattern downloads/XXXXXXX --links https://coomer.party/XXXXXXX

Please replace the X with any content creator of your choice.

Ovear commented 11 months ago

Unfortunately, I could not reproduce this with the command you provided.

Have you tried downloading other artists? Posts by the same artist usually share the same content pattern, which may keep triggering this warning.

In any case, you can ignore this warning; it has no effect on the downloader.

$ pip list|grep -i beau
beautifulsoup4     4.11.1

Deses commented 11 months ago

Oh, I will ignore it alright!

import warnings
from bs4 import BeautifulSoup, MarkupResemblesLocatorWarning

warnings.filterwarnings(action='ignore', category=MarkupResemblesLocatorWarning)

I added that at the beginning of main.py.

While researching how to make BS4 shut up I found a lot of people also annoyed at their condescending warning messages. I found it funny. 😂

The reason for me to suppress the warning is that I wrote a little automation script to run in cron and I just didn't want the noise in my logs. :P
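
A narrower alternative to the global filter, as a rough sketch: `warnings.catch_warnings()` restores the previous filters on exit, so the suppression can be limited to the one parsing call instead of the whole process. To keep the example runnable without bs4 installed, `LocatorLikeWarning` and `noisy_parse` are made-up stand-ins; in main.py the category would be `MarkupResemblesLocatorWarning` and the wrapped call would be the `BeautifulSoup(post['content'], 'html.parser')` line. The helper name `ignoring` is also hypothetical.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def ignoring(category):
    """Ignore warnings of `category` inside the block; filters are restored on exit."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=category)
        yield


# Stand-in for bs4.MarkupResemblesLocatorWarning (assumption, for illustration only).
class LocatorLikeWarning(UserWarning):
    pass


def noisy_parse(text):
    # Stand-in for BeautifulSoup(text, 'html.parser'): warns, then returns a value.
    warnings.warn("The input looks more like a filename than markup.", LocatorLikeWarning)
    return text.strip()


# Record warnings to show that only the wrapped call is silenced.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with ignoring(LocatorLikeWarning):
        result = noisy_parse(" image.png ")
    suppressed_count = len(caught)    # nothing recorded inside the block
    noisy_parse(" image.png ")
    unsuppressed_count = len(caught)  # the same call outside the block is recorded
```

The trade-off is a little more code at the call site in exchange for leaving every other warning in the program visible in the cron logs.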