Owyn / CSS2RSS

Scraper script for RSSGuard to make an RSS feed for any website using CSS

MangaPark #10

Open hercyle opened 1 month ago

hercyle commented 1 month ago

How do I get RSS feeds from MangaPark? https://mangapark.net/title/10953-en-one-piece

Owyn commented 1 month ago

python css2rss.py "div.scrollable-panel a.visited\:text-accent" "!One Piece"
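The backslash in that selector matters because the class name itself contains a colon (`visited:text-accent`), and an unescaped colon would start a pseudo-class. A minimal sketch of what the selector matches, assuming the `beautifulsoup4` package is installed; the HTML snippet and its `href` values are hypothetical stand-ins, not the real MangaPark markup:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page may differ.
HTML = """
<div class="scrollable-panel">
  <a class="link-hover visited:text-accent" href="/chapter-1">Chapter 1</a>
  <a class="link-hover" href="/about">About</a>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# The raw string keeps the backslash, so the selector engine reads "\:" as a
# literal colon inside the class name instead of the start of a pseudo-class.
selector = r"div.scrollable-panel a.visited\:text-accent"
chapter_links = [a.get_text() for a in soup.select(selector)]
print(chapter_links)
```

Only the link carrying the `visited:text-accent` class is selected; the second link is skipped.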

hercyle commented 1 month ago

I couldn't fetch the metadata or the chapters after giving it a title and saving it. Here is a messy traceback log. Also, I'm curious about all your manga/hwa/hua sources; it would save me a lot of time if I had them.

time="    34.968" type="debug" -> core: We will process feed data with post-process script 'python css2rss.py "div.scrollable-panel a.visited\:text-accent" "!One Piece"'.
time="    36.478" type="critical" -> core: Post-processing script for feed file failed: 'script threw an error: '
Traceback (most recent call last):
  File "/home/user/.config/RSS Guard 4/css2rss.py", line 146, in <module>
    found_items = soup.select(sys.argv[1])
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/bs4/element.py", line 2116, in select
    return self.css.select(selector, namespaces, limit, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/bs4/css.py", line 162, in select
    self.api.select(
  File "/usr/lib/python3.12/site-packages/soupsieve/__init__.py", line 147, in select
    return compile(select, namespaces, flags, **kwargs).select(tag, limit)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/soupsieve/__init__.py", line 65, in compile
    return cp._cached_css_compile(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/soupsieve/css_parser.py", line 210, in _cached_css_compile
    ).process_selectors(),
      ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/soupsieve/css_parser.py", line 1138, in process_selectors
    return self.parse_selectors(self.selector_iter(self.pattern), index, flags)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/soupsieve/css_parser.py", line 982, in parse_selectors
    has_selector, is_html = self.parse_pseudo_class(sel, m, has_selector, iselector, is_html)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/soupsieve/css_parser.py", line 658, in parse_pseudo_class
    raise SelectorSyntaxError(
soupsieve.util.SelectorSyntaxError: ':text-accent' was detected as a pseudo-class and is either unsupported or invalid. If the syntax was not intended to be recognized as a pseudo-class, please escape the colon.
  line 1:
div.scrollable-panel a.visited:text-accent
                              ^''.
time="    36.482" type="critical" -> network: Error when fetching feed: 'Feed::Status::OtherError' message: 'script threw an error: '
Traceback (most recent call last):
  File "/home/user/.config/RSS Guard 4/css2rss.py", line 146, in <module>
    found_items = soup.select(sys.argv[1])
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/bs4/element.py", line 2116, in select
    return self.css.select(selector, namespaces, limit, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/bs4/css.py", line 162, in select
    self.api.select(
  File "/usr/lib/python3.12/site-packages/soupsieve/__init__.py", line 147, in select
    return compile(select, namespaces, flags, **kwargs).select(tag, limit)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/soupsieve/__init__.py", line 65, in compile
    return cp._cached_css_compile(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/soupsieve/css_parser.py", line 210, in _cached_css_compile
    ).process_selectors(),
      ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/soupsieve/css_parser.py", line 1138, in process_selectors
    return self.parse_selectors(self.selector_iter(self.pattern), index, flags)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/soupsieve/css_parser.py", line 982, in parse_selectors
    has_selector, is_html = self.parse_pseudo_class(sel, m, has_selector, iselector, is_html)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/soupsieve/css_parser.py", line 658, in parse_pseudo_class
    raise SelectorSyntaxError(
soupsieve.util.SelectorSyntaxError: ':text-accent' was detected as a pseudo-class and is either unsupported or invalid. If the syntax was not intended to be recognized as a pseudo-class, please escape the colon.
  line 1:
div.scrollable-panel a.visited:text-accent
                              ^''.

Owyn commented 1 month ago

You aren't supposed to give it a title or dig through any logs. Make sure you have installed it correctly, then just click "fetch metadata" after entering the URL and script correctly.

hercyle commented 1 month ago

The issue is with the latest RSS Guard version:

Version: 4.7.4 (built on Linux/x86_64)
Revision: 68c322710-lite
Build date: 9/26/24 7:52 PM
OS: Arch Linux
Qt: 6.7.3 (compiled against 6.7.2)

and it works with the downgraded rssguard-lite:

Version: 4.3.0 (built on Linux/x86_64)
Revision: -nowebengine
Build date: 1/21/23 7:34 PM
Qt: 6.7.3 (compiled against 6.4.2)

Owyn commented 1 month ago

Well, I'm not a developer of RSS Guard (@martinrotter is), so you'll have to create an issue there: https://github.com/martinrotter/rssguard/issues. As far as I know, it should work the same way on all versions.

It seems the backslash escaping is getting lost (?) in the new RSS Guard version:

SelectorSyntaxError( soupsieve.util.SelectorSyntaxError: ':text-accent' was detected as a pseudo-class and is either unsupported or invalid. If the syntax was not intended to be recognized as a pseudo-class, please escape the colon.

Owyn commented 1 month ago

"also, im curious about all your manga/hwa/hua sources, it'll save me a lot of time if i had them."

Here's an RSS export you can import into RSS Guard (after extracting): rssguard_feeds_2024-10-05.zip