lkiesow / python-feedgen

Python module to generate ATOM feeds, RSS feeds and Podcasts.
https://feedgen.kiesow.be/
BSD 2-Clause "Simplified" License

Unable to set categories #99

Closed: steinarb closed this issue 11 months ago

steinarb commented 4 years ago

I'm working on a script that reverses RSS feeds: https://github.com/steinarb/feedreverser

I'm trying to reverse a WordPress feed so that I can use feediverse to post the feed entries in chronological order.

I parse the WordPress feed with feedparser 5.2.1, and I'm using python-feedgen 0.9.0 to output the reversed feed.

Categories don't survive in the feed reversal.

This is what I do to set the categories: https://github.com/steinarb/feedreverser/blob/28f53647bf78ee7015fdbcee52dd0bb8c437d496/feedreverser.py#L26

I've printed out the categories and they look OK to me.

When reading an RSS feed, the categories from feedparser look like this:

[{'term': 'Emacs', 'scheme': None, 'label': None}, {'term': 'editor', 'scheme': None, 'label': None}, {'term': 'emacs', 'scheme': None, 'label': None}, {'term': 'extension', 'scheme': None, 'label': None}, {'term': 'lisp', 'scheme': None, 'label': None}, {'term': 'programming', 'scheme': None, 'label': None}]

The resulting RSS entry produced by python-feedgen looks like this:

<item><title>Installing debian “squeeze” with PXE boot on a Samsung N145 Plus netbook</title><description>Introduction This article describes the steps necessary to install debian 6 &amp;#8220;squeeze&amp;#8221; on a Samsung N145 Plus netbook, with the following specification: Intel Atom processor 10.1&amp;#8243; display 1GB RAM 340GB HDD Windows 7 preinstalled Setting up netboot of the debian installer DHCP requests in my home LAN network is provided by dnsmasq on a desktop PC &amp;#8230; &lt;a href="https://steinar.bang.priv.no/2012/06/11/installing-debian-squeeze-with-pxe-boot-on-a-samsung-n145-plus-netbook/" class="more-link"&gt;Continue reading &lt;span class="screen-reader-text"&gt;Installing debian &amp;#8220;squeeze&amp;#8221; with PXE boot on a Samsung N145 Plus netbook&lt;/span&gt; &lt;span class="meta-nav"&gt;&amp;#8594;&lt;/span&gt;&lt;/a&gt;</description><guid isPermaLink="false">http://steinar.bang.priv.no/?p=63</guid><category/><category/><category/><category/><category/><category/><category/><category/><category/><category/><category/><pubDate>Sat, 13 Jun 2020 11:02:56 +0000</pubDate></item>

When reading an Atom feed, the categories from feedparser look like this:

[{'term': 'Emacs', 'scheme': 'https://steinar.bang.priv.no', 'label': None}, {'term': 'editor', 'scheme': 'https://steinar.bang.priv.no', 'label': None}, {'term': 'emacs', 'scheme': 'https://steinar.bang.priv.no', 'label': None}, {'term': 'extension', 'scheme': 'https://steinar.bang.priv.no', 'label': None}, {'term': 'lisp', 'scheme': 'https://steinar.bang.priv.no', 'label': None}, {'term': 'programming', 'scheme': 'https://steinar.bang.priv.no', 'label': None}]

The resulting RSS entry produced by python-feedgen looks like this:

<item><title>Emacs and lisp</title><description>Introduction The Emacs text editor uses lisp as an extension language. This article will attempt to explain enough lisp to do basic emacs customization, to someone who knows imperative programming languages. Evaluating lisp Lisp consists of balanced pairs of parantheses, filled with tokens, separated by space, eg. like this: (somefun1 1 2 3 "four") (somefun2 &amp;#8230; &lt;a href="https://steinar.bang.priv.no/2012/05/03/emacs-and-lisp/" class="more-link"&gt;Continue reading &lt;span class="screen-reader-text"&gt;Emacs and lisp&lt;/span&gt; &lt;span class="meta-nav"&gt;&amp;#8594;&lt;/span&gt;&lt;/a&gt;</description><guid isPermaLink="false">http://steinar.bang.priv.no/?p=23</guid><category domain="https://steinar.bang.priv.no"/><category domain="https://steinar.bang.priv.no"/><category domain="https://steinar.bang.priv.no"/><category domain="https://steinar.bang.priv.no"/><category domain="https://steinar.bang.priv.no"/><category domain="https://steinar.bang.priv.no"/><pubDate>Sat, 13 Jun 2020 11:20:42 +0000</pubDate></item>

steinarb commented 4 years ago

I found a workaround for this issue: setting only the term. https://github.com/steinarb/feedreverser/commit/e435625a7326ddd765844d1cf4e5a732b108d753

And the term is the only thing feediverse uses in the hashtags of the toots, so this is good enough for me.
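A minimal sketch of what a term-only workaround could look like, assuming the tags have the shape printed earlier in this issue. The helper name `term_only_categories` is hypothetical and is not taken from the linked commit; the sample data mirrors the feedparser output shown above.

```python
def term_only_categories(tags):
    """Return one {'term': ...} dict per tag, skipping tags without a term.

    Hypothetical helper for illustration; not the code from the linked commit.
    """
    categories = []
    for tag in tags:
        term = tag.get('term')
        if term:
            categories.append({'term': term})
    return categories


# Sample data shaped like the feedparser output shown in this issue.
tags = [
    {'term': 'Emacs', 'scheme': None, 'label': None},
    {'term': 'editor', 'scheme': None, 'label': None},
]

print(term_only_categories(tags))
# prints [{'term': 'Emacs'}, {'term': 'editor'}]

# Assumed usage with feedgen, where fe is a feedgen FeedEntry
# and entry comes from feedparser:
#   fe.category(term_only_categories(entry.tags))
```

This keeps only the category text, so the scheme and label are dropped from the reversed feed.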

However, losing the rest of the category (scheme and label) may be an issue for anyone who needs a faithful reversed RSS feed.

lkiesow commented 11 months ago

Since you can access the term via tag.term, it seems that entry.tags is not a list of dictionaries in your example, which is what feedgen expects. In any case, good to see that you found a solution.
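For anyone who wants to keep the scheme and label as well, here is a defensive sketch that converts each tag into a plain dictionary before handing it to feedgen. It accepts tags that expose their fields either as mapping keys or as attributes, and it drops fields whose value is None. The helper name `to_plain_categories` is hypothetical, and whether this restores the full category in the generated feed is an assumption that has not been confirmed in this thread.

```python
from collections.abc import Mapping
from types import SimpleNamespace

# The three category fields that appear in the feedparser output above.
CATEGORY_KEYS = ('term', 'scheme', 'label')


def to_plain_categories(tags):
    """Convert tags into plain dicts, omitting fields that are None.

    Hypothetical helper for illustration. Tags without a term are skipped.
    """
    result = []
    for tag in tags:
        plain = {}
        for key in CATEGORY_KEYS:
            if isinstance(tag, Mapping):
                value = tag.get(key)
            else:
                value = getattr(tag, key, None)
            if value is not None:
                plain[key] = value
        if 'term' in plain:
            result.append(plain)
    return result


# One dict-style tag and one attribute-style tag, for illustration only.
sample_tags = [
    {'term': 'Emacs', 'scheme': 'https://steinar.bang.priv.no', 'label': None},
    SimpleNamespace(term='lisp', scheme=None, label=None),
]

print(to_plain_categories(sample_tags))
# prints [{'term': 'Emacs', 'scheme': 'https://steinar.bang.priv.no'}, {'term': 'lisp'}]

# Assumed usage with feedgen, where fe is a feedgen FeedEntry:
#   fe.category(to_plain_categories(entry.tags))
```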