getnikola / nikola

A static website and blog generator
https://getnikola.com/
MIT License

pyrss2gen is abandoned #3768

Open dvzrv opened 3 months ago

dvzrv commented 3 months ago

Environment

Python Version: 3.11/3.12

Nikola Version: 8.3.0

Operating System: Arch Linux

Description:

Hi! :wave: I package this project for Arch Linux. While rebuilding for Python 3.12 I noticed that the pyrss2gen upstream is very abandoned and hasn't been touched since 2013.

Its current upstream website (http://dalkescientific.com/Python/PyRSS2Gen.html) doesn't provide TLS and, although the project is BSD-3-Clause licensed, the license file is not included in the sdist tarball.

On Arch Linux we have (for several reasons) decided to start building from upstream provided, auto-generated source tarballs or pinned git commits/tags instead.

As I don't deem pyrss2gen a future proof library, I'd be happy if you would choose another for use in nikola (so that I can remove the dependency from the repositories) or if you would consider adopting it.

Kwpolska commented 3 months ago

PyRSS2Gen is abandoned, but the other libraries don’t seem to look much better. We won’t implement RSS generation from scratch. If a new Python release manages to break it, we’ll vendor it and fix it (although it has so few dependencies that even the CPython deletionists are very unlikely to break it).

If anyone wants to contribute a migration to another maintained library (perhaps https://pypi.org/project/feedgenerator/ ?), we would be happy to merge it.

sjehuda commented 2 months ago

I use feedgenerator in many scripts I have.

Even though it lacks graphics and does not yet allow multiple enclosures it does a great job.

I would recommend it and it is easy to utilize.

The missing support for enclosures and graphics would be easy to correct.

I use it to generate Atom 1.0 feeds from HTML and I think it is one of the best.

May I offer assistance on this matter?

Kwpolska commented 2 months ago

@sjehuda Pull requests are always welcome.

sjehuda commented 2 months ago

> We won’t implement RSS generation from scratch.

I am still reading the code, including the file config.py (the configuration file is good), and I see that Nikola has its own code for generating Atom feeds (line 2448) using lxml.etree.Element.

I can use the feedgenerator library, or I can copy the Atom code from nikola.py and make an RSS generator from it. Here are the options; choose one and I will follow your lead.

Option 1: Make RSS from scratch

Copy the Atom code from module nikola.py and make an RSS from it.

Option 2: Make RSS from scratch and create a new module

Create a new module for Nikola (e.g. feeds.py) in which all feed creation would take place.

Option 3: Use library

Use feedgenerator for both Atom and RSS.

Option 4: Remove RSS and fix plugin Gallery

Remove RSS entirely in favour of Atom, and add Atom support to module gallery.

Option 5: Remove RSS, create a new module and fix plugin Gallery

Remove RSS entirely in favour of Atom, create a new module for Nikola (e.g. feeds.py) in which all feed creation would take place, and add Atom support to module gallery.

Of note:

Kwpolska commented 2 months ago

  1. Removing RSS support is unacceptable. Wikipedia seems to have RSS feeds, and it is supported by many feed readers (and I initially wrote RSS readers as it’s so ingrained).
  2. Using feedgenerator for Atom would be a good idea, as long as this doesn’t negatively affect the output feeds. (Minor changes that wouldn’t affect typical clients and that wouldn’t cause clients to consider all entries to be new are fine.)
  3. Adding Atom support to galleries (while keeping RSS feeds) is good too.
  4. A redesign that makes the feed generation into plugins and allows adding extra feed formats would be fine.
  5. Extra feed types and standards:

    a. ActivityPub: does it even work for static sites? What would be the benefit for implementing it? (If feeds are pluggable, this could be a plugin in getnikola/plugins)
    b. Gemini Feed: Nikola targets HTTPS/HTML only and support for anything Gemini will not be added. (And "Gemini Feed" is really ungoogleable, BTW.)
    c. OPML: this would be fine, but I think it would actually be a separate thing from feed generation code-wise?

sjehuda commented 2 months ago

> Using feedgenerator for Atom would be a good idea, as long as this doesn’t negatively affect the output feeds.

I did notice that Nikola does XML post-processing of its own after receiving the object produced by PyRSS2Gen, and that it adds a stylesheet element to the RSS which PyRSS2Gen does not provide. So I suppose any missing element Nikola wants can be added within Nikola itself, if required, although it would be preferable to collaborate with @getpelican and add the missing features to feedgenerator.
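
A minimal sketch of that kind of post-processing, for illustration only. It is not Nikola's actual code: it uses the standard library's `xml.dom.minidom`, the function name `add_stylesheet` and the path in `XSL_HREF` are hypothetical, and it assumes the stylesheet reference is written as an `xml-stylesheet` processing instruction (the usual XML form for this) placed before the root element.

```python
from xml.dom import minidom

# Hypothetical stylesheet path, used only for this illustration.
XSL_HREF = "/assets/xml/rss.xsl"


def add_stylesheet(rss_xml: str, href: str = XSL_HREF) -> str:
    """Return rss_xml with an xml-stylesheet instruction before the root element."""
    doc = minidom.parseString(rss_xml)
    instruction = doc.createProcessingInstruction(
        "xml-stylesheet", f'type="text/xsl" href="{href}"'
    )
    # Insert the instruction as a sibling placed just before <rss>.
    doc.insertBefore(instruction, doc.documentElement)
    return doc.toxml()


# Stand-in for whatever string the feed library produced.
sample = '<rss version="2.0"><channel><title>Demo</title></channel></rss>'
styled = add_stylesheet(sample)
```

Because this step works on the serialized feed, it would not matter which library generated the XML in the first place.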

> (Minor changes that wouldn’t affect typical clients and that wouldn’t cause clients to consider all entries to be new are fine.)

I do not know how every client checks for entries it has already seen. Links and IDs are probably the elements most often used for this check, and they are what I use in an XMPP chatbot I have made.
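
One way to check this during a migration, sketched under the assumption that clients identify an RSS item by its `<guid>` and fall back to its `<link>`. Real clients may use other rules. The helper names `item_keys` and `changed_keys` and the sample feeds are hypothetical.

```python
import xml.etree.ElementTree as ET


def item_keys(rss_xml: str) -> set:
    """Collect one identity per RSS item: its <guid>, else its <link>."""
    root = ET.fromstring(rss_xml)
    keys = set()
    for item in root.iter("item"):
        key = item.findtext("guid") or item.findtext("link")
        if key:
            keys.add(key.strip())
    return keys


def changed_keys(old_xml: str, new_xml: str) -> tuple:
    """Return (missing, added): keys only in the old feed, keys only in the new."""
    old, new = item_keys(old_xml), item_keys(new_xml)
    return old - new, new - old


# Hypothetical outputs of the old and the new generator for the same posts.
old_feed = (
    "<rss><channel>"
    "<item><guid>https://example.com/a</guid></item>"
    "<item><link>https://example.com/b</link></item>"
    "</channel></rss>"
)
new_feed = (
    "<rss><channel>"
    "<item><guid>https://example.com/a</guid></item>"
    "<item><link>https://example.com/b/</link></item>"
    "</channel></rss>"
)
missing, added = changed_keys(old_feed, new_feed)
```

If both sets are empty, a client that matches on these fields should not treat existing entries as new. In the sample, the trailing slash on the second link is enough to make that entry look new.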

> Adding Atom support to galleries (while keeping RSS feeds) is good too.

I might take on that task later, especially with regard to media objects as enclosures and HTML.

> A redesign that makes the feed generation into plugins and allows adding extra feed formats would be fine.

Unless I am misunderstanding, the same comment about collaborating with feedgenerator applies here.

> a. ActivityPub: does it even work for static sites? What would be the benefit for implementing it? (If feeds are pluggable, this could be a plugin in getnikola/plugins)

I am not sure what ActivityPub is, as I did not invest time to investigate it yet.

GNU Social (formerly StatusNet), a similar platform which, I think, also communicates using ActivityPub, provides Atom, RDF, RSS and a file type called ActivityStream, which I decided to support in a userscript I have made. That is to say, if it is structured and I can read it, then I will read it, even if someone tells me it was not meant to be read.

> b. Gemini Feed: Nikola targets HTTPS/HTML only and support for anything Gemini will not be added. (And "Gemini Feed" is really ungoogleable, BTW.)

I am new to Gemini and Gopher and they seem fun, so I figured adding Gemini support could be a future consideration.

> c. OPML: this would be fine, but I think it would actually be a separate thing from feed generation code-wise?

I am a lawyer, not an expert software engineer, so I mostly follow the advice of experts, which would be you.

What I really do is break down walls. Even though I try to do the best I possibly can, I expect to receive criticism and suggestions for improving the work I do and the ideas I provide.

sjehuda commented 2 months ago

> Using feedgenerator for Atom would be a good idea, as long as this doesn’t negatively affect the output feeds.

This is the first time I am generating RSS; until today I had only parsed it.

However, I noticed that Nikola uses rss_obj.rss_attrs of PyRSS2Gen to add custom attributes.

Looking at feedgenerator, I see that the class RssFeed(SyndicationFeed) has a method called rss_attributes, which appears to be the equivalent of rss_obj.rss_attrs, so I do not think we would suffer any undesired change to feeds.

    def rss_attributes(self):
        return {'version': self._version}

And Atom is pretty much constant, so I do not expect any issue.
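
A small sketch of how such a method could be overridden to add custom attributes. `RssFeedStandIn` is a hypothetical stand-in that only mirrors the method quoted above, so that the example runs without any third-party package. It is not the library's class, and the namespaces in `extra_attrs` are illustrative assumptions, not a statement of what Nikola needs.

```python
class RssFeedStandIn:
    """Hypothetical stand-in mirroring the quoted rss_attributes method."""

    _version = "2.0"

    def rss_attributes(self):
        return {"version": self._version}


class NamespacedRssFeed(RssFeedStandIn):
    """Adds extra attributes to the root <rss> element via an override."""

    # Example namespaces only; the real set would come from Nikola's needs.
    extra_attrs = {
        "xmlns:atom": "http://www.w3.org/2005/Atom",
        "xmlns:dc": "http://purl.org/dc/elements/1.1/",
    }

    def rss_attributes(self):
        attrs = super().rss_attributes()  # keep the base "version" attribute
        attrs.update(self.extra_attrs)
        return attrs


rss_attrs = NamespacedRssFeed().rss_attributes()
```

The same pattern would presumably apply to a subclass of the real feed class, provided its rss_attributes method behaves as quoted.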

sjehuda commented 2 months ago

I do not want to do anything that is not in accord with your plan.

I am waiting for your response.

I am eager to take this task.

Kwpolska commented 2 months ago

What questions need a response?

sjehuda commented 2 months ago

On Mon, 29 Apr 2024 11:18:37 -0700 Chris Warrick @.***> wrote:

> What questions need a response?

1) Is this statement (part of a previous comment) correct?

> I did notice that Nikola makes XML processing of its own...

That is to say, did I understand the code correctly?

2) Concerning your comment:

> A redesign that makes the feed generation into plugins and allows adding extra feed formats would be fine.

Would you prefer to have syndication feeds as plugins?

Kwpolska commented 2 months ago

  1. Yes, there is some post-processing on feeds. It’s fine to keep it this way.
  2. If it is viable to make them into plugins, they can be plugins. But they can also stay the way they are now, as parts of the core.

sjehuda commented 2 months ago

Great.

I will start with feedgenerator and compare the resulting outputs.

Thank you