django / djangoproject.com

Source code to djangoproject.com
https://www.djangoproject.com/
BSD 3-Clause "New" or "Revised" License

Django Community: New/updated Django packages feed ingestion is broken #1299

Open jefftriplett opened 1 year ago

jefftriplett commented 1 year ago

The latest items from https://www.djangoproject.com/community/packages/ are from October 27, 2017.

The DjangoPackages.org RSS appears to be working, but I'm happy to follow up if something is broken on the website: https://djangopackages.org/feeds/packages/latest/rss/

I'm happy to help if someone can point me to where that should be :)

related to https://github.com/django/djangoproject.com/issues/1137

carltongibson commented 1 year ago

Hey @jefftriplett — yes! I think this is likely related to https://github.com/django/djangoproject.com/issues/1137.

Everything that follows is "I think":

The issue is that the superfeedr service that we've been using forever is now unmaintained and is slowly rotting.

We need to replace it with our own: feedparser mapping entries into the FeedItem model, in a management command that we can run every 6 hours (say).

This shouldn't be so ops-intensive that we can't do it ourselves. But if it is, I'm minded to spin up a mini-instance myself just so we can get past this.

The relevant model is in the aggregator app.
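
As a rough illustration of the feedparser-to-FeedItem mapping described above, here is a minimal sketch. It is not the project's implementation: `FeedItemData`, `entry_to_item`, and `fetch_items` are hypothetical names, and the field list (guid, title, link, summary, date_modified) is an assumption about what a FeedItem row would need. The mapping only relies on the standard keys that feedparser entries expose (`link`, `id`, `title`, `summary`, `updated_parsed`, `published_parsed`).

```python
import calendar
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class FeedItemData:
    """Hypothetical stand-in for the fields a FeedItem row would need."""

    guid: str
    title: str
    link: str
    summary: str
    date_modified: datetime


def entry_to_item(entry):
    """Map one parsed feed entry (dict-like) to FeedItemData, or None if unusable."""
    link = entry.get("link")
    if not link:
        # Without a link there is nothing useful to show on the community page.
        return None
    # feedparser exposes parsed dates as UTC time.struct_time values.
    parsed = entry.get("updated_parsed") or entry.get("published_parsed")
    if parsed is not None:
        modified = datetime.fromtimestamp(calendar.timegm(parsed), tz=timezone.utc)
    else:
        # Assumption: fall back to the ingestion time when the feed gives no date.
        modified = datetime.now(timezone.utc)
    return FeedItemData(
        guid=entry.get("id") or link,
        title=entry.get("title", ""),
        link=link,
        summary=entry.get("summary", ""),
        date_modified=modified,
    )


def fetch_items(feed_url):
    """Fetch one feed and map its entries. Needs the third-party feedparser package."""
    import feedparser  # imported lazily so entry_to_item is usable without it

    parsed = feedparser.parse(feed_url)
    return [item for item in map(entry_to_item, parsed.entries) if item is not None]
```

Inside a management command, each returned item could then be written with something like Django's `update_or_create` keyed on the guid, so that running the command on a schedule (every 6 hours, say) stays idempotent. That part is left out here because it depends on the actual model fields in the aggregator app.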

If you break ground here, I'm very happy to input as needed. Otherwise I have it down as a possible GSoC idea (but that's still some time away).

jefftriplett commented 1 year ago

@carltongibson circling back here. Do we have a dump of feed URLs by chance to test with? (I have admin access, but so-many-feeds)

jefftriplett commented 5 months ago

This project might get pulled into Jazzband and might be semi-related. https://github.com/brutasse/django-push

medmunds commented 2 months ago

Now the new/updated packages feed is showing mostly gambling site spam. The last actual Django package updates listed are from 2017.

Maybe just remove it until it can be repaired?

jefftriplett commented 2 months ago

cc @bmispelon since we were both looking into this on Friday. The Django Packages feed has newer links, but ingestion isn't picking them up. I thought we'd removed the other feed, the one that someone must have bought and squatted on.

bmispelon commented 2 months ago

I did remove some spam entries on Friday, and I don't see them anymore on the page at https://www.djangoproject.com/community/packages/

@medmunds Where do you see spam entries?

EDIT: Ah, never mind, I found them in the actual RSS feed. The cause seems to be that the FeedListView and the CommunityAggregatorFeed use different logic. I'll open a new issue for that.

medmunds commented 2 months ago

(Just for reference, what I am seeing is in the page at https://www.djangoproject.com/community/)

[Screenshot: django-community-projects-feed-spam]

jefftriplett commented 4 weeks ago

I see a new update from August 4th, 2024, which makes ~7 years between updates. I think there is a chance this is fixed, but I will leave this open until we verify it or see a few more trickle in. https://www.djangoproject.com/community/packages/