python / planet

Configuration for Python planets (e.g. http://planetpython.org)

Create a script to remove dead, deprecated or invalid links from planet #233

Open pydanny opened 7 years ago

pydanny commented 7 years ago
rochacbruno commented 7 years ago

I suggested creating a validator script to run monthly: https://github.com/python/planet/issues/124#issuecomment-195478030

That script would take each URL, check its HTTP status, and validate it with an RSS validator.
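
For illustration, the status-check half of such a script could look roughly like the sketch below. It uses only the standard library. The function name `check_url`, the 10-second timeout, and the `opener` parameter (there so the check can be exercised without network access) are assumptions for this example, not something agreed on in this thread.

```python
import urllib.error
import urllib.request


def check_url(url, timeout=10, opener=urllib.request.urlopen):
    """Return (ok, detail) for one URL. Hypothetical helper, not from the repo."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 410.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # DNS failure, refused connection, timeout, or a malformed URL.
        return False, f"unreachable: {exc}"
    # urlopen follows redirects, so this is the status of the final response.
    return 200 <= status < 300, f"HTTP {status}"
```

A monthly job could call this for every listed URL and report the ones where `ok` is false, leaving the actual removal to a human reviewer.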

rochacbruno commented 7 years ago

@pydanny I changed the title of your issue to address the automation of this task.

rochacbruno commented 7 years ago

Script should do

pydanny commented 7 years ago

Pysoy and the others I mentioned aren't in the general feeds. They are in the "python libraries" and "python planets" lists. Will these be covered by your script or is that just for RSS feeds?

As for your script, go ahead and remove my blog from your list already. I say that because it's going to fail your W3C validator check and I don't have the luxury of time to fix it. Do keep in mind that this aggregator is the ONLY one I know of that insists on W3C validation.

rochacbruno commented 7 years ago

@pydanny yeah, we must discuss whether using the W3C validator is a good choice or not. We only need to check that the feed doesn't break the planet, as happened in the past. Maybe we can write our own simple validator so we do not need to rely on the W3C one.
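
As a rough idea of what "our own simple validator" might mean, here is a sketch that only checks the feed is well-formed XML with an RSS 2.0 or Atom root element. What actually broke the planet in the past is not stated in this thread, so treating well-formedness as the criterion is an assumption, and `basic_feed_problems` is a made-up name. RSS 1.0 (RDF) feeds would be reported as unexpected by this version.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def basic_feed_problems(xml_text):
    """Return a list of problems found; an empty list means the feed passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Malformed XML is the most likely thing to trip up an aggregator.
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag == "rss":
        if root.find("channel") is None:
            problems.append("rss element has no channel")
    elif root.tag != ATOM_NS + "feed":
        problems.append(f"unexpected root element: {root.tag}")
    return problems
```

This is far more lenient than the W3C validator on purpose: it rejects only feeds that cannot be parsed at all or are clearly not feeds.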

tjguk commented 7 years ago

For my info -- what's the issue with the W3C validator? Is it too strict?

pydanny commented 7 years ago

Yes, it's too strict. No other aggregator I'm on blocks my RSS feed.

pybites commented 7 years ago

The cleanup and validation could be nice PyBites challenges: https://pybit.es/pages/challenges.html

What do you think?

pybites commented 6 years ago

We asked our community to give this a crack https://pybit.es/codechallenge49.html

mridubhatnagar commented 6 years ago

Hi

Can I be assigned this issue to work on? I don't really know the complexity of the issue, but I am curious to work on it.

Based on the above discussion, what I understand is:

Links that no longer work should be removed.

I am currently tied up with some other work, but I can start on this in about a month.

Can I please be assigned this? It looks interesting.

Thanks

mridubhatnagar commented 6 years ago

Also, are we validating an RSS feed with the W3C validator? Or do we consider an RSS feed valid if feedparser gives no error? Or do we rely on neither and instead write a validator of our own?
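
To make the feedparser option concrete: feedparser does not raise on bad input. Instead it sets a `bozo` flag on the result and stores the underlying error in `bozo_exception`. A sketch of a check built on that follows. The name `feedparser_verdict` and the `parse` parameter (which allows substituting a stand-in parser) are assumptions for this example. Note that `bozo` can also be set for recoverable problems such as an encoding mismatch, so this may still be stricter than needed.

```python
def feedparser_verdict(source, parse=None):
    """Return (ok, detail) based on feedparser's bozo flag. Hypothetical helper."""
    if parse is None:
        import feedparser  # third-party dependency: pip install feedparser

        parse = feedparser.parse
    result = parse(source)
    if result.get("bozo"):
        # feedparser records the error it hit instead of raising it.
        return False, repr(result.get("bozo_exception"))
    return True, f"{len(result.get('entries', []))} entries"
```

`source` can be a URL, a file path, or the feed text itself, since `feedparser.parse` accepts all three.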

rochacbruno commented 6 years ago

@mridubhatnagar the only problem with the W3C validator is that it considers podcast feeds invalid, so we need a podcast feed validator.

mridubhatnagar commented 6 years ago

@rochacbruno Fair enough. I would like to give it a shot.

Also, I think some RSS feeds are valid but still do not show up on the planet, like the @pybites live feed.

I am not sure whether feedparser works for podcast feeds, though.

I will code a podcast feed validator.
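
A possible starting point for that podcast feed validator is sketched below. The thread does not say what it should check, so this assumes the minimum: every RSS item carries an `enclosure` element with the `url`, `length`, and `type` attributes that RSS 2.0 defines for it. The function name `podcast_problems` is hypothetical.

```python
import xml.etree.ElementTree as ET


def podcast_problems(xml_text):
    """Return a list of problems found in a podcast-style RSS 2.0 feed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    items = root.findall("./channel/item")
    if not items:
        return ["no channel/item elements found"]

    problems = []
    for number, item in enumerate(items, start=1):
        enclosure = item.find("enclosure")
        if enclosure is None:
            problems.append(f"item {number}: no enclosure")
            continue
        # RSS 2.0 defines url, length and type as the enclosure attributes.
        for attr in ("url", "length", "type"):
            if not enclosure.get(attr):
                problems.append(f"item {number}: enclosure missing {attr}")
    return problems
```

Namespaced extensions such as iTunes tags are simply ignored here, which sidesteps the strictness complaint raised earlier in the thread.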