Closed brokkr closed 7 years ago
Alternatively, we focus on fixing 301s and simply inform the user that we have updated their poca.xml to reflect the new state of things.
From what I can tell, when feedparser reports a 301, the result contains the relocated url under the 'href' key.
Example:
Original: http://www.bbc.co.uk/programmes/p0299wgd/episodes/downloads.rss
Redirect: https://podcasts.files.bbci.co.uk/p0299wgd.rss
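A minimal sketch of how the relocated url might be picked up. It assumes only that the parsed result exposes `status` and `href` as described above. `relocated_url` is a hypothetical helper name, and a `SimpleNamespace` stands in for a real feedparser result so the snippet runs without network access.

```python
from types import SimpleNamespace


def relocated_url(doc, original_url):
    """Return the new url if the feed was permanently moved, else None."""
    # A result without any status (e.g. no response at all) is treated as 0.
    status = getattr(doc, 'status', 0)
    new_url = getattr(doc, 'href', None)
    if status == 301 and new_url and new_url != original_url:
        return new_url
    return None


# Stand-in for a feedparser result, using the BBC example above.
old = 'http://www.bbc.co.uk/programmes/p0299wgd/episodes/downloads.rss'
doc = SimpleNamespace(status=301,
                      href='https://podcasts.files.bbci.co.uk/p0299wgd.rss')
print(relocated_url(doc, old))
```

With a real feed, `doc` would be whatever `feedparser.parse(url)` returns.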
In general we probably want to use status codes as our guide to what to do. The 'bozo' bit is not really that relevant, as it's about being fussy with XML niceties rather than about actual problems.
Here's a shortlist of relevant codes:
Of course, these are all the server's possible issues with the client. The other side of the coin is the client taking issue with what the server is presenting:
It is worth noting that most of these don't do any harm, seeing as all issues from 304 onwards result in an empty doc.entries. So in those cases we will just stick to what we already know and have.
I think we can divide the issues into the following categories according to what action needs to be taken. We save the status code as self.status to keep it in mind. If there is no status, we set it to 0:
self.status = getattr(doc, 'status', 0)
| | Process | Don't Process |
|---|---|---|
| Report | 301. Processed like a 200, but after processing we call a method to notify the user of the new url. We may be able to write the new url ourselves if we can get a lock on the conf file using lxml? | 400s and no status. As with 304, we set a flag to skip processing, but unlike 304 the user needs to know that there are issues. |
| Don't Report | 200, with or without bozo. Everything is (sorta) hunky dory and we start processing. There is not much point in examining bozo, as our best course of action in any case is to process the entries we do get. If they're badly misshapen, entryinfo should discard them. | 304. Skip this one. |
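The table could be expressed as a small lookup along these lines. This is a sketch only: `classify` is a hypothetical name, and the fallback for codes the table does not mention (302 or the 500s, for example) is an assumption, chosen because an empty doc.entries makes processing harmless.

```python
def classify(status):
    """Map a status code to a (process, report) pair, following the table."""
    if status == 301:
        # Moved permanently: process the entries and tell the user.
        return (True, True)
    if status == 304:
        # Not modified: nothing to do and nothing to say.
        return (False, False)
    if status == 0 or 400 <= status < 500:
        # Client errors, or no status at all: skip, but report the issue.
        return (False, True)
    # 200 (with or without bozo). Assumption: any other code is treated
    # the same way, since the table does not cover it.
    return (True, False)


for code in (200, 301, 304, 404, 0):
    process, report = classify(code)
    print(code, 'process' if process else 'skip',
          'report' if report else 'silent')
```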
We're still missing the reporting bit but the next commit will fully implement handling for all major status messages, including 301s which will simply update the config file's url to the new one.
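As a rough illustration of the config update, the sketch below swaps an old feed url for a new one in an XML string. Several things here are assumptions rather than poca's actual behaviour: the layout of poca.xml (a `<url>` element per subscription) is hypothetical, the standard library's `xml.etree.ElementTree` is used in place of lxml to keep the example dependency-free, and file locking is not addressed at all.

```python
import xml.etree.ElementTree as ET


def update_feed_url(xml_text, old_url, new_url):
    """Return (updated_xml_text, changed) with old_url replaced by new_url."""
    root = ET.fromstring(xml_text)
    changed = False
    # Hypothetical layout: each subscription stores its address in <url>.
    for url_el in root.iter('url'):
        if url_el.text and url_el.text.strip() == old_url:
            url_el.text = new_url
            changed = True
    return ET.tostring(root, encoding='unicode'), changed


# Hypothetical config fragment, using the BBC example from earlier.
sample = (
    '<poca><subscriptions><subscription>'
    '<title>example</title>'
    '<url>http://www.bbc.co.uk/programmes/p0299wgd/episodes/downloads.rss</url>'
    '</subscription></subscriptions></poca>'
)
updated, changed = update_feed_url(
    sample,
    'http://www.bbc.co.uk/programmes/p0299wgd/episodes/downloads.rss',
    'https://podcasts.files.bbci.co.uk/p0299wgd.rss',
)
print(changed)
```

Returning a `changed` flag makes it easy to notify the user only when the file was actually rewritten.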
What's missing from 301 treatment:
Looking over the feedparser response statuses, there are a lot of
But there are also a fair few
This ought to be conveyed to the user so that feeds can be updated.