save_stories_from_feed performance improvement

I was reading the code for save_stories_from_feed in tasks.py, and it looks like it makes one database call per feed entry to check for duplicates.

normalized_url_exists could be replaced by a single call to the database to check all feed entries at once.

There could be a function called getValidFeedEntries that applies the logic already in save_stories_from_feed for skipping invalid entries.

Then a single database call could identify which entries are duplicates, followed by a bulk insert and a single commit.
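
To make the idea concrete, here is a rough sketch of the flow I have in mind. It is not the project's actual code: it uses the standard-library sqlite3 module so that it runs on its own, and the stories table, its columns, the entry dict shape, and the helpers get_valid_feed_entries, normalize_url and save_new_stories are all hypothetical stand-ins for whatever tasks.py really uses.

```python
import sqlite3


def get_valid_feed_entries(entries):
    """Hypothetical filter: keep only entries with a non-empty URL.

    The real skip logic in save_stories_from_feed would go here.
    """
    return [e for e in entries if e.get("url")]


def normalize_url(url):
    """Placeholder normalization (an assumption, not the project's rules)."""
    return url.strip().lower().rstrip("/")


def save_new_stories(conn, feed_id, entries):
    """Insert entries that are not already stored; return how many were added."""
    # Key by normalized URL so duplicates inside the same feed collapse too.
    by_url = {}
    for entry in get_valid_feed_entries(entries):
        by_url.setdefault(normalize_url(entry["url"]), entry)
    if not by_url:
        return 0

    # One query for the whole batch instead of one query per entry.
    # Very large batches may need chunking to stay under the database's
    # bound-parameter limit.
    urls = list(by_url)
    placeholders = ",".join("?" * len(urls))
    rows = conn.execute(
        f"SELECT normalized_url FROM stories WHERE normalized_url IN ({placeholders})",
        urls,
    ).fetchall()
    existing = {row[0] for row in rows}

    # Bulk insert the remaining entries, then commit once.
    new_rows = [
        (feed_id, entry["url"], norm)
        for norm, entry in by_url.items()
        if norm not in existing
    ]
    conn.executemany(
        "INSERT INTO stories (feed_id, url, normalized_url) VALUES (?, ?, ?)",
        new_rows,
    )
    conn.commit()
    return len(new_rows)
```

With this shape, a feed with N entries costs one SELECT and one batched INSERT rather than N existence checks. The same pattern should carry over to whichever database layer the project actually uses.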

If this sounds reasonable, I can give it a try. Does this look likely to become the eventual bottleneck of this implementation?

mediacloud / rss-fetcher