Closed: AlexanderMatveev closed this issue 7 years ago
Any response please?
Well, I guess you cannot fix the feed yourself, so we would need to patch rss-atom-bundle to return a relevant date. We can't do that, because we have only a few clues about when the items were actually published.
It's possible to set the feed's Last-Modified HTTP header value, so if you think that's a good solution, could you please submit a pull request along those lines?
Regards,
Alex
@alexdebril Thanks for the response. But I can't fix the feeds I'm parsing, because they are third-party feeds, and many feedburner.com feeds don't have this tag at all. So after updating feeds using rss-atom-bundle, old items are received as new items. Are there any plans to solve this issue?
Then the only solution is to rely on the public id to filter out items you already have in your database. The Filter system is built for that; take a look at this interface: https://github.com/alexdebril/rss-atom-bundle/blob/master/Protocol/FilterInterface.php . Once you have created your Filter class (PublicIdFilter, for instance), you pass it as a parameter of getFilteredContent: https://github.com/alexdebril/rss-atom-bundle/blob/master/Protocol/FeedReader.php#L135
I hope that helps!
Alex
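To illustrate the idea of filtering on the public id, here is a minimal standalone sketch. It does not implement the bundle's FilterInterface (whose exact method signature is not shown in this thread); the function name filterNewItems and the array shape with a 'publicId' key are assumptions made for the example.

```php
<?php
// Hypothetical helper, not part of rss-atom-bundle: keep only the items
// whose public id is not already known (e.g. already stored in a database).
function filterNewItems(array $items, array $knownIds): array
{
    // Flip the list so that lookups by id are a cheap isset() check.
    $seen = array_flip($knownIds);
    $fresh = [];

    foreach ($items as $item) {
        $id = $item['publicId'] ?? null;

        if ($id === null) {
            // Without an id the item cannot be compared, so this sketch
            // passes it through unchanged.
            $fresh[] = $item;
            continue;
        }

        if (isset($seen[$id])) {
            continue; // already stored, or a duplicate within this batch
        }

        $seen[$id] = true;
        $fresh[] = $item;
    }

    return $fresh;
}
```

In a real integration the same check would presumably live inside a class implementing the bundle's filter interface and be passed to getFilteredContent, as described above.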
@alexdebril It's funny, but many feeds don't have a public ID field =D Is it possible for $item->getUpdated() to return null instead of the current DateTime when the item has no pubDate tag? What is the logic behind returning the current DateTime?
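The behaviour requested here could look roughly like the sketch below: a missing or unparseable pubDate yields null rather than the current time, so callers can tell "no date" apart from "just published". This is only an illustration of the request, not the bundle's code; parsePubDate is a hypothetical name, and the sketch assumes dates in the RFC 822 style used by RSS 2.0.

```php
<?php
// Hypothetical helper, not part of rss-atom-bundle: turn the raw text of a
// <pubDate> element into a date, or null when the element is missing,
// empty, or not in the expected format.
function parsePubDate(?string $raw): ?DateTimeImmutable
{
    if ($raw === null || trim($raw) === '') {
        return null; // no pubDate in the item
    }

    // DATE_RSS is PHP's built-in format string "D, d M Y H:i:s O".
    $date = DateTimeImmutable::createFromFormat(DATE_RSS, trim($raw));

    return $date === false ? null : $date;
}
```

With a nullable result, the caller decides what to do with undated items instead of silently treating them as new.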
@alexdebril The bundle doesn't parse a valid feed (see https://validator.w3.org/feed/check.cgi?url=www.sovsport.ru%2Fnews_rss). I have moved to https://github.com/eko/FeedBundle.
curl -I http://www.sovsport.ru/news_rss
HTTP/1.1 200
Server: nginx/1.4.7
Date: Thu, 11 Jun 2015 21:12:56 GMT
Content-Type: application/xml; charset=windows-1251
Connection: keep-alive
Usually the HTTP status line ends with a message, like:
curl -I http://php.net/feed.atom
HTTP/1.1 200 OK
Server: nginx/1.6.2
Date: Thu, 11 Jun 2015 21:15:01 GMT
Content-Type: application/atom+xml
Content-Length: 299632
Connection: keep-alive
Last-Modified: Thu, 11 Jun 2015 21:00:11 GMT
ETag: "49270-5184446f72cc0"
Accept-Ranges: bytes
(which is successfully parsed btw)
The cause is here: https://github.com/alexdebril/rss-atom-bundle/blob/master/Driver/HttpCurlDriver.php#L63
The regexp expects that message to be present, which is usually the case (I had never seen HTTP/1.1 200 without OK before).
Try with https://github.com/alexdebril/rss-atom-bundle/tree/issue-68: it works. The only difference is https://github.com/alexdebril/rss-atom-bundle/commit/9033ac5d728a0322b57c2d78bc848b7b800014ec
I'll fix that bug in the next release.
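For readers who hit the same problem, the following sketch shows one way to match a status line whether or not a message follows the status code. It is not the regexp from HttpCurlDriver.php or from the linked commit; the function name parseStatusLine and the returned array shape are assumptions made for illustration.

```php
<?php
// Hypothetical helper, not the bundle's code: parse an HTTP status line,
// treating the trailing message ("OK", "Not Found", ...) as optional.
function parseStatusLine(string $line): ?array
{
    // Version, three-digit status code, then an optional message.
    $pattern = '/^HTTP\/(\d+(?:\.\d+)?)\s+(\d{3})(?:\s+(.*))?$/';

    if (preg_match($pattern, trim($line), $m) !== 1) {
        return null; // not a status line
    }

    return [
        'version' => $m[1],
        'code'    => (int) $m[2],
        // preg_match omits trailing groups that did not take part in the match.
        'reason'  => $m[3] ?? '',
    ];
}
```

Both "HTTP/1.1 200" from the sovsport.ru response and "HTTP/1.1 200 OK" from the php.net response are accepted by this pattern.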
Thank you @alexdebril
Guzzle solved this, so I'm closing the issue.
$item->getUpdated() returns the current DateTime if the date is not present in the RSS. For example, see http://feeds.feedburner.com/esquire-ru?format=xml