owncloud-archive / news

:newspaper: News app for ownCloud
GNU Affero General Public License v3.0

Rendering of rss feed from http://export.arxiv.org/rss/cs is wrong. #287

Closed: arm2arm closed this issue 11 years ago

arm2arm commented 11 years ago

The author names are not rendered correctly in the main view. How to reproduce:

1) Install OC 5.0.9
2) Enable the News app
3) Enable "App Framework"
4) Add a new feed address: http://export.arxiv.org/rss/cs

BUG: The author names are shown as plain text with the raw HTML markup visible; they should be parsed as HTML. (Screenshot: snapshot1)

BernhardPosselt commented 11 years ago

In that case the server should remove all HTML tags from the title and author.

How to do this

Regex which matches HTML tags:

</?[^>]*>

Override the setters for author and title and apply the regex in https://github.com/owncloud/news/blob/master/db/item.php#L37

Don't forget to also call the parent method with the sanitized author and title

like:

public function setAuthor($author) {
    $sanitizedAuthor = ... // remove all html via regex on this line
    parent::setAuthor($sanitizedAuthor);
}
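
For illustration, the placeholder above could be filled in with preg_replace and the regex from this comment. This is only a minimal sketch: BaseItem is a hypothetical stand-in for the real parent class (which is not reproduced here), and the sample author string is invented.

```php
<?php
// Hypothetical stand-in for the parent class; the real entity base class
// from the app is assumed to offer an equivalent setter.
class BaseItem {
    protected $author;

    public function setAuthor($author) {
        $this->author = $author;
    }

    public function getAuthor() {
        return $this->author;
    }
}

class Item extends BaseItem {
    public function setAuthor($author) {
        // Remove everything that looks like an HTML tag. "~" is used as the
        // delimiter so the "/" in the pattern needs no escaping.
        $sanitizedAuthor = preg_replace('~</?[^>]*>~', '', $author);
        parent::setAuthor($sanitizedAuthor);
    }
}

// Invented sample input, similar in shape to a linked author name.
$item = new Item();
$item->setAuthor('<a href="http://example.org/jane">Jane Doe</a>');
echo $item->getAuthor(), "\n"; // prints "Jane Doe"
```

The same override would be repeated for the title setter.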

cosenal commented 11 years ago

@Raydiation I am not sure what result you want then. You don't want to tamper with the content, do you?

BernhardPosselt commented 11 years ago

I just want to remove HTML from the author and title. We already partly remove HTML from the title via JS, but because we offer an API we should fix this on the server side.

BernhardPosselt commented 11 years ago

@zimba12 please also fix #151

bantu commented 11 years ago

This sounds just as bad as undoing the escaping twice. Please just follow whatever the standards say, otherwise you are causing more problems than you are solving.

GitHub does similar things in these comment fields and it's an incredibly stupid idea. You simply cannot leave plain-text comments such as "<a> missing" any more without wrapping them in inline code or the like.

BernhardPosselt commented 11 years ago

Here's something of value: http://at2.php.net/strip_tags (no regex needed)

BernhardPosselt commented 11 years ago

Here's one complete method:

public function setAuthor($author) {
    parent::setAuthor(strip_tags($author));
}
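
Since the title is meant to be sanitized as well, the same pattern could be applied to both setters. The sketch below assumes a hypothetical BaseItem parent in place of the real one and uses invented sample strings. The last call shows the side effect raised earlier in the thread: text that merely mentions a tag loses it too.

```php
<?php
// Hypothetical stand-in for the parent class, assumed to provide plain
// setters and getters for author and title.
class BaseItem {
    protected $author;
    protected $title;

    public function setAuthor($author) { $this->author = $author; }
    public function getAuthor() { return $this->author; }
    public function setTitle($title) { $this->title = $title; }
    public function getTitle() { return $this->title; }
}

class Item extends BaseItem {
    public function setAuthor($author) {
        parent::setAuthor(strip_tags($author));
    }

    public function setTitle($title) {
        parent::setTitle(strip_tags($title));
    }
}

$item = new Item();

// Markup inside a title is reduced to its text content.
$item->setTitle('Learning <i>sparse</i> codes');
$cleanTitle = $item->getTitle();
echo $cleanTitle, "\n"; // prints "Learning sparse codes"

// Caveat: a title that only talks about a tag is altered as well.
$item->setTitle('Why the <a> element is missing');
$alteredTitle = $item->getTitle();
echo $alteredTitle, "\n"; // prints "Why the  element is missing"
```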

bantu commented 11 years ago

For the record: The root cause of this problem is that the RSS specification does not say whether the content we receive is plaintext or HTML. See for example http://www.sixapart.com/blog/2003/06/why_we_need_ech.html
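
To make the ambiguity concrete, here is a small sketch of the two readings of one and the same value. The sample string is invented, and neither branch is claimed to be what the app does; it only shows that the reader has to pick an interpretation the feed does not declare.

```php
<?php
// Invented sample: the value as a feed parser might hand it over after
// XML unescaping.
$raw = '<a href="http://example.org/jane">Jane Doe</a>';

// Reading 1: the value is plain text, so escape it before display and the
// angle brackets show up literally.
$asPlainText = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// Reading 2: the value is HTML, so reduce it to its text content.
$asHtml = html_entity_decode(strip_tags($raw), ENT_QUOTES, 'UTF-8');

echo $asPlainText, "\n"; // &lt;a href=&quot;http://example.org/jane&quot;&gt;Jane Doe&lt;/a&gt;
echo $asHtml, "\n";      // Jane Doe
```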