Closed Jochbart closed 2 years ago
can u post the rss url
thanks.. the data is encoded, but the module does not attempt to decode it
<item>
<title>
<![CDATA[ »Thomaidis Gewerbepark Hanau« fast fertig ]]>
</title>
<link>https://www.main-echo.de/7421848?utm_source=rssfeed&utm_medium=rss&utm_campaign=rss-aschaffenburg</link>
<dc:creator>
<![CDATA[ Pressestelle der Stadt Hanau ]]>
</dc:creator>
<pubDate>Thu, 18 Nov 2021 16:54:09 +0000</pubDate>
<description>
<![CDATA[ »Da­mit er­fährt der Ha­nau­er Ha­fen den größ­ten Ent­wick­lungs­schub seit lan­gem«, freut sich Ha­n­aus Ober­bür­ger­meis­ter Claus Ka­mins­ky über die An­sied­lung des in­ha­ber­ge­führ­ten Spe­di­ti­ons­un­ter­neh­mens Gebr. Tho­mai­dis GmbH auf dem ehe­ma­li­gen Ca­bot-Ge­län­de im Ha­nau­er Ha­fen. Der Im­mo­bi­lien­in­ves­tor und Pro­jekt­ent­wick­ler Al­pha In­du­s­trial schafft dort bis En­de 2021 den »Tho­mai­dis Ge­wer­be­park Ha­nau« auf dem 3,5 Hektar gro­ßen Areal in der Jo­sef-Bautz-Stra­ße, das er vom Main-Kin­zig-Kreis er­warb. ]]>
</description>
<content:encoded>
<![CDATA[ »Da­mit er­fährt der Ha­nau­er Ha­fen den größ­ten Ent­wick­lungs­schub seit lan­gem«, freut sich Ha­n­aus Ober­bür­ger­meis­ter Claus Ka­mins­ky über die An­sied­lung des in­ha­ber­ge­führ­ten Spe­di­ti­ons­un­ter­neh­mens Gebr. Tho­mai­dis GmbH auf dem ehe­ma­li­gen Ca­bot-Ge­län­de im Ha­nau­er Ha­fen. Der Im­mo­bi­lien­in­ves­tor und Pro­jekt­ent­wick­ler Al­pha In­du­s­trial schafft dort bis En­de 2021 den »Tho­mai­dis Ge­wer­be­park Ha­nau« auf dem 3,5 Hektar gro­ßen Areal in der Jo­sef-Bautz-Stra­ße, das er vom Main-Kin­zig-Kreis er­warb. ]]>
</content:encoded>
</item>
the spec says html-entities are allowed.. these seem language specific.
&shy; &auml; &ouml; &laquo; &uuml; &szlig;
but they display correctly here (5 of the 6) ä ö « ü ß
none of the decoders I can find handle those.
there is an approach it seems
create a textarea HTML element, set the encoded string as its HTML content, then read back the text content: https://stackoverflow.com/questions/7394748/whats-the-right-way-to-decode-a-string-that-has-special-html-entities-in-it/7394787
function decodeHtml(html) {
var txt = document.createElement("textarea");
txt.innerHTML = html;
return txt.value;
}
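The textarea trick above needs a browser DOM. As a rough sketch of the same idea for code that runs without one, the snippet below decodes numeric character references plus a small, hand-picked table of named entities. The name decodeEntities and the entity table are illustrative assumptions, not part of the newsfeed module; a real fix would need the full HTML entity list.

```javascript
// Hypothetical helper, not part of MagicMirror: decodes numeric references
// and a few named entities without needing a DOM.
const NAMED_ENTITIES = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'",
  shy: "\u00ad", laquo: "\u00ab", raquo: "\u00bb",
  auml: "\u00e4", ouml: "\u00f6", uuml: "\u00fc", szlig: "\u00df"
};

function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z0-9]+);/gi, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
    }
    // Unknown names are left untouched rather than guessed.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(decodeEntities("gr&ouml;&szlig;ten"));
```

The replacement runs in a single pass, so an already escaped "&amp;auml;" comes out as the literal text "&auml;" and is not decoded twice.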
ah, good old german umlauts :-)
but the "" don't get decoded either :-(
Hi sdetweil, thanks for your help. I will try to implement the solution you mentioned, but I am still a big noob when it comes to programming and its logic. Still, I cannot imagine that I am the only German user of this module who has this problem. How can this be?
I was not suggesting YOU solve this.
I confirmed the problem and the cause.
The newsreader module does not handle the encoded description.
I do not know how to resolve it. The few things I tried do not provide a fix.
@Jochbart you need to add
encoding: "ISO-8859-1",
to the section of that newsfeed in your config.js. Then the umlaute etc should appear correctly.
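For orientation, a feed entry with that option might look roughly like the sketch below. The title and url values are placeholders, not the feed from this issue, and the object is wrapped in a constant only so the fragment stands on its own.

```javascript
// Sketch of a newsfeed entry as it could appear in the modules array of
// config.js. Title and url are placeholders.
const newsfeedModule = {
  module: "newsfeed",
  position: "bottom_bar",
  config: {
    feeds: [
      {
        title: "Example feed",              // placeholder
        url: "https://example.com/rss.xml", // placeholder URL
        encoding: "ISO-8859-1"              // character set used to decode this feed
      }
    ]
  }
};
```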
I added this already and my news headline is right.
Happy to hear that. So this issue can be closed @Jochbart ? Or would you say that the documentation would need some clarification regarding this?
No, we misunderstood each other. I had already added this before I opened this issue. It helped to decode the news headline correctly, but I still have the problem in the news description. You can see it in the screenshot I posted: „wächst“ is right, but the description is still wrong and shows raw HTML codes.
Ah I see, yes, the description still has the encoding issues.
Not sure if it is something MM should fix (replacing the special HTML entities with the "correct" chars) or if that is something the iconv-lite library we use needs to address.
What do you think @sdetweil ?
iconv is used to decode the entire fetch payload, so it 'should' have worked on the description too..
response.body.pipe(iconv.decodeStream(encoding)).pipe(parser);
but adding debug shows title not converted correctly
"title": " �PNV-Ausschuss:
which matches iconv doc
Untranslatable characters are set to � or ?. No transliteration is currently supported.
so I think something else would have to be done. and iconv has already mangled the content
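As an illustration of how a replacement character like the one in that debug output can arise, here is a small sketch using Node's built-in Buffer rather than iconv-lite. It is not a diagnosis of this particular feed; the sample string "ÖPNV" is an assumption chosen to resemble the debug line.

```javascript
// Illustration only: uses Node's built-in Buffer, not iconv-lite.
// "\u00d6PNV" is "ÖPNV", written with an escape to keep this file ASCII.
const original = "\u00d6PNV";

// Bytes written as ISO-8859-1 but read back as UTF-8: the lone 0xD6 byte is
// not valid UTF-8, so it becomes U+FFFD (the replacement character).
const latin1ReadAsUtf8 = Buffer.from(original, "latin1").toString("utf8");

// Bytes written as UTF-8 but read back as ISO-8859-1: each byte becomes its
// own character, so one letter turns into two unrelated ones.
const utf8ReadAsLatin1 = Buffer.from(original, "utf8").toString("latin1");

console.log(JSON.stringify(latin1ReadAsUtf8));
console.log(JSON.stringify(utf8ReadAsLatin1));
```

In the first case the original byte is gone after decoding, so no later step can restore the letter. Neither case touches HTML entities, which are plain ASCII and pass through any of these charsets unchanged.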
I noticed that something like "&auml;" (ä) in the title field of the newsfeed config isn't displayed correctly, while the same chars in the title field of a calendar module entry are displayed as "ä".
Maybe this has something to do with the newsfeed module using Nunjucks templates while the calendar module creates its DOM "by hand" (with JavaScript)?
Will have to investigate further...
... so adding " | safe" to the tags in the Nunjucks template would solve the issue and display the umlauts like they are supposed to.
Question is: Does this open up security issues in the case of a malicious newsfeed? What are your opinions about that @sdetweil @MichMich @khassel ?
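To make the trade-off concrete, here is a minimal stand-in for what template autoescaping does. The escapeHtml function is a simplified assumption written for this example, not Nunjucks' own implementation, and the hostile string is made up.

```javascript
// Minimal stand-in for template autoescaping; not Nunjucks' own code.
function escapeHtml(text) {
  const replacements = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return text.replace(/[&<>"']/g, (ch) => replacements[ch]);
}

const title = "gr&ouml;&szlig;ten";               // entity-encoded feed text
const hostile = '<img src=x onerror="alert(1)">'; // what a malicious feed could send

// With escaping, the "&" of each entity is itself escaped, so the browser
// shows the literal text "&ouml;" instead of the umlaut...
console.log(escapeHtml(title));
// ...but markup from the feed is rendered as harmless text as well.
console.log(escapeHtml(hostile));
```

Skipping the escaping step fixes the first string and, by the same mechanism, lets the second one reach the DOM as real markup.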
Yeah that's a good one. I'm not a big fan of adding thousands of option flags. But maybe this should be based on a toggle: allowSuperUnsafeContentToBeDisplayedWhileIThrowAwayAllMySecurityAndIPromiseNotToSueAnyMagicMirrorContributor = true
😅
I have no experience with the issues raised, so no idea
my knowledge about Nunjucks tends to zero, so can't say what implications the "safe" filter would have ...
Updated my PR with a new config option "dangerouslyDisableAutoEscaping" :-)
There was a similar question here - https://forum.magicmirror.builders/topic/16100/character-set-for-news-fed-text-apos/9
Here is how I solved it -
I actually made a fix for this that works fine for me - Not sure how to present it - but in the newsfeed.js make the following changes
At the end of the defaults of the newsfeed.js add the line replaceMe: [] as shown below -
logFeedWarnings: false,
replaceMe: []
},
In getTemplateData: function () {, add the jep section (everything between the //*** markers) after the existing lines shown here and before the return { loaded: true, statement:
return item;
});
//*******
//jep to fix title for various translations such as
// a simple ' instead of showing &apos;
// also replace things like Seattle with Seattle, WA
var tempTitle = item.title; // jepFixTitle(item.title);
for(let i = 0; i < this.config.replaceMe.length; i+= 2)
{
tempTitle = tempTitle.toString().replaceAll(this.config.replaceMe[i], this.config.replaceMe[i+1]);
}
//**********************
return {
loaded: true,
2A. In the return section, change the jep line as shown below:
publishDate: moment(new Date(item.pubdate)).fromNow(),
title: tempTitle, //jep see above
description: item.description,
In the config file:
replaceMe: [ "&apos;", "'", "Seattle", "Seattle, WA", "Biden", "(Pres) Biden", "Zuckerberg", "Zuckerberg [DATA]"]
This will replace the 1st item with the 2nd item, the 3rd with the 4th, etc.... Add whatever translations you want (: I've had some fun with the replacements.....
(The forum is reformatting the 1st item in the array; it should be the entity & a p o s ; written without spaces.)
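The pairwise loop can be tried outside MagicMirror with a standalone sketch like this one. The function name applyReplacements is hypothetical; in the patch above the loop sits inline in getTemplateData.

```javascript
// Standalone version of the pairwise replacement loop; applyReplacements is a
// hypothetical name, not a function in newsfeed.js.
function applyReplacements(title, replaceMe) {
  let result = String(title);
  // Even index = text to find, following odd index = its replacement.
  // The "i + 1 <" bound skips a trailing entry that has no partner.
  for (let i = 0; i + 1 < replaceMe.length; i += 2) {
    result = result.replaceAll(replaceMe[i], replaceMe[i + 1]);
  }
  return result;
}

const pairs = ["&apos;", "'", "Seattle", "Seattle, WA"];
console.log(applyReplacements("Seattle&apos;s weather", pairs));
```

Pairs are applied in order, so a later pair sees the output of the earlier ones. An empty array leaves the title unchanged.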
In that case it might be smarter to let the user set an optional transform callback. That callback can then be set in the config and can do anything you want. :)
Not sure what you mean. If you don't want it to transform though, your array in the config file can be blank (or not exist) and it won't transform anything - the default in the newsfeed.js is a blank array. But, I'm probably not understanding. I'm kind of a newbie on the MM.
he is suggesting to let you specify that replace routine and its data IN the config.js, and MM would call it if present..(optional)
that way everything is outside MM, and the routine can be as fancy or not as needed (up to the user)
transform means change (which u are doing)
I'm not sure exactly how to do that. That was my first idea to contain it in the config but I had to intercept the call for the title of the newsfeed prior to being pushed to MM and replace characters at that point. Any way you do it I think you're stuck with an edit in the newsfeed file. I am 'no' expert though!!! (:
yes, have to change newsfeed..
but it could look like this in config.js
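module:
config: {
  newStrings: [], // your array of search/replacement pairs
  replaceFunc: function (old_title) {
    let newtitle = old_title
    // step by 2: even index = text to find, odd index = its replacement
    for (let i = 0; i + 1 < this.newStrings.length; i += 2) {
      newtitle = newtitle.replaceAll(this.newStrings[i], this.newStrings[i + 1])
    }
    return newtitle
  }
}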
same code u had in newsfeed.js. replaceFunc is whatever code the user supplies; the change in MM is only the call to it.
its code is
if(module.config.replaceFunc)
  title=module.config.replaceFunc(old_title)
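Put together, the two halves could look like the sketch below. The names resolveTitle and newStrings are assumptions for illustration. The hook reaches its array through this, which only works because it is called as a method of the config object.

```javascript
// Hypothetical config object with a user-supplied hook.
const config = {
  newStrings: ["&apos;", "'"],
  replaceFunc: function (oldTitle) {
    let newTitle = oldTitle;
    for (let i = 0; i + 1 < this.newStrings.length; i += 2) {
      newTitle = newTitle.replaceAll(this.newStrings[i], this.newStrings[i + 1]);
    }
    return newTitle;
  }
};

// Module side: call the hook only when the user supplied one.
function resolveTitle(moduleConfig, rawTitle) {
  return typeof moduleConfig.replaceFunc === "function"
    ? moduleConfig.replaceFunc(rawTitle)
    : rawTitle;
}

console.log(resolveTitle(config, "it&apos;s"));
console.log(resolveTitle({}, "it&apos;s"));
```

Users who never define replaceFunc get the raw title back, so the hook stays optional.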
Interesting. Yeah, I worked it a bit to try to minimize changes to the newsfeed, but this reduces it further. Nice Sam.
Mich's idea, i was just commenting on how it might look..
do we need 2 handlers,
one for title, one for contents?
And yes - thanks Mich - I'm still learning what I can get away with in the config file (:
Yes, two handlers I'd suspect - for me - I'm just displaying titles so that's as far as I went, but I did think about that for the contents. Should be just a duplicate of the process I set up I'd think. Run it through the same array. The description is at the same place as where I intercepted the title.
I actually thought about having another array that was a 'notification' - same principle as what I already did.
So you have another array that has trigger words - so say the headline has 'Market Drops', or 'Inflation' etc - then you have it display a notification (at the top of MM) that would then display it longer - say 1 minute. I have my headlines at about 15 seconds I think. It seems that would also be a cool feature that would only require slight additions to what I showed.
I hope what I set up (and perhaps modified as you two showed/suggested) works for people.
I would recommend one. Which could be something like:
config: {
// ...
itemTransformer: function (item) {
item.title = item.title.toUpperCase()
item.description = item.description.replace('foo', 'bar')
return item
}
}
(disclaimer: code not tested. Just as an example)
If a method becomes huge, it could always be extracted to an external file.
This same principle could be used for the calendar module to solve the issues like reoccurrence count.
@MichMich yes, cool.. I am loath to expose the object because you don't know what other trouble you will get into.. but whatever..
The implementation in the module itself is super simple, the default transformer is just:
(item) => item
After all the transformations we should filter out any item that is null, that way, if a transformer returns null, the item gets removed. This way a user can use the transformer to filter the array.
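A sketch of that module-side logic, assuming a helper named transformItems (hypothetical) and made-up sample items:

```javascript
// Hypothetical helper: applies an optional itemTransformer and drops nulls.
const identity = (item) => item;

function transformItems(items, itemTransformer = identity) {
  return items
    .map((item) => itemTransformer({ ...item })) // shallow copy so the source array is untouched
    .filter((item) => item !== null && item !== undefined);
}

const items = [
  { title: "Markets rise", description: "foo" },
  { title: "Sponsored: buy now", description: "ad" }
];

const result = transformItems(items, (item) => {
  if (item.title.startsWith("Sponsored")) return null; // filter this one out
  item.title = item.title.toUpperCase();
  return item;
});

console.log(result);
```

Passing each transformer a copy is a design choice in this sketch, so a buggy transformer cannot corrupt the fetched items.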
Oh it will definitely bring users in trouble, but it's an awesome way to learn javascript. And the answer is simple: "your transformer is messed up" :)
Of course it's optional. So it won't bother new users.
It might be smart to do to make it part of the feeds module, or allow a generic itemTransformer, and one per feed. That way you have the option to apply the transformer on specific feeds.
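One way to combine a generic transformer with a per-feed one is sketched below. The name pickTransformer, the per-feed itemTransformer field and the order (generic first, then feed-specific) are all assumptions for illustration; nothing in the thread fixes these details.

```javascript
// Hypothetical: a module-wide transformer plus an optional per-feed one.
function pickTransformer(moduleConfig, feed) {
  const identity = (item) => item;
  const globalFn = moduleConfig.itemTransformer || identity;
  const feedFn = feed.itemTransformer || identity;
  // Run the generic transformer first, then the feed-specific one;
  // a null from the first step skips the second.
  return (item) => {
    const afterGlobal = globalFn(item);
    return afterGlobal === null ? null : feedFn(afterGlobal);
  };
}

const moduleConfig = {
  itemTransformer: (item) => ({ ...item, title: item.title.trim() })
};
const feed = {
  itemTransformer: (item) => ({ ...item, title: "[Local] " + item.title })
};

const transform = pickTransformer(moduleConfig, feed);
console.log(transform({ title: "  Harbour expands  " }));
```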
js code in config... that would be the first here, right?
the answer to people who run into this might be simple. but until then they ask in the forum, scratch their head, then get the answer, but then have to figure out how to correct it... i see more work rather than less work :-(
yeh, fun times...
i will add the new field (function) to the newsfeed form for MMM-Config..
and we should probably make a pinned forum topic, if we publish this kind of thing.
I think since it's optional people will only know about it when they read the docs. It might be worth adding a big note there: this is for advanced users and needs some self-solving ability. :)
Hi all, somehow the newsfeed module cannot handle the RSS feed correctly. I tried different sources, but I get the same result every time on my MagicMirror:
Thank you in advance.