Open simonkaspersen opened 8 years ago
0/2 SOURCE>> http://www.nrk.no/nyheter/toppsaker.rss
[0] b'Eksperter om presset H\xc3\xb8gmo: \xe2\x80\x93 Tenker som man gj\xc3\xb8r i et klubblag'
[1] b'12 \xc3\xa5rs forvaring for overgrep mot eget spedbarn'
[2] b'R\xc3\xb8ysta mot eige parti \xe2\x80\x93 blir skvisa ut'
[3] b'Ny rapport om verdenshavene: WWF-Jensen dypt bekymret'
[4] b'St\xc3\xb8re til Erna: \xe2\x80\x93 Velkommen tilbake fra bokseringen'
[5] b'Fikk Stortinget til \xc3\xa5 le'
[6] b'Si nei til voksne som vil ta en Skamtale'
[7] b'Ruud fekk bank'
[8] b'Turg\xc3\xa5ar fall om \xe2\x80\x93 mangla mobildekning'
[9] b'S\xc3\xa5 vidt over 2.500 asyls\xc3\xb8kere hittil i \xc3\xa5r'
[10] b'Fikk Nobelprisen i kjemi for verdens minste maskiner'
[11] b'\xe2\x80\x93 Feriene blir billigere'
[12] b'D\xc3\xb8mt til fengsel for kakekasting'
[13] b'Gikk 635 h\xc3\xb8ydemeter for det perfekte bildet'
[14] b'\xe2\x80\x93 Jeg forst\xc3\xa5r ikke kunst'
[15] b'Lover millionst\xc3\xb8tte til s\xc3\xb8rsamisk senter'
[16] b'Cuba svekket orkanen Matthew'
[17] b'Klimaekspert: \xe2\x80\x93 Det henger ikke p\xc3\xa5 greip'
[18] b'Helleland p\xc3\xa5 utenlandsreise:\xc2\xa0 \xe2\x80\x93 Borte den viktigste dagen i \xc3\xa5ret'
[19] b'Gjekk fr\xc3\xa5 ein million'
[20] b'\xe2\x80\x93 Det er en sterk september'
[21] b'NRKs partibarometer for oktober: Frp faller stygt \xe2\x80\x93 skylder p\xc3\xa5 Ap'
[22] b'Her har skuleborn blitt sjuke i mange \xc3\xa5r'
[23] b'M\xc3\xa5tte sjekke Pok\xc3\xa9mon'
[24] b'Har levd lenge med ADHD uten \xc3\xa5 vite det'
[25] b'\xc2\xabNorge er dopet p\xc3\xa5 olje\xc2\xbb'
[26] b'Minst 11 d\xc3\xb8de'
[27] b'Hele \xc2\xabHakkebakkeskogen\xc2\xbb dukket opp'
[28] b'Visepresident-duellen: Dette l\xc3\xb8y de om'
[29] b'Hegerberg vil ha Nordlie'
[30] b'Han tjener mest av fylkesordf\xc3\xb8rerne'
[31] b'Vil gi 11 mill. mer for \xc3\xa5 lokke filmbransjen'
[32] b'Vil gi smilefjes til norske sykehus'
[33] b'Visepresident-duellen: Beskyldte Trump for manglende patriotisme'
[34] b'\xe2\x80\x93 Trump har betalt millioner i skatt'
[35] b'\xe2\x80\x93 Pence vant p\xc3\xa5 stil, Kaine p\xc3\xa5 innhold'
[36] b'Tvitret heftig under debatten'
[37] b'\xe2\x80\x93 Det var d\xc3\xa5 det skjedde noko med namnet mitt'
[38] b'Har du peiling p\xc3\xa5 stadnamn?'
[39] b'Kristen skole fjerner sider\xc2\xa0om pubertet'
[40] b'\xe2\x80\x93 Intervjuet\xc2\xa0blir bare absurd'
[41] b'I utlandet n\xc3\xa5r budsjettet blir lagt fram'
[42] b'Idrettsstyrets mindretall vil legge kortene p\xc3\xa5 bordet: \xe2\x80\x93 Skaper mistanke om mislighold'
[43] b'\xe2\x80\x93 Er eg skyldig i klimaendringane?'
Hmm, not sure why this happens. Initially, each feed is encoded as UTF-8. I tried that link (on macOS and Linux) with no issues. What do you mean by encoding it again?
Weird. I googled a little to check whether it was an encoding issue with Python or with feedparser, but all I found was a claim that if the RSS was already encoded as UTF-8, a .decode('utf8') would decode it to ISO-something before decoding it to UTF-8 again. I don't know :P
I will try some more, because I like the concept :D
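On the ".decode('utf8') goes through ISO-something" idea: as far as Python 3's `bytes.decode` is concerned, decoding is a single step with whichever codec is named, so no intermediate ISO encoding is involved. A minimal sketch of the difference, using the bytes of item [5] from the output in this thread (the variable names are only illustrative):

```python
# Raw bytes copied from item [5] of the output in this thread.
raw = b"Fikk Stortinget til \xc3\xa5 le"

# Decoding with the codec the bytes were written in gives the real text.
as_utf8 = raw.decode("utf-8")

# Decoding the same bytes as ISO-8859-1 does not fail, but each byte of
# the two-byte UTF-8 sequence becomes its own character (mojibake).
as_latin1 = raw.decode("iso-8859-1")

# The mojibake is reversible: re-encode with the wrong codec, then decode
# with the right one.
repaired = as_latin1.encode("iso-8859-1").decode("utf-8")
```

Here `as_utf8` contains "å", while `as_latin1` contains "Ã¥" in its place. Neither matches the `b'...'` output in this thread, which suggests the titles were never decoded before printing.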
- Oct. 2016 at 20:46, Aziz Alto notifications@github.com wrote:
Hmm, not sure why this happens. Initially, each feed is encoded as UTF-8 https://github.com/iamaziz/TermFeed/blob/master/termfeed/feed.py#L79. I tried that link (on macOS and Linux) with no issues. What do you mean by encoding it again?
https://cloud.githubusercontent.com/assets/3298308/19165000/1ca31c6c-8bd0-11e6-959f-c2d9d6087c69.png
My feeds are encoded in UTF-8, but when they are printed on the command line, letters such as Æ, Ø, and Å are replaced with escape sequences such as \xc3\xb8 and \xc3\xa5.
Does this happen when you encode it again to UTF-8?
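The `b'...'` prefix and `\xNN` escapes in the output are what Python 3 shows when a `bytes` object is printed instead of a `str`. A minimal sketch of that behaviour, assuming the titles are encoded to UTF-8 just before printing (an assumption; the thread does not confirm what TermFeed's code does at that point):

```python
# Title copied from item [4] of the output in this thread.
title = "Støre til Erna: – Velkommen tilbake fra bokseringen"

# str.encode() returns a bytes object, not text.
encoded = title.encode("utf-8")

# print(encoded) writes the repr of the bytes object: a b'...' literal in
# which every non-ASCII byte appears as a \xNN escape.
printed_form = repr(encoded)

# Decoding with the same codec restores the original, readable text.
decoded = encoded.decode("utf-8")
```

Under that assumption, printing `title` (or `decoded`) directly would show "Støre" rather than `St\xc3\xb8re`, provided the terminal itself uses UTF-8.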