prof18 / RSS-Parser

A Kotlin Multiplatform library to parse a RSS Feed
Apache License 2.0
490 stars, 126 forks

Unexpected token (position:TEXT ���@1:4 in java.io.InputStreamReader@7982e8c) #104

Closed: skeie closed this issue 1 year ago

skeie commented 1 year ago

Hey man, really good job with this library, the API is 🔥 !

Describe the bug: When I try to parse this URL: https://podkast.nrk.no/program/loerdagsraadet.rss I get "Unexpected token (position:TEXT ���@1:4 in java.io.InputStreamReader@7982e8c)". Any idea how to solve this?

The link of the RSS Feed: https://podkast.nrk.no/program/loerdagsraadet.rss

prof18 commented 1 year ago

Thank you!

The issue is the presence of a BOM (byte order mark) character at the start of the feed, which isn't necessary for UTF-8 😅

I'll check if I can do something about it during parsing, but I can't promise anything, since this is something that should really be fixed on the feed's side.
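
For illustration, here is a minimal sketch of why the BOM gets in the way: on the JVM, decoding the UTF-8 BOM bytes (EF BB BF) yields the single character U+FEFF, which then sits in front of the XML. The helper name `stripLeadingBom` is hypothetical and is not part of the library's API; this only shows one possible string-level workaround.

```kotlin
// U+FEFF is the character that the UTF-8 BOM bytes (EF BB BF) decode to.
const val BOM_CHAR = '\uFEFF'

// Hypothetical helper (not part of RSS-Parser): returns true when the
// decoded feed text begins with the BOM character.
fun hasLeadingBom(xml: String): Boolean = xml.startsWith(BOM_CHAR)

// Hypothetical helper: drops a single leading BOM character, if present,
// and returns the text unchanged otherwise.
fun stripLeadingBom(xml: String): String = xml.removePrefix(BOM_CHAR.toString())
```

With something like this, the raw feed text could be cleaned before it is handed to the parser, at the cost of decoding the whole feed into a String first.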

skeie commented 1 year ago

Ah, good point!

I've been playing around with this a bit, and this might be a very naive way of doing it:

import java.net.URL

// Download the raw feed bytes so the BOM can be inspected before decoding.
val byteArray = URL(channelUri).openConnection().getInputStream().use { it.readBytes() }
val bom = byteArrayOf(0xEF.toByte(), 0xBB.toByte(), 0xBF.toByte())

// The size check avoids an out-of-bounds read on responses shorter than 3 bytes.
val hasBom = byteArray.size >= bom.size && bom.indices.all { byteArray[it] == bom[it] }

// Drop the 3 BOM bytes if present, then decode the rest as UTF-8.
val contentWithoutBom = if (hasBom) {
    byteArray.copyOfRange(bom.size, byteArray.size)
} else {
    byteArray
}

parser.parse(String(contentWithoutBom, Charsets.UTF_8))

I would love to make it more robust, if that's what it takes to get it into the library and you still think it makes sense for the library to handle this :)
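
As one idea of what "more robust" could look like, here is a rough sketch that skips the BOM on the stream itself instead of loading the whole feed into memory first. The function name `skipUtf8Bom` is made up for this example and is not an existing library function; it only assumes the feed is available as a plain `InputStream`.

```kotlin
import java.io.InputStream
import java.io.PushbackInputStream

// Hypothetical helper: wraps the stream and consumes a leading UTF-8 BOM
// (EF BB BF) if there is one. Any other leading bytes are pushed back so
// the caller still sees the original content.
fun skipUtf8Bom(input: InputStream): InputStream {
    val bom = byteArrayOf(0xEF.toByte(), 0xBB.toByte(), 0xBF.toByte())
    val pushback = PushbackInputStream(input, bom.size)
    val head = ByteArray(bom.size)

    // read() may return fewer bytes than requested, so loop until we have
    // 3 bytes or reach the end of the stream.
    var read = 0
    while (read < head.size) {
        val n = pushback.read(head, read, head.size - read)
        if (n < 0) break
        read += n
    }

    val hasBom = read == bom.size && head.contentEquals(bom)
    if (!hasBom && read > 0) {
        // Not a BOM: put the bytes back so nothing is lost.
        pushback.unread(head, 0, read)
    }
    return pushback
}
```

The returned stream could then be decoded as UTF-8 and passed on as before. Streams shorter than three bytes are returned unchanged.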

prof18 commented 1 year ago

I think I found an optimization that fixes this issue and improves performance as well!

skeie commented 1 year ago

Amazing - thank you!

Do you have an ETA for the next release? :) No rush, just curious!

prof18 commented 1 year ago

In the next two weeks, hopefully! :)