fleeksoft / ksoup

Ksoup is a Kotlin Multiplatform library for working with HTML and XML. It's a port of the renowned Java library Jsoup.
https://fleeksoft.github.io/ksoup/
Apache License 2.0
236 stars 9 forks source link

ktor2 extension module new HttpResponse.bodyAsSourceReader() method #54

Closed vanniktech closed 1 week ago

vanniktech commented 3 weeks ago

So in the ktor2 extension module there are a few helper methods but the one I'm missing is:

io.ktor.client.statement.HttpResponse.bodyAsSourceReader(): com.fleeksoft.ksoup.io.SourceReader

I want to do all of the request handling myself but then when I get the response convert it to a SourceReader so that I can use Ksoup to parse it.

I've already tried to do this myself with:

@file:OptIn(InternalAPI::class)

private suspend fun sourceReader(response: HttpResponse) =
    SourceReaderImpl(response.bodyAsChannel().readBuffer)

But I'm not getting past the error that InternalAPI has:

@RequiresOptIn(
    level = RequiresOptIn.Level.ERROR,
    message = "This API is internal in Ktor and should not be used. It could be removed or changed without notice."
)

which causes my build to fail. Are you using an older version of ktor or how are you getting past this restriction?

itboy87 commented 3 weeks ago

@vanniktech I don't ktor2 have any response.bodyAsChannel().readBuffer that is from ktor3

vanniktech commented 3 weeks ago

Ktor does have a bodyAsChannel().readBuffer but it is internal.

itboy87 commented 3 weeks ago

Ktor does have a bodyAsChannel().readBuffer but it is internal.

@vanniktech I double checked, and Ktor 2 doesn't have the readBuffer field, it's only available in Ktor 3. Could you please provide me with a reference where is that in ktor 2?

vanniktech commented 3 weeks ago

I see it in my IDE:

Screenshot 2024-08-26 at 22 15 30

But not in the GitHub file

Hint: My IDE is via a different dependency in a different module jumping to ktor3 sources, even though the module that I am in has no dependency to ktor3 :-( (I realized this just now.)

It should still be possible to have an extension method on HttpResponse to convert it into a SourceReader, right? ByteReadChannel exposes a bunch of different read methods.


I also know why I got confused now. I also saw you using it here:

Screenshot 2024-08-26 at 22 18 32

and for some reason I thought com.fleeksoft.ksoup.network is the shared module, but it is not:

Screenshot 2024-08-26 at 22 21 12

I got confused that the ktor3 modules have no suffix, where as the ktor2 modules to have suffixes. Ideally, also the ktor3 modules would have their suffix; so that the name is explicit and there's parity with the properly named korlibs modules.

itboy87 commented 2 weeks ago

@vanniktech Ktor2 is in a different branch called ‘ktor2’. I tried to use ReadByteChannel to create a SourceReader, but the main problem is that all of its functions are suspended, which broke a lot of the library code. Under the hood, it uses some kind of memory classes that are not easy to work with.

vanniktech commented 2 weeks ago

Hmm then never mind, I heard somewhere that ktor3 is close to cutting a RC1 version so it should be stable soon.

itboy87 commented 2 weeks ago

@vanniktech  I’m closing this as I don’t have any plans to implement SourceReader for ByteChannel for Ktor 2. I suggest upgrading to Ktor 3

vanniktech commented 2 weeks ago

Could you provide a HttpResponse.bodyAsSourceReader() for ktor3? I don't think that exists there.

itboy87 commented 2 weeks ago

Its not extension function yet but you can do it like

SourceReader.from(httpResponse.bodyAsChannel().readBuffer)

itboy87 commented 2 weeks ago

Hi @vanniktech Sorry for the confusion. In the upcoming 0.1.6 version, you can use SourceReader.from(httpResponse.bodyAsChannel().readBuffer). In 0.1.6-alpha01, you can achieve this using SourceReaderImpl(httpResponse.bodyAsChannel().readBuffer). However, in 0.1.6, SourceReaderImpl will become internal, and you'll need to use SourceReader.from. But adding ext func HttpResponse .bodyAsSourceReader is also goo d idea i will add that too.

vanniktech commented 2 weeks ago

But adding ext func HttpResponse .bodyAsSourceReader is also goo d idea i will add that too.

Amazing, thanks!

itboy87 commented 1 week ago

HttpResponse.asSourceReader() added in version 0.1.6. Thanks