scinfu / SwiftSoup

SwiftSoup: Pure Swift HTML Parser, with best of DOM, CSS, and jquery (Supports Linux, iOS, Mac, tvOS, watchOS)
https://scinfu.github.io/SwiftSoup/
MIT License
4.52k stars 345 forks source link

UTF-8 view optimizations + Regression fix for #275 #276

Closed aehlke closed 2 months ago

aehlke commented 2 months ago

I haven't benchmarked yet, but in theory this should be faster

Haven't fully tested yet either

Note that this increases the minimum platform versions

aehlke commented 2 months ago

Room for improvement, but it seems MUCH faster now that it operates on utf8view and utf8 indexes

aehlke commented 2 months ago

@scinfu pardon for the interruption, I understand you're very busy - I added a commit here that resolves the regressions I introduced in #275 but those fixes could be split out from here as well

aehlke commented 2 months ago

Sorry for entangling the improvement from the regression fix, but that can be fixed / the regression changes can be extracted from it

I need to do more profiling again and find more room for improvement, but I believe this change to utf8view is a massive improvement to SwiftSoup speed. So I hope we can get this merged too.

I think the next biggest win would likely be to change all functions from operating on String to operating on a view or slice (I forget the correct type names for this) that reflects string data from the source without needing to ever copy that data or rarely, or even arrays of them as needed, which can be turned into string at a price and at the library user's choice.

But maybe this doesn't matter as much if this utf8 change is complete (not sure it does this yet actually) enough to have String copying be fast now because of copying byte data

aehlke commented 2 months ago

@scinfu may I suggest you add more maintainers to the project to be able to review/merge issues with less trouble :)

scinfu commented 2 months ago

@aehlke yes, good idea.

scinfu commented 2 months ago

@aehlke Could you please add a test to prevent this bug in the future?

aehlke commented 2 months ago

@scinfu I added testing for the hang (by adding a timeout waiter) which repros without my fix in this PR. I also fixed the test suite (unrelated broken test)

I will merge once tests pass as a hotfix

aehlke commented 2 months ago

Ignoring the 5.7 failure, all other tests passed

5.7 failure issue tracked here: https://github.com/scinfu/SwiftSoup/issues/282 It's unrelated, an apparently recent Xcode 15 regression

aehlke commented 2 months ago

released: https://github.com/scinfu/SwiftSoup/releases/tag/2.7.4