Open mergesort opened 2 weeks ago
I noticed a couple of questionable snippets of code which I think may be related to the bug I'm experiencing, but I may not be familiar enough with the code to understand if these cause the issue or not.
In FaviconURLSession, both linuxDataTask and appleDataTask can recurse infinitely. If a website employs multiple or cyclic meta-refresh redirects, this could lead to unbounded recursion.
I think you would need to change the functions to have a shape like this, where there is a count of the current redirects, and a max redirect count, to prevent overflow scenarios.
static func apple/LinuxDataTask(
    with url: URL,
    checkForMetaRefreshRedirect: Bool = false,
    httpHeaders: [String: String?]? = nil,
    redirectCount: Int = 0,
    maxRedirects: Int = 5
) async throws -> Response {
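To make the idea concrete, here is a minimal sketch of how a redirect counter could bound the recursion. It is synchronous and takes a fetch closure so it can run without a network; Page, RedirectError, resolve, and the fetch parameter are hypothetical names for illustration, not FaviconFinder's actual API, and the real functions are async.

```swift
import Foundation

// Hypothetical stand-in for a fetched page; not a FaviconFinder type.
struct Page {
    let url: URL
    let metaRefreshTarget: URL?   // nil when the page has no meta-refresh redirect
}

// Hypothetical error type for this sketch.
enum RedirectError: Error, Equatable {
    case tooManyRedirects(limit: Int)
}

/// Follows meta-refresh redirects, giving up once `maxRedirects` hops have been followed.
func resolve(
    _ url: URL,
    redirectCount: Int = 0,
    maxRedirects: Int = 5,
    fetch: (URL) throws -> Page
) throws -> Page {
    let page = try fetch(url)

    // No redirect on this page: we are done.
    guard let target = page.metaRefreshTarget else {
        return page
    }

    // A redirect exists, but the budget is spent: stop instead of recursing again.
    guard redirectCount < maxRedirects else {
        throw RedirectError.tooManyRedirects(limit: maxRedirects)
    }

    return try resolve(
        target,
        redirectCount: redirectCount + 1,
        maxRedirects: maxRedirects,
        fetch: fetch
    )
}
```

With a cyclic pair of pages (a redirects to b, b redirects back to a), this throws after a bounded number of fetches rather than recursing until the stack or memory runs out.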
I also noticed this code in ICOFaviconFinder, and I want to double-check my assumptions.
Based on these comments
// We couldn't find any image, so let's try the root domain (just in case it's hiding there)
// ie. If we couldn't find the image at "subdomain.google.com/favicon.ico", let's try "google.com/favicon.ico"
Should the code actually use rootURL like this, rather than faviconURL?
let baseFaviconUrlData = try await FaviconURLSession.dataTask(
    with: rootURL,
    checkForMetaRefreshRedirect: self.configuration.checkForMetaRefreshRedirect
).data
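As a rough illustration of the fallback those comments describe, the sketch below tries the subdomain's favicon first and only then the root domain's. The helper names (naiveRootFaviconURL, firstFavicon) and the fetch closure are hypothetical, not part of FaviconFinder, and the root-domain guess simply keeps the last two host labels, which is an assumption that breaks for suffixes such as "co.uk".

```swift
import Foundation

/// Naive root-domain guess: keeps only the last two host labels.
/// Assumption for illustration only; wrong for multi-part suffixes like "co.uk".
func naiveRootFaviconURL(for url: URL) -> URL? {
    guard let host = url.host, let scheme = url.scheme else { return nil }
    let labels = host.split(separator: ".")
    // With two or fewer labels the host is already the root, so there is nothing new to try.
    guard labels.count > 2 else { return nil }
    let rootHost = labels.suffix(2).joined(separator: ".")
    return URL(string: "\(scheme)://\(rootHost)/favicon.ico")
}

/// Tries the host's own favicon.ico first, then the root domain's.
/// `fetch` is a hypothetical stand-in for the real network call.
func firstFavicon(for url: URL, fetch: (URL) -> Data?) -> Data? {
    var candidates: [URL] = []
    if let scheme = url.scheme, let host = url.host,
       let faviconURL = URL(string: "\(scheme)://\(host)/favicon.ico") {
        candidates.append(faviconURL)
    }
    // The second attempt must use the root URL; repeating faviconURL would retry the same request.
    if let rootURL = naiveRootFaviconURL(for: url) {
        candidates.append(rootURL)
    }
    for candidate in candidates {
        if let data = fetch(candidate) {
            return data
        }
    }
    return nil
}
```

The point of the sketch is only that the second request targets a different URL than the first; if the fallback reuses faviconURL, it can never find an icon that the first attempt missed.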
Hope this makes sense, and helps!
Thanks for flagging this @mergesort - I'll look into this immediately and update you when I have a fix.
I was just wondering @will-lumley, do you happen to have an update on this?
@mergesort Planning on releasing the fix later tonight :) (I'm in Sydney if that helps)
That's amazing timing, thank you so much! (I didn't know that but I hope you like it there. 🙂)
Hey @will-lumley, any update on the new release?
@mergesort - I had unexpected family commitments come out of nowhere this weekend - sorry for not getting back to you sooner.
So I booted up my Ubuntu machine and ran FaviconFinder with your URLs, but much to my surprise they worked fine. I tried both of them several times but it worked as expected each time.
Regardless, both of your code suggestions are valid, so I've implemented them (copy/pasted them) into the library under the branch bug/recursive-loop. Can you switch to this branch and tell me if you still experience this issue? If you do, can you give me the exact Linux distro/build/etc. you're using so I can try to replicate your issue?
Keen to solve this bug :)
I haven't yet been able to figure out the root of the issue, but I continue to experience crashes on Linux for certain websites. I am somewhat confident that the source of the issue is here, in the fetchFaviconURLs() function (https://github.com/will-lumley/FaviconFinder/blob/b31e0c1f2e69f577abd1d90f7da58d6540c2da60/Sources/FaviconFinder/FaviconFinder.swift#L60-L114).
What I'm seeing in the logs is something like this.
And then the server crashes shortly after. I think this may be related to a memory issue, possibly because of the way the webpage is constructed, and possibly because of the recursive loop that fetchFaviconURLs() contains.
I can reproducibly trigger the crash with that website (https://www.caranddriver.com/news/a62595445/2024-tesla-model-3-quieter-more-highway-range-tested), as well as with https://www.prevention.com/health/a46107400/types-of-magnesium/#. I may be able to scrounge up a few more links that have failed and caused crashes if you need them, but the key is to run this in a Linux environment, such as a Docker image or a server.
Please let me know if there's any other way I can help. This unfortunately remains a pretty serious issue for me, and I'm more than happy to do what I can to help.