will-lumley / FaviconFinder

A small swift library for iOS & macOS to detect favicons used by a website.
MIT License

FaviconFinder doesn't parse relative links #78

Closed: mergesort closed this issue 2 weeks ago

mergesort commented 2 months ago

I host my app's landing page using the website builder Carrd, mostly for simplicity's sake. This is a screenshot of how Carrd defines the HTML metadata it uses to display favicons.

[Screenshot: Link Metadata]

I'm trying to fetch the favicons for my website, using this code.

let faviconFinder = FaviconFinder(url: URL(string: "https://plinky.app")!)
let favicons = try await faviconFinder.fetchFaviconURLs()

When I print the result, the source URL appears to be assembled in a way I wouldn't expect.

(lldb) po favicons
▿ 2 elements
  ▿ 0 : FaviconURL
    ▿ source : assets/images/favicon.png?v=b32e5967 -- https://plinky.app
      - _url : assets/images/favicon.png?v=b32e5967 -- https://plinky.app
    - format : FaviconFinder.FaviconFormatType.icon
    - sourceType : FaviconFinder.FaviconSourceType.html
    ▿ sizeTag : Optional<String>
      - some : ""
  ▿ 1 : FaviconURL
    ▿ source : assets/images/apple-touch-icon.png?v=b32e5967 -- https://plinky.app
      - _url : assets/images/apple-touch-icon.png?v=b32e5967 -- https://plinky.app
    - format : FaviconFinder.FaviconFormatType.appleTouchIcon
    - sourceType : FaviconFinder.FaviconSourceType.html
    ▿ sizeTag : Optional<String>
      - some : ""

As you can see, the relative path is joined to the root domain with a -- in between. Given this markup:

<link rel="canonical" href="https://www.plinky.app">
<link rel="icon" type="image/png" href="assets/images/favicon.png?v=b32e5967">
<link rel="apple-touch-icon" href="assets/images/apple-touch-icon.png?v=b32e5967">

I would instead expect to see a URL like https://plinky.app/assets/images/favicon.png?v=b32e5967. Because of the unexpected format, the favicon is never properly parsed or saved to my database.

I can't change the website's markup, but this seems valid to me. Would it be possible to update the library to account for this type of markup?

Thanks a lot!

will-lumley commented 2 weeks ago

Hey @mergesort! Thanks for providing code & examples here.

The printout of the URL that you're seeing is just how Apple's implementation of CustomStringConvertible for URL displays the data when it is printed to the console. The actual data is stored correctly, and you can get the resolved value by using URL's absoluteString or absoluteURL property.
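To illustrate the difference outside the library, here is a minimal sketch using only Foundation. The names base and icon are made up for this example, and it assumes the relative href is stored as a URL relative to the page URL, which is what the lldb output above suggests (I haven't shown FaviconFinder's internals here):

```swift
import Foundation

// Hypothetical stand-ins: the page URL and a relative href taken from the markup.
let base = URL(string: "https://plinky.app")!
let icon = URL(string: "assets/images/favicon.png?v=b32e5967", relativeTo: base)!

// Printing the URL directly goes through its description, which shows the
// relative part and the base separately (the "-- https://plinky.app" form).
print(icon)

// relativeString is only the part that was written in the href.
print(icon.relativeString)

// absoluteString resolves the relative part against the base URL.
print(icon.absoluteString)
```

So if you store or compare the favicon location, absoluteString (or absoluteURL) is the value to use rather than the printed description.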

I used the below code in my macOS example app:

let faviconURLs = try await FaviconFinder(
    url: url,
    configuration: .init(
        preferredSource: .html,
        preferences: [
            .html: FaviconFormatType.appleTouchIcon.rawValue,
            .ico: "favicon.ico"
        ]
    )
)
.fetchFaviconURLs()

print("FaviconURLs:")
for faviconURL in faviconURLs {
    print("\(faviconURL.source.absoluteURL)\n")
}

And this is what was printed out:

FaviconURLs:
http://plinky.app/assets/images/favicon.png?v=17669005
http://plinky.app/assets/images/apple-touch-icon.png?v=17669005

I'll close this issue now, but if you still experience the problem, please feel free to comment on this thread.