scinfu / SwiftSoup

SwiftSoup: Pure Swift HTML Parser, with best of DOM, CSS, and jquery (Supports Linux, iOS, Mac, tvOS, watchOS)
https://scinfu.github.io/SwiftSoup/
MIT License
4.53k stars, 345 forks

Convert HTML to Attributed String #127

Closed: photos closed this issue 4 years ago

photos commented 5 years ago

Hi,

I'm trying to convert an HTML snippet with styles and links to an attributed string. What parsing approach would you recommend?

So far I am able to identify links, strong, emphasis, and combinations of each:

let linkElements = try doc.select("a")
let strongElements = try doc.select("strong")
let emphasisElements = try doc.select("em")

let strong_emphasisElements = try doc.select("strong").select("em")
let strong_linkElements = try doc.select("a").select("strong")
let emphasis_linkElements = try doc.select("a").select("em")
let strong_emphasis_linkElements = try doc.select("a").select("strong").select("em")
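The chained `select` calls above each search the descendants of the previous matches, so they can probably also be written as single descendant selectors. A minimal sketch, assuming a shortened sample with placeholder `href="#"` values (not the real HTML from this post):

```swift
import SwiftSoup

// Shortened, hypothetical sample: one link with <strong><em>, one with only <em>.
let html = "<li><a href=\"#\"><strong><em>Around the World</em></strong></a></li>"
    + "<li><a href=\"#\"><em>Staying Alive</em></a></li>"
let doc = try SwiftSoup.parseBodyFragment(html)

// Chained form: <em> inside <strong> inside <a>.
let chained = try doc.select("a").select("strong").select("em")

// Descendant selector expressing the same nesting in one query.
let combined = try doc.select("a strong em")

print(chained.size(), combined.size())
```

Both forms depend on nesting order: neither matches markup where the `<a>` sits inside the `<strong>`, so every ordering would need its own query.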

Here is the HTML I am trying to convert:

"<li><strong>Bee Gees</strong> - <a href=\"https://www.youtube.com/watch?v=I_izvAbhExY\" target=\"_blank\"><em>Staying Alive</em></a></li><li><strong>Daft Punk</strong> - <a href=\"https://www.youtube.com/watch?v=yca6UsllwYs\" target=\"_blank\"><strong><em>Around the World</em></strong></a></li><li><strong>Kanye West</strong> - <a href=\"https://www.youtube.com/watch?v=mWtIxc38xNE\" target=\"_blank\"><strong>Flashing Lights</strong></a></li>"

This has phrases with nested emphasis, strong, and link (a href) elements. I'm sure other people have attempted this before. Is there a recommended approach?
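One possible approach, sketched here rather than taken from SwiftSoup itself, is to walk the parsed tree depth-first and carry the inherited styles down to each text node, so nesting order does not matter. `StyledRun`, `collectRuns`, and `attributedString(from:baseFont:)` are hypothetical names, and the choice to end each `<li>` with a newline is an assumption about the desired output:

```swift
import Foundation
import SwiftSoup
#if canImport(UIKit)
import UIKit
#endif

/// Hypothetical helper: one run of text plus the styles inherited from its ancestors.
struct StyledRun: Equatable {
    var text: String
    var bold = false
    var italic = false
    var link: String? = nil
}

/// Depth-first walk: elements update the inherited style, text nodes emit a run.
func collectRuns(from node: Node, inheriting style: StyledRun, into runs: inout [StyledRun]) throws {
    if let textNode = node as? TextNode {
        var run = style
        run.text = textNode.getWholeText()
        if !run.text.isEmpty { runs.append(run) }
        return
    }
    var style = style
    let tag = (node as? Element)?.tagName()
    switch tag {
    case "strong", "b": style.bold = true
    case "em", "i": style.italic = true
    case "a": style.link = try node.attr("href")
    default: break
    }
    for child in node.getChildNodes() {
        try collectRuns(from: child, inheriting: style, into: &runs)
    }
    // Assumption: each list item becomes its own line.
    if tag == "li" { runs.append(StyledRun(text: "\n")) }
}

#if canImport(UIKit)
/// Turns the runs into an attributed string using font traits and the .link attribute.
func attributedString(from runs: [StyledRun],
                      baseFont: UIFont = .systemFont(ofSize: 17)) -> NSAttributedString {
    let result = NSMutableAttributedString()
    for run in runs {
        var traits: UIFontDescriptor.SymbolicTraits = []
        if run.bold { traits.insert(.traitBold) }
        if run.italic { traits.insert(.traitItalic) }
        let descriptor = baseFont.fontDescriptor.withSymbolicTraits(traits) ?? baseFont.fontDescriptor
        var attributes: [NSAttributedString.Key: Any] = [
            .font: UIFont(descriptor: descriptor, size: baseFont.pointSize)
        ]
        if let link = run.link, let url = URL(string: link) { attributes[.link] = url }
        result.append(NSAttributedString(string: run.text, attributes: attributes))
    }
    return result
}
#endif

// Usage with one item from the HTML above.
let sample = "<li><strong>Daft Punk</strong> - <a href=\"https://www.youtube.com/watch?v=yca6UsllwYs\" target=\"_blank\"><strong><em>Around the World</em></strong></a></li>"
let parsed = try SwiftSoup.parseBodyFragment(sample)
var runs: [StyledRun] = []
if let body = parsed.body() {
    try collectRuns(from: body, inheriting: StyledRun(text: ""), into: &runs)
}
```

Keeping the intermediate `StyledRun` array separate from the UIKit step means the parsing half stays testable on platforms without UIKit, and the mapping from styles to fonts can be swapped out.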

DabbyNdubisi commented 4 years ago

You should be able to use NSAttributedString to achieve this:

import UIKit  // the HTML importer comes from UIKit (AppKit on macOS)

let htmlText = "<your html text>"

// This initializer throws, so it needs `try`.
let attributedString = try NSAttributedString(
    data: Data(htmlText.utf8),
    options: [
        .documentType: NSAttributedString.DocumentType.html,
        .characterEncoding: String.Encoding.utf8.rawValue
    ],
    documentAttributes: nil
)

peterpaulis commented 4 years ago

It would be cool to have this in SwiftSoup, as NSAttributedString.DocumentType.html has some pitfalls.

rizwan95 commented 4 years ago

I agree with @peterpaulis. Is there any way that SwiftSoup can do this? @DabbyNdubisi

zhgchgli0718 commented 1 year ago

Feel free to check out my work: https://github.com/ZhgChgLi/ZMarkupParser

peterpaulis commented 1 year ago

Hi

You should make a CocoaPod

Nice work

zhgchgli0718 commented 1 year ago

Yeah, it supports CocoaPods:

target 'MyApp' do
  pod 'ZMarkupParser', '~> 1.2.5'
end