Closed cluelessoodles closed 6 years ago
Your URL has a space in it. Trying the URL "http://www.sikhnet.com/hukam", the source HTML does not have any element matching div.translation-row.
Thanks. The space in the URL only occurred when I copy-pasted my code here. It is all one line in my test app.
I first tried just .translation-row, then I added div in an attempt to clarify what kind of element it was.
I'm still doing it wrong, that much is obvious. Any idea what I might be doing wrong?
Not sure how often you check in on these, but I still need help with this. I'm able to get the h1 of the page, but not this specific div.
You might get more help on https://stackoverflow.com/ for this type of problem. I don't see any issue with SwiftSoup. The Gurmukhi text does not appear to be Unicode, so yes, you will need to deal with that. I modified your code as follows:
if let data = response.data, let html = String(data: data, encoding: .utf8) {
    do {
        let doc: Document = try SwiftSoup.parse(html)
        let elements = try doc.getAllElements()
        for element in elements {
            switch element.tagName() {
            case "div":
                // Print the class attribute and id of every div in the document
                print("div: \(try element.className()) id: \(element.id())")
            default:
                break
            }
        }
    } catch let error {
        print(error.localizedDescription)
    }
}
This will give you a list of all the divs with their class names and ids. I believe that the text you are trying to scrape is added dynamically.
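A selector query is a shorter way to confirm whether the target element exists in the HTML that the request actually returns. The following is a minimal sketch, not a tested fix for this page: it assumes the response body is already available as a string, and the helper name translationText(in:) is hypothetical. A nil result means the selector matched nothing in the static markup, which would be consistent with the content being added by a script after the page loads.

```swift
import SwiftSoup

// Hypothetical helper (not part of SwiftSoup): returns the text of the first
// div.translation-row, or nil when the selector matches nothing.
func translationText(in html: String) throws -> String? {
    let doc: Document = try SwiftSoup.parse(html)
    let rows: Elements = try doc.select("div.translation-row")
    guard let row = rows.first() else {
        // The element is not present in the HTML that was downloaded.
        return nil
    }
    // text() also includes the text of nested elements such as div.t-english.
    return try row.text()
}
```

Calling this on the string decoded from the Alamofire response and getting nil would point at the page content rather than at the selector syntax.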
@cluelessoodles did you resolve this? @BMinas Thank you for the support.
@scinfu It turned out that the HTML I was trying to access was wrapped in a script on the webpage, so I ended up using a WKScriptMessageHandler delegate to get the text.
Closed due to inactivity. Re-open if necessary.
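For readers who land here with the same problem, the approach described above might look roughly like the sketch below. It is not the poster's actual code: the class name TranslationReader, the handler name "translation", and the use of the div.translation-row selector in the injected script are all assumptions made for illustration. The idea is to load the page in a WKWebView so its scripts run, then have a small injected script post the element's text back to native code.

```swift
import WebKit

// Illustrative sketch; all names here are assumptions, not from the original thread.
// Create and use this on the main thread, as WKWebView requires.
final class TranslationReader: NSObject, WKScriptMessageHandler {
    private static let handlerName = "translation"

    let webView: WKWebView
    // Called with the element's text once the page posts it.
    var onText: ((String) -> Void)?

    override init() {
        // Injected after the document has loaded; posts the text to native code.
        let source = """
        var row = document.querySelector('div.translation-row');
        if (row) {
            window.webkit.messageHandlers.translation.postMessage(row.innerText);
        }
        """
        let script = WKUserScript(source: source,
                                  injectionTime: .atDocumentEnd,
                                  forMainFrameOnly: true)
        let configuration = WKWebViewConfiguration()
        configuration.userContentController.addUserScript(script)
        webView = WKWebView(frame: .zero, configuration: configuration)
        super.init()
        // Register self as the receiver for messages named "translation".
        webView.configuration.userContentController.add(self, name: TranslationReader.handlerName)
    }

    func load(_ url: URL) {
        webView.load(URLRequest(url: url))
    }

    // WKScriptMessageHandler: invoked when the page calls postMessage.
    func userContentController(_ userContentController: WKUserContentController,
                               didReceive message: WKScriptMessage) {
        guard message.name == TranslationReader.handlerName,
              let text = message.body as? String else { return }
        onText?(text)
    }
}
```

Two caveats apply to this sketch. If the page inserts the element some time after the document finishes loading, the injected script may run too early and would need to wait or observe the DOM. Also, the user content controller keeps a strong reference to its message handler, so the handler should be removed with removeScriptMessageHandler(forName:) when it is no longer needed.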
I'm trying to use Alamofire and SwiftSoup to display some body text from a website.
The HTML that I need is in a div with a class, and for some reason SwiftSoup won't read it.
The HTML div is <div class="translation-row"> with another <div class="t-english colorblue"> inside, and when I try to parse it with SwiftSoup like below, it gives me no text. Is there a special way to parse ids with SwiftSoup? I am able to parse div classes.
My viewcontroller code is:
There is another issue that I'm not sure how to solve. Part of the text is in a non-Latin font. Do I need to import a web font for that text, or will SwiftSoup parse it in the characters shown? I haven't successfully parsed that div, so I don't know if it will show up at all, and I wanted to ask the correct way to parse HTML text that is in non-Latin characters.
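One way to approach the non-Latin question is to inspect the parsed string itself. Swift strings are Unicode, so if the page stores the text as real Gurmukhi characters they will come through parsing unchanged. If instead the page stores Latin letters and relies on a special font to draw them as Gurmukhi, the parsed text will be Latin letters and would need that font to display as intended. The sketch below is a rough check under that assumption; the helper name containsGurmukhi is hypothetical.

```swift
// The Gurmukhi script occupies the Unicode block U+0A00 through U+0A7F.
let gurmukhiBlock: ClosedRange<UInt32> = 0x0A00...0x0A7F

// Hypothetical helper: true when at least one Unicode scalar in the text
// falls inside the Gurmukhi block.
func containsGurmukhi(_ text: String) -> Bool {
    return text.unicodeScalars.contains { gurmukhiBlock.contains($0.value) }
}
```

If this returns false for text that looks like Gurmukhi in the browser, the page is probably relying on a font-based encoding rather than Unicode.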