A fast & lightweight XML/HTML parser in Swift that makes your life easier. [Documentation]
Fuzi is based on a Swift port of Mattt Thompson's Ono(斧), using most of its low level implementations with moderate class & interface redesign following standard Swift conventions, along with several bug fixes.
Fuzi(斧子) means "axe", in homage to Ono(斧), which in turn is inspired by Nokogiri (鋸), which means "saw".
let xml = "..."
// or
// let xmlData = <some NSData or Data>
do {
let document = try XMLDocument(string: xml)
// or
// let document = try XMLDocument(data: xmlData)
if let root = document.root {
// Accessing all child nodes of root element
for element in root.children {
print("\(element.tag): \(element.attributes)")
}
// Getting child element by tag & accessing attributes
if let length = root.firstChild(tag:"Length", inNamespace: "dc") {
print(length["unit"]) // `unit` attribute
print(length.attributes) // all attributes
}
}
// XPath & CSS queries
for element in document.xpath("//element") {
print("\(element.tag): \(element.attributes)")
}
if let firstLink = document.firstChild(css: "a, link") {
print(firstLink["href"])
}
} catch let error {
print(error)
}
libxml2
String
or NSData
or [CChar]
AnyObject!
that cause unnecessary type castsUse version 0.4.0 for Swift 2.3.
There are 4 ways you can install Fuzi to your project.
You can use CocoaPods to install Fuzi
by adding it to your to your Podfile
:
platform :ios, '8.0'
use_frameworks!
target 'MyApp' do
pod 'Fuzi', '~> 1.0.0'
end
Then, run the following command:
$ pod install
The Swift Package Manager is now built-in with Xcode 11 (currently in beta). You can easily add Fuzi as a dependency by choosing File > Swift Packages > Add Package Dependency...
or in the Swift Packages tab of your project file and clicking on +
.
Simply use https://github.com/cezheng/Fuzi
as repository and Xcode should automatically resolve the current version.
*.swift
files in Fuzi
directory into your project.Build Settings
:
Search Paths
, add $(SDKROOT)/usr/include/libxml2
to Header Search Paths
.Linking
, add -lxml2
to Other Linker Flags
.Create a Cartfile
or Cartfile.private
in the root directory of your project, and add the following line:
github "cezheng/Fuzi" ~> 1.0.0
Run the following command:
$ carthage update
Then do the followings in Xcode:
Fuzi.framework
built by Carthage into your target's General
-> Embedded Binaries
.Build Settings
, find Search Paths
, add $(SDKROOT)/usr/include/libxml2
to Header Search Paths
.import Fuzi
let xml = "..."
do {
// if encoding is omitted, it defaults to NSUTF8StringEncoding
let document = try XMLDocument(string: html, encoding: String.Encoding.utf8)
if let root = document.root {
print(root.tag)
// define a prefix for a namespace
document.definePrefix("atom", defaultNamespace: "http://www.w3.org/2005/Atom")
// get first child element with given tag in namespace(optional)
print(root.firstChild(tag: "title", inNamespace: "atom"))
// iterate through all children
for element in root.children {
print("\(index) \(element.tag): \(element.attributes)")
}
}
// you can also use CSS selector against XMLDocument when you feels it makes sense
} catch let error as XMLError {
switch error {
case .noError: print("wth this should not appear")
case .parserFailure, .invalidData: print(error)
case .libXMLError(let code, let message):
print("libxml error code: \(code), message: \(message)")
}
}
HTMLDocument
is a subclass of XMLDocument
.
import Fuzi
let html = "<html>...</html>"
do {
// if encoding is omitted, it defaults to NSUTF8StringEncoding
let doc = try HTMLDocument(string: html, encoding: String.Encoding.utf8)
// CSS queries
if let elementById = doc.firstChild(css: "#id") {
print(elementById.stringValue)
}
for link in doc.css("a, link") {
print(link.rawXML)
print(link["href"])
}
// XPath queries
if let firstAnchor = doc.firstChild(xpath: "//body/a") {
print(firstAnchor["href"])
}
for script in doc.xpath("//head/script") {
print(script["src"])
}
// Evaluate XPath functions
if let result = doc.eval(xpath: "count(/*/a)") {
print("anchor count : \(result.doubleValue)")
}
// Convenient HTML methods
print(doc.title) // gets <title>'s innerHTML in <head>
print(doc.head) // gets <head> element
print(doc.body) // gets <body> element
} catch let error {
print(error)
}
import Fuzi
let xml = "..."
// Don't show me the errors, just don't crash
if let doc1 = try? XMLDocument(string: xml) {
//...
}
let html = "<html>...</html>"
// I'm sure this won't crash
let doc2 = try! HTMLDocument(string: html)
//...
Not only text nodes, you can specify what types of nodes you would like to access.
let document = ...
// Get all child nodes that are Element nodes, Text nodes, or Comment nodes
document.root?.childNodes(ofTypes: [.Element, .Text, .Comment])
Looking at example programs is the swiftest way to know the difference. The following 2 examples do exactly the same thing.
Ono
[doc firstChildWithTag:tag inNamespace:namespace];
[doc firstChildWithXPath:xpath];
[doc firstChildWithXPath:css];
for (ONOXMLElement *element in parent.children) {
//...
}
[doc childrenWithTag:tag inNamespace:namespace];
Fuzi
doc.firstChild(tag: tag, inNamespace: namespace)
doc.firstChild(xpath: xpath)
doc.firstChild(css: css)
for element in parent.children {
//...
}
doc.children(tag: tag, inNamespace:namespace)
Ono
Conforms to NSFastEnumeration
.
// simply iterating through the results
// mark `__unused` to unused params `idx` and `stop`
[doc enumerateElementsWithXPath:xpath usingBlock:^(ONOXMLElement *element, __unused NSUInteger idx, __unused BOOL *stop) {
NSLog(@"%@", element);
}];
// stop the iteration at second element
[doc enumerateElementsWithXPath:XPath usingBlock:^(ONOXMLElement *element, NSUInteger idx, BOOL *stop) {
*stop = (idx == 1);
}];
// getting element by index
ONOXMLDocument *nthElement = [(NSEnumerator*)[doc CSS:css] allObjects][n];
// total element count
NSUInteger count = [(NSEnumerator*)[document XPath:xpath] allObjects].count;
Fuzi
Conforms to Swift's SequenceType
and Indexable
.
// simply iterating through the results
// no need to write the unused `idx` or `stop` params
for element in doc.xpath(xpath) {
print(element)
}
// stop the iteration at second element
for (index, element) in doc.xpath(xpath).enumerate() {
if idx == 1 {
break
}
}
// getting element by index
if let nthElement = doc.css(css)[n] {
//...
}
// total element count
let count = doc.xpath(xpath).count
Ono
ONOXPathFunctionResult *result = [doc functionResultByEvaluatingXPath:xpath];
result.boolValue; //BOOL
result.numericValue; //double
result.stringValue; //NSString
Fuzi
if let result = doc.eval(xpath: xpath) {
result.boolValue //Bool
result.doubleValue //Double
result.stringValue //String
}
Fuzi
is released under the MIT license. See LICENSE for details.