tid-kijyun / Kanna

Kanna(鉋) is an XML/HTML parser for Swift.
MIT License
2.42k stars 221 forks source link
html-parser swift xml-parser

Kanna(鉋)

Kanna(鉋) is an XML/HTML parser for cross-platform(macOS, iOS, tvOS, watchOS and Linux!).

It was inspired by Nokogiri(鋸).

CI Platform Cocoapod Carthage compatible Swift Package Manager MIT License

:information_source: Documentation

Features

Installation for Swift 5

CocoaPods

Add the following to your Podfile:

use_frameworks!
pod 'Kanna', '~> 5.2.2'

Carthage

Add the following to your Cartfile:

github "tid-kijyun/Kanna" ~> 5.2.2

For xcode 11.3 and earlier, the following settings are required.

  1. In the project settings add $(SDKROOT)/usr/include/libxml2 to the "header search paths" field

Swift Package Manager

  1. Installing libxml2 to your computer:
// macOS: For xcode 11.3 and earlier, the following settings are required.
$ brew install libxml2
$ brew link --force libxml2

// Linux(Ubuntu):
$ sudo apt-get install libxml2-dev
  1. Add the following to your Package.swift:
// swift-tools-version:5.0
import PackageDescription

let package = Package(
    name: "YourProject",
    dependencies: [
        .package(url: "https://github.com/tid-kijyun/Kanna.git", from: "5.2.2"),
    ],
    targets: [
        .target(
            name: "YourTarget",
            dependencies: ["Kanna"]),
    ]
)
$ swift build

Note: When a build error occurs, please try run the following command:

// Linux(Ubuntu)
$ sudo apt-get install pkg-config

Manual Installation

  1. Add these files to your project:
    Kanna.swift
    CSS.swift
    libxmlHTMLDocument.swift
    libxmlHTMLNode.swift
    libxmlParserOption.swift
    Modules
  2. In the target settings add $(SDKROOT)/usr/include/libxml2 to the Search Paths > Header Search Paths field
  3. In the target settings add $(SRCROOT)/Modules to the Swift Compiler - Search Paths > Import Paths field

Installation for swift 4

Installation for swift 3

Synopsis

import Kanna

let html = "<html>...</html>"

if let doc = try? HTML(html: html, encoding: .utf8) {
    print(doc.title)

    // Search for nodes by CSS
    for link in doc.css("a, link") {
        print(link.text)
        print(link["href"])
    }

    // Search for nodes by XPath
    for link in doc.xpath("//a | //link") {
        print(link.text)
        print(link["href"])
    }
}
let xml = "..."
if let doc = try? Kanna.XML(xml: xml, encoding: .utf8) {
    let namespaces = [
                    "o":  "urn:schemas-microsoft-com:office:office",
                    "ss": "urn:schemas-microsoft-com:office:spreadsheet"
                ]
    if let author = doc.at_xpath("//o:Author", namespaces: namespaces) {
        print(author.text)
    }
}

Donation

If you like Kanna, please donate via GitHub sponsors or PayPal.
It is used to improve and maintain the library.

License

The MIT License. See the LICENSE file for more information.