advancedlogic / GoOse

Html Content / Article Extractor in Golang
Apache License 2.0

Add initialization of go-charset #42

Open bobuhiro11 opened 7 years ago

The init function defined in the go-charset package must be called, but GoOse currently does not do so. The following program reproduces the resulting bug:

$ cat test.go
package main

import (
        // The package is named "goose", which differs from the last path element.
        goose "github.com/advancedlogic/GoOse"
)

func main() {
        g := goose.New()
        url := "http://blog.livedoor.jp/unahide/archives/52966628.html"
        article, _ := g.ExtractFromURL(url)
        println("title", article.Title)
        println("description", article.MetaDescription)
        println("keywords", article.MetaKeywords)
        println("content", article.CleanedText)
        println("url", article.FinalURL)
        println("top image", article.TopImage)
}
$ go run test.go
charset: cannot open "charsets.json": open /usr/local/lib/go-charset/datafiles/charsets.json: no such file or directory
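
To illustrate why the missing initialization produces this error, here is a minimal, self-contained sketch of the pattern. It is a simplified model, not go-charset's actual code: `dataFiles`, `registerDataFile`, and `readDataFile` are hypothetical names, and the fallback directory is copied from the error message above. The assumption is that the library serves data files from an in-memory registry when something has registered them, and otherwise falls back to reading from disk.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// dataFiles is a hypothetical in-memory registry, standing in for whatever
// table the charset library fills during initialization.
var dataFiles = map[string][]byte{}

// fallbackDir is the on-disk location taken from the error message in this
// issue; it is only consulted when a file has not been registered.
var fallbackDir = "/usr/local/lib/go-charset/datafiles"

// registerDataFile is what an initialization step would call for each
// embedded data file (hypothetical helper).
func registerDataFile(name string, content []byte) {
	dataFiles[name] = content
}

// readDataFile prefers registered content and falls back to the filesystem.
func readDataFile(name string) ([]byte, error) {
	if b, ok := dataFiles[name]; ok {
		return b, nil
	}
	b, err := os.ReadFile(filepath.Join(fallbackDir, name))
	if err != nil {
		return nil, fmt.Errorf("charset: cannot open %q: %w", name, err)
	}
	return b, nil
}

func main() {
	// Without initialization, the lookup falls through to the filesystem
	// and fails on machines that lack the data directory.
	if _, err := readDataFile("charsets.json"); err != nil {
		fmt.Println("before init:", err)
	}

	// After the initialization step has run, the same lookup succeeds
	// without touching the disk.
	registerDataFile("charsets.json", []byte(`{}`))
	b, err := readDataFile("charsets.json")
	fmt.Println("after init:", len(b), "bytes, err =", err)
}
```

In the real library, the registration step is, as far as I know, triggered by a blank import of go-charset's `data` package, so the fix requested here would amount to GoOse adding that import. Treat that as an assumption to verify against the go-charset documentation.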