otiai10 / gosseract

Go package for OCR (Optical Character Recognition), by using Tesseract C++ library
https://pkg.go.dev/github.com/otiai10/gosseract
MIT License
2.74k stars 290 forks

[Asahi Linux] undefined: gosseract.NewClient when CGO_ENABLED=0 #311

Open isrealbm opened 5 months ago

isrealbm commented 5 months ago

Summary

go build fails with undefined: gosseract.NewClient when CGO_ENABLED=0:

CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build main.go
main.go:10:22: undefined: gosseract.NewClient

If I set CGO_ENABLED=1 instead, I get other errors:

CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build main.go
# runtime/cgo
linux_syscall.c:67:13: error: call to undeclared function 'setresgid'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
linux_syscall.c:67:13: note: did you mean 'setregid'?
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/unistd.h:593:6: note: 'setregid' declared here
linux_syscall.c:73:13: error: call to undeclared function 'setresuid'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
linux_syscall.c:73:13: note: did you mean 'setreuid'?
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/unistd.h:595:6: note: 'setreuid' declared here

Reproducibility

Reproducibility Frequency

Environment

Darwin JMS-Macbook.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:14:38 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6020 arm64
GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='username/Library/Caches/go-build'
GOENV='username/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='.../mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/.../go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.22.4'
GCCGO='gccgo'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='.../go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/95/kq5nqtfj6cl_2mtb0ftxgjdw0000gn/T/go-build1675637437=/tmp/go-build -gno-record-gcc-switches -fno-common'
go version go1.22.4 darwin/arm64
tesseract 5.4.1
 leptonica-1.84.1
  libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 3.0.0) : libpng 1.6.43 : libtiff 4.6.0 : zlib 1.2.12 : libwebp 1.4.0 : libopenjp2 2.5.2
 Found NEON
 Found libarchive 3.7.4 zlib/1.2.12 liblzma/5.4.6 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.6
 Found libcurl/8.6.0 SecureTransport (LibreSSL/3.3.6) zlib/1.2.12 nghttp2/1.61.0

otiai10 commented 5 months ago

This works for me

CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o out .

beerkeeper commented 4 months ago

Hi,

I'm pretty sure @isrealbm is trying to run this on Asahi Linux on a MacBook with ARM, judging by the GOARCH value (correct me if I'm wrong). Edit: It says Darwin, which I think means macOS. Sorry for the mistake.

I can reproduce the issue myself, but I'm not sure if this is related to this library or due to the missing packages on Asahi Linux. Which exact packages are needed for Fedora?

I have the following installed on Ubuntu 24.04 and the library is working great:

libtesseract-dev/noble,now 5.4.1-1ppa1~noble1 amd64 [installed]
libtesseract5/noble,now 5.4.1-1ppa1~noble1 amd64 [installed]
tesseract-ocr-eng/noble,noble,now 1:4.1.0-2 all [installed,automatic]
tesseract-ocr-osd/noble,noble,now 1:4.1.0-2 all [installed,automatic]
tesseract-ocr/noble,now 5.4.1-1ppa1~noble1 amd64 [installed]

On Asahi Linux I have the following packages installed, and I get the error mentioned in the issue:

tesseract.aarch64
tesseract-devel.aarch64
tesseract-langpack-eng.noarch
tesseract-tessdata-doc.noarch

otiai10 commented 2 months ago

I still don't have an environment to test on Asahi Linux. For Fedora, though, this is a working setup: https://github.com/otiai10/gosseract/blob/main/test/runtimes/fedora.Dockerfile

otiai10 commented 2 months ago

For Asahi, I'll work on it: https://github.com/AsahiLinux/asahi-installer

otiai10 commented 2 months ago

@ayanel-ci test

ericcurtin commented 2 months ago

Could you try building within a podman-machine Linux VM?

It looks like this is an attempt to build for Linux on macOS.
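
The CGO_ENABLED=1 log earlier in the thread fits this reading: the compiler notes point into the macOS SDK's unistd.h, so the host compiler is being asked to build runtime/cgo's Linux sources. As a rough illustration only, here is a small Go pre-build check. The helper name crossCgoWarning and its rule (warn when cgo is explicitly on, the target differs from the host, and the CC environment variable is unset) are assumptions made for this sketch; they are not part of gosseract or the Go toolchain.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// envOr returns the value of the environment variable key,
// or def when the variable is unset or empty.
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// crossCgoWarning is a hypothetical helper. It returns a non-empty message
// when cgo is explicitly enabled for a target that differs from the host
// and no C compiler override is given.
func crossCgoWarning(hostOS, hostArch, targetOS, targetArch, cgoEnabled, cc string) string {
	if cgoEnabled != "1" {
		return "" // cgo was not explicitly requested
	}
	if hostOS == targetOS && hostArch == targetArch {
		return "" // native build: the host C compiler is the right one
	}
	if cc != "" {
		return "" // assumption: the user pointed CC at a cross-compiler
	}
	return fmt.Sprintf(
		"cgo build for %s/%s on %s/%s with CC unset: the host C compiler will likely be used; "+
			"set CC to a cross-compiler or build inside a %s/%s environment",
		targetOS, targetArch, hostOS, hostArch, targetOS, targetArch)
}

func main() {
	msg := crossCgoWarning(
		runtime.GOOS, runtime.GOARCH,
		envOr("GOOS", runtime.GOOS), envOr("GOARCH", runtime.GOARCH),
		os.Getenv("CGO_ENABLED"), os.Getenv("CC"),
	)
	if msg != "" {
		fmt.Fprintln(os.Stderr, "warning:", msg)
		os.Exit(1)
	}
	fmt.Println("no cross-compilation problem detected")
}
```

Running it with the same variables as the failing command (CGO_ENABLED=1 GOOS=linux GOARCH=amd64) on an Apple silicon Mac should print the warning. It only inspects environment variables, so it cannot tell whether a given CC really targets Linux.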

isrealbm commented 2 months ago

Hello guys, sorry for the delay; it has been a busy week. Yes, I am trying to build the application on an ARM64 machine (Apple silicon) and deploy it to a VM running on an AMD64 CPU. I think the problem comes from this library itself: when I installed Tesseract directly in that VM and used another library, everything worked as expected.

nandesh-dev commented 2 months ago

Hey there! I have a similar issue: I cannot compile my project with CGO_ENABLED=0. I am on NixOS (amd64). To be clear, the project works completely fine without that setting, but I need to build a statically linked application, which is currently not possible because gosseract gives this error.

I have also tested this with the ubuntu:24.04 and alpine base images in Docker, and both gave the same result.

Working Docker Example with Dynamically Linked Build

FROM alpine:latest

RUN apk add --no-cache \
    go \
    tesseract-ocr \
    tesseract-ocr-dev \
    leptonica-dev \
    g++ \
    tesseract-ocr-data-eng \
    ffmpeg \
    && rm -rf /var/cache/apk/*

WORKDIR /src

COPY go.mod go.sum ./
RUN go mod download && go mod verify

COPY . .
RUN GOOS=linux GOARCH=amd64 go build -o /bin/subtle/subtle ./cmd/subtle

WORKDIR /media

CMD ["/bin/subtle/subtle"]

Non-Working Docker Example with Statically Linked Build

FROM alpine:latest

RUN apk add --no-cache \
    go \
    tesseract-ocr \
    tesseract-ocr-dev \
    leptonica-dev \
    g++ \
    tesseract-ocr-data-eng \
    ffmpeg \
    && rm -rf /var/cache/apk/*

WORKDIR /src

COPY go.mod go.sum ./
RUN go mod download && go mod verify

COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o /bin/subtle/subtle ./cmd/subtle

WORKDIR /media

CMD ["/bin/subtle/subtle"]

Error

undefined: gosseract.Client
undefined: gosseract.NewClient

The above error occurs at build time (during go build); it is not a runtime error.

Please let me know if you need any more information.

nandesh-dev commented 1 month ago

I made an error in my previous comment that I should clarify. A library like gosseract, which uses native C code, requires CGO to be enabled, so you cannot set CGO_ENABLED=0. The reason I tried disabling CGO was to build my Go project statically. After some digging around, it turns out you have to build both Tesseract and Leptonica statically in order to link them properly. I have made a PR showcasing this.