maxruby / OpenCV.jl

The OpenCV (C++) interface for Julia

OpenCV.jl

The OpenCV (C++) interface for Julia.


OpenCV.jl aims to provide an interface for OpenCV computer vision applications (C++) directly in Julia. It relies primarily on Cxx.jl, the Julia C++ foreign function interface (FFI). OpenCV.jl comes bundled with the Qt framework; though not essential, it supports many convenient GUI functions. The package also contains thin wrappers for common C++ classes (e.g., std::vector, std::string) to make the C++/Julia interface smoother.


The OpenCV API is described here. OpenCV.jl is organized along the following modules:

Currently, OpenCV.jl has Julia wrappers for the core, imgproc, videoio, highgui and video modules. Work is ongoing to wrap the remaining modules, including advanced object detection and tracking algorithms. (Most OpenCV C++ functions are already usable from OpenCV.jl through direct @cxx calls to C++, with some caveats.)

OpenCV.jl has OpenCL support for GPU image processing, made easier by OpenCV's smooth and transparent interface (T-API). GPU-supported code can show up to 30-fold improvements in processing speed, which is invaluable for supporting real-time applications in Julia. See the section below on how to implement GPU-enabled code in OpenCV.jl.

The OpenCV API is extensively documented. Rather than repeating the entire documentation here, the primary focus is on implementing image processing and computer vision algorithms to support Julia applications.

Installation

Install Julia 0.6.0 and Cxx.jl according to the following instructions. On Mac OSX, you can use the pre-compiled shared libraries (.dylib) and headers (.hpp) included in OpenCV.jl. Alternatively, you can compile OpenCV from source with the instructions below.

Note that successfully building Julia 0.6.0 may require upstream updates/fixes. On macOS Sierra 10.12.3, I had to set the following build overrides:

override LLVM_VER=3.9.0
override BUILD_LLVM_CLANG=1
override USE_LLVM_SHLIB=1
# Optional, but recommended
override LLVM_ASSERTIONS=1

OSX

To compile OpenCV 3.2.0 (beta) on a 64-bit OSX system:

# Clone OpenCV from GitHub master branch  #v0.3-beta
$ git clone https://github.com/Itseez/opencv.git opencv
# Enter the cloned repository before inspecting its remotes
$ cd opencv
$ git remote -v

# Create a build directory
$ mkdir build
$ cd build

# Install OpenCV >3.0 (master) *without CUDA*
# Basic installation (the generator is selected with -G)
$ cmake -G "Unix Makefiles" -D CMAKE_PREFIX_PATH="/Users/Max/Qt/5.7/clang_64" -D WITH_OPENGL=ON -D CMAKE_OSX_ARCHITECTURES=x86_64 -D BUILD_PERF_TESTS=OFF -D BUILD_TESTS=OFF -D WITH_CUDA=OFF -D CMAKE_CXX_FLAGS="-std=c++11 -stdlib=libc++" -D CMAKE_EXE_LINKER_FLAGS="-std=c++11 -stdlib=libc++" -D TBB_INCLUDE_DIR="/usr/local/Cellar/tbb/4.3-20141023/include/tbb" -D TBB_LIB_DIR="/usr/local/Cellar/tbb/4.3-20141023/lib" -D WITH_TBB=ON -D WITH_EIGEN=ON -D WITH_QT=OFF -D WITH_OPENEXR=OFF ..

$ make -j4
$ sudo make install

# Confirm installation of OpenCV shared libraries
$ pkg-config --libs opencv

# Confirm directory of OpenCV header files (.hpp)
$ cd /usr/local/include
$ ls opencv2

Linux (Ubuntu)

Download and run OpenCV.jl

Pkg.clone("git://github.com/maxruby/OpenCV.jl.git")
using OpenCV

Basic interface

OpenCV contains hundreds of algorithms and functions. Most frequently used functions for image processing are already accessible in the current version of OpenCV.jl. For simplicity, here I focus on using functions wrapped in OpenCV.jl.

Basic structures

Points (Int, Float)

cvPoint(10, 10)           # x, y
cvPoint2f(20.15, 30.55)
cvPoint2d(40.564, 12.444)

Size and Scalar vectors (Int, Float)

cvSize(300, 300)          # e.g., image width, height
cvSize2f(100.5, 110.6)
cvScalar(255,0,0)         # e.g., [B, G, R] color vector

Ranges

range = cvRange(1,100)   # e.g., row 1 to 100

Rectangle and rotated rectangle

cvRect(5,5,300,300)      # x, y, width, height
# 300x300 rect, centered at (10.5, 10.5) rotated by 0.5 rad
cvRotatedRect(cvPoint2f(10.5, 10.5), cvSize2f(300,300), 0.5)

Creating, copying and converting images

Mat array/image constructors: rows (height), columns (width)

img0 = Mat()                             # empty
img1 = Mat(600, 600, CV_8UC1)            # 600x600 UInt8 gray

imgSize = cvSize(500, 250)
img2 = Mat(imgSize, CV_8UC1)             # 500x250 UInt8 gray

imgColor = cvScalar(255, 0, 0)
img3 = Mat(600, 600, CV_8UC3, imgColor)  # 600x600 UInt8 RGB (blue)

Create a region of interest (ROI)

const roi = cvRect(25, 25, 100, 100);     # create a ROI
img4 = Mat(img3, roi)

Initialize arrays with zeros or ones

zerosM(300,300, CV_8UC3)      # RGB filled with zeros
zerosM(imgSize, CV_8UC1)      # Gray filled with zeros  
ones(300,300, CV_8UC3)        # RGB filled with ones    
const sz = pointer([cint(5)]); # pointer to size of each dimension
ones(2, sz, CV_8UC3)          # 2 x sz        

Create an identity matrix

eye(300,300, CV_8UC3)         # 300x300 UInt8 (RGB)

Clone, copy, convert, basic resizing

img2 = clone(img1);
copy(img1, img2);
alpha=1; beta=0;  # scale and delta factors
convert(img1, img2, CV_8UC3, alpha, beta)
resizeMat(img1, 100, cvScalar(255,0, 0)) # 100 rows, 100 x 100

Operations on image arrays

Addition and subtraction

img1 = Mat(300, 300, CV_8UC3, cvScalar(255, 0, 0));
img2 = Mat(300, 300, CV_8UC3, cvScalar(0, 0, 255));
img3 = imadd(img1, img2)
img4 = imsubstract(img1, img2)

Matrix multiplication

alpha = 1; # weight of the matrix
beta = 0;  # weight of delta matrix (optional)
flag = 0;  # 0 = no transposition; GEMM_1_T, GEMM_2_T or GEMM_3_T transposes the first, second or third input
m1 = ones(3, 3, CV_32F);    #Float32 image
m2 = ones(3, 3, CV_32F);
m3 = zerosM(3, 3, CV_32F);
gemm(m1, m2, alpha, Mat(), beta, m3, flag)
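
As a cross-check on what the call above computes, here is a minimal plain-Julia sketch of the OpenCV gemm definition without transposition flags, dst = alpha*src1*src2 + beta*src3. It does not call OpenCV.jl; gemm_reference is a hypothetical helper name used only for illustration, and a zero matrix stands in for the empty Mat().

```julia
# Plain-Julia reference for cv::gemm with flag = 0 (no transposition):
#   dst = alpha * src1 * src2 + beta * src3
# `gemm_reference` is a hypothetical name, not part of OpenCV.jl.
function gemm_reference(src1, src2, alpha, src3, beta)
    return alpha .* (src1 * src2) .+ beta .* src3
end

m1 = ones(Float32, 3, 3)
m2 = ones(Float32, 3, 3)
src3 = zeros(Float32, 3, 3)   # stands in for the empty Mat() delta matrix
dst = gemm_reference(m1, m2, 1.0f0, src3, 0.0f0)
```

With two 3x3 matrices of ones, every element of the product is the sum of three ones, so the result can be compared element by element against the Mat returned by gemm.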

Accessing pixels and indexing Mat arrays
Image pixels in Mat containers are arranged in row-major order.
For a grayscale image, for example, pixels are addressed by (row, col):

        col 0   col 1   col 2   col 3   col m
row 0   0,0     0,1     0,2     0,3     0,m
row 1   1,0     1,1     1,2     1,3     1,m
row 2   2,0     2,1     2,2     2,3     2,m
row n   n,0     n,1     n,2     n,3     n,m

For RGB color images, each column has 3 values (actually BGR in Mat)

        col 0         col 1         col 2         col m
row 0   0,0 0,0 0,0   0,1 0,1 0,1   0,2 0,2 0,2   0,m 0,m 0,m
row 1   1,0 1,0 1,0   1,1 1,1 1,1   1,2 1,2 1,2   1,m 1,m 1,m
row 2   2,0 2,0 2,0   2,1 2,1 2,1   2,2 2,2 2,2   2,m 2,m 2,m
row n   n,0 n,0 n,0   n,1 n,1 n,1   n,2 n,2 n,2   n,m n,m n,m
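
The layout above can be turned into a simple index calculation. The sketch below assumes a continuous 8-bit image (no padding between rows) and zero-based indices as in C++; pixel_offset is a hypothetical helper, not an OpenCV.jl function.

```julia
# Zero-based element offset of channel `ch` of pixel (row, col) in a
# continuous, row-major, channel-interleaved buffer.
# `pixel_offset` is a hypothetical helper for illustration only.
function pixel_offset(row, col, ch, ncols, nchannels)
    return (row * ncols + col) * nchannels + ch
end

ncols, nchannels = 4, 3       # a 4-column BGR image
first_pixel_red = pixel_offset(0, 0, 2, ncols, nchannels)   # B = 0, G = 1, R = 2
second_row_blue = pixel_offset(1, 0, 0, ncols, nchannels)   # first value of row 1
```

For a grayscale image, set nchannels to 1 and ch to 0, and the expression reduces to row * ncols + col.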

Getting and setting selected pixel values
Method 1: Access pixel values using the pixget and pixset functions. These use the Mat::at class method, which is slow but safe and intended only for checking and setting small numbers of pixels (not for scanning through the entire image). To illustrate, we draw random red pixels on a blue image (i.e., turn them yellow).

# Create a blue image
img = Mat(300, 300, CV_8UC3, cvScalar(255, 0, 0));
# get the value at row 1, col 1
pixget(img, 1, 1)
# create a C++ std::vector (BGR: Red) from a Julia vector
red = tostdvec([float(0), float(0), 255.0])
# turn random pixels yellow
for i=1:1000
    # floor keeps indices within 0:rows-1 and 0:cols-1 (Mat::at is 0-based)
    pixset(img, Int(floor(rand()*rows(img))), Int(floor(rand()*cols(img))), red)
end
# Display (see description for these functions below)
imdisplay(img, "Random art")
closeWindows(0,27,"") # close by pressing ESC

Method 2: Efficient pixel scanning and manipulation using pointers in C++. Functions setgray and setcolor can be used to scan an entire image and replace pixel values. For example, scanning & exchanging the BGR values for all pixels in a 1000x1000 image took approx. 16 ms. Such functions should be modified and optimized for each operation/algorithm.

# Create a green image
img = Mat(1000, 1000, CV_8UC3, cvScalar(0, 255, 0));
color = tostdvec([cint(255), cint(55), cint(0)]) # fuchsia
setcolor(img, color)  
imdisplay(img, "coloring the fast way")
closeWindows(0,27,"")

Opening and saving images

Read and write with full path/name

filename = joinpath(Pkg.dir("OpenCV"), "./test/images/lena.png");
img = imread(filename)
imwrite(joinpath(homedir(), "lena_copy.png"), img)

Alternatively, open and save files with Qt dialog interface

img = imread()
imwrite(img)

Open image with Images.jl and convert to OpenCV Mat
Here we convert a binary image loaded with Images.jl to a Mat image array

using Color, FixedPointNumbers
import Images, ImageView
using OpenCV

filename = joinpath(Pkg.dir("OpenCV"), "./test/images/lena.jpeg")
image = Images.imread(filename)  # load with Images.jl
converted = convertToMat(image);
ImageView.view(image)
imdisplay(converted, "converted to OpenCV Mat")
closeWindows(0,27,"")

Access image properties

printMat(img)           # crude printout of the entire Mat (uchar only)
total(img)              # number of array elements
dims(img)               # dimensions
size(img)               # cvSize(columns, rows)
rows(img)               # rows
cols(img)               # columns
isContinuous(img)       # is stored continuously (no gaps)?
elemSize(img)           # element size in bytes (size_t)
cvtypeval(img)          # Mat type identifier (number)
cvtypelabel(img)        # Mat type label (e.g., CV_8UC1)
depth(img)              # element depth
channels(img)           # number of matrix channels
empty(img)              # is the array empty? (true/false)
ptr(img, 10)            # uchar* or typed pointer for matrix row

Basic image display (GUIs)

# original highgui functions
namedWindow("Lena", WINDOW_AUTOSIZE)
imshow("Lena", img)
moveWindow("Lena", 200, 200)
resizeWindow("Lena", 250, 250)
closeWindows(0,27,"Lena")   # waits until ESC key(27) press to close "Lena"

# custom display functions
imdisplay(img, "Lena")  # optional: window resizing, key press, time
im2tile(imArray, "Tiled images")  # => closeWindows

Image processing

Resize images

dst = clone(img)
resize(img, dst, cvSize(250,250), float(0), float(0), INTER_LINEAR)
imdisplay(img, "Lena")
imdisplay(dst, "Resized Lena")
closeWindows(0,27,"")  # waits for ESC to close all windows

interpolation options:
# INTER_NEAREST - a nearest-neighbor interpolation
# INTER_LINEAR - a bilinear interpolation (used by default)
# INTER_AREA - resampling using pixel area relation
# INTER_CUBIC - a bicubic interpolation over 4x4 pixel neighborhood
# INTER_LANCZOS4 - a Lanczos interpolation over 8x8 pixel neighborhood

Select ROI and copy to another image

filename = joinpath(Pkg.dir("OpenCV"), "./test/images/lena.png")
src = imread(filename)
dst = Mat(cint(rows(src)) + 100, cint(cols(src)) + 100, CV_8UC3, cvScalar(0, 255, 255))
roi = cvRect(Int64(10),Int64(10), Int64(cols(src)), Int64(rows(src)))
final = imreplace(src, dst, roi)  
namedWindow("original", 256)
namedWindow("replace", 256)
imshow("original", src)
imshow("replace", final)
closeWindows(0,27,"")

Change color format

dst = Mat()
cvtColor(img, dst, COLOR_BGR2GRAY)

Blur with a normalized box filter

blurred = clone(img)
blur(img, blurred, cvSize(5,5))
imdisplay(blurred, "Box filter")
closeWindows(0,27,"")

Blur with a Gaussian filter, 5x5 kernel

gaussianBlur(img, dst, cvSize(5,5))
im2tile([img, dst], "Gaussian 5x5")
closeWindows(0,27,"")

Binary thresholding

cvtColor(img, dst, COLOR_BGR2GRAY)
src = clone(dst)
threshold(src, dst, 120, 255, THRESH_BINARY)  # thresh = 120, max = 255
# other methods can be invoked with, e.g., the THRESH_OTSU or THRESH_BINARY_INV flags
imdisplay(img, "Original")
imdisplay(dst, "Thresholded")
closeWindows(0,27, "")
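
For reference, the per-pixel rule behind THRESH_BINARY is dst = maxval if src > thresh, else 0. The following is a minimal plain-Julia sketch of that rule on a small UInt8 matrix; threshold_binary is a hypothetical name and the code does not use OpenCV.jl.

```julia
# Element-wise rule behind THRESH_BINARY: dst = src > thresh ? maxval : 0
# `threshold_binary` is a hypothetical name for illustration only.
function threshold_binary(src::AbstractArray{UInt8}, thresh::Real, maxval::Real)
    return map(p -> p > thresh ? UInt8(maxval) : 0x00, src)
end

gray = UInt8[10 120 121; 200 0 255]
binary = threshold_binary(gray, 120, 255)
```

Note that the comparison is strict, so a pixel equal to the threshold (120 here) maps to 0.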

Convolution

kernel = ones(7,7,CV_32F)
# normalize the 7x7 kernel by its own sum
normkernel = normalizeKernel(kernel, getKernelSum(kernel))
filter2D(img, dst, -1, normkernel)
im2tile([img, dst], "Convolution 7x7")
closeWindows(0,27,"")
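
A box kernel is usually normalized so that its coefficients sum to 1, which keeps the average image brightness unchanged after filtering. The sketch below shows that arithmetic with a plain Julia array; it assumes, without checking the source, that normalizeKernel divides each coefficient by the supplied sum.

```julia
# Normalizing a 7x7 box kernel with plain Julia arrays (no OpenCV.jl calls).
# Assumption: normalizeKernel performs the same division by the kernel sum.
boxkernel = ones(Float32, 7, 7)
kernelsum = sum(boxkernel)                 # 49 coefficients of value 1
normalized = boxkernel ./ kernelsum        # each coefficient becomes 1/49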

Laplacian filter

laplacian(img, dst, -1, 5)          # second-derivative aperture = 5
im2tile([img, dst], "laplacian")  
closeWindows(0,27,"")  

Sobel operator (edge detection)

sobel(img, dst, -1, 1, 1, 3)        # dx = 1, dy = 1, kernel = 3x3
im2tile([img, dst], "sobel")
closeWindows(0,27,"")

Canny edge detection

filename = joinpath(Pkg.dir("OpenCV"), "./test/images/lena.png")
img = imread(filename)
edges = Mat()
threshold1 = 125.0; threshold2 = 350.0
apertureSize = 3; L2gradient = false
Canny(img, edges, threshold1, threshold2, apertureSize, L2gradient)
imdisplay(edges, "canny")
closeWindows(0,27,"")

Image overlay (linear blending)

filename2 = joinpath(Pkg.dir("OpenCV"), "./test/images/mandrill.jpg")
img2 = imread(filename2)
dst = Mat()
alpha = 0.5; beta = 0.2; gamma = 0.6
addWeighted(img, alpha, img2, beta, gamma, dst)
imdisplay(dst, "overlay")
closeWindows(0,27,"")
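
In OpenCV, addWeighted computes dst = saturate(src1*alpha + src2*beta + gamma) for each element, so gamma is a constant offset added after blending. Here is a minimal plain-Julia sketch of that formula for 8-bit data; add_weighted is a hypothetical stand-in and does not call OpenCV.jl.

```julia
# Per-element rule of cv::addWeighted for 8-bit images:
#   dst = saturate(src1 * alpha + src2 * beta + gamma)
# `add_weighted` is a hypothetical plain-Julia stand-in.
function add_weighted(src1::AbstractArray{UInt8}, alpha, src2::AbstractArray{UInt8}, beta, gamma)
    blend(p, q) = UInt8(clamp(round(Int, p * alpha + q * beta + gamma), 0, 255))
    return map(blend, src1, src2)
end

a = UInt8[100 200; 255 0]
b = UInt8[50 100; 255 0]
blended = add_weighted(a, 0.5, b, 0.5, 0.0)      # equal-weight overlay
saturated = add_weighted(a, 1.0, b, 1.0, 0.0)    # sums above 255 are clamped
```

Weights that sum to 1 keep the overall brightness of the overlay close to that of the inputs; larger sums push values toward saturation.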

Image sharpening

filename = joinpath(Pkg.dir("OpenCV"), "./test/images/lena.png")
img = imread(filename)
dst = Mat()
sharpened = Mat()
gaussianBlur(img, dst, cvSize(0, 0), 0.2)
addWeighted(img, 1.5, dst, -0.3, float(0), sharpened)
im2tile([img, sharpened], "sharpened")
closeWindows(0,27,"")

Video acquisition, streaming and writing

Basic video stream display from default camera. All GUI classes/functions (e.g., videoCapture) can be easily called from OpenCV.jl to build new custom video acquisition functions.

videocam()     # press ESC to stop  

The following identifiers can be used (depending on backend) to get/set video properties:

Prepend "CAP_PROP_" to each id below
POS_MSEC       Current position of the video file (msec or timestamp)  
POS_FRAMES     0-based index of the frame to be decoded/captured next
POS_AVI_RATIO  Relative position of the video file: 0 - start of the film, 1 - end of the film
FRAME_WIDTH    Width of the frames in the video stream
FRAME_HEIGHT   Height of the frames in the video stream
FPS            frame rate
FOURCC         4-character code of codec
FRAME_COUNT    Number of frames in the video file
FORMAT         Format of the Mat objects returned by retrieve()
MODE           Backend-specific value indicating the current capture mode
BRIGHTNESS     Brightness of the image (only for cameras)
CONTRAST       Contrast of the image (only for cameras)
SATURATION     Saturation of the image (only for cameras)
HUE            Hue of the image (only for cameras)
GAIN           Gain of the image (only for cameras)
EXPOSURE       Exposure (only for cameras)
CONVERT_RGB    Boolean flags indicating whether images should be converted to RGB
WHITE_BALANCE  Currently not supported
RECTIFICATION  Rectification flag for stereo cameras (note: only supported by DC1394 v 2.x backend currently)

To get video properties, use getVideoId

cam = videoCapture(CAP_ANY)   # cv::VideoCapture
getVideoId(cam, CAP_PROP_FOURCC)   # or set to -1 (uncompressed AVI)

To set video properties, use setVideoId

setVideoId(cam, CAP_PROP_FPS, 10.0)

Close the camera input

release(cam)

Stream videos from the web (requires http link to source file)

vid = "http://devimages.apple.com/iphone/samples/bipbop/bipbopall.m3u8"
webstream(vid)

Write the video stream to disk

cam = videoCapture(vid)
filename = joinpath(homedir(), "myvid.avi")
fps = 25.0
nframes = 250            # default -> nframes = 0, to stop press ESC
frameSize=cvSize(0,0)    # input = output frame size
codec = -1               # fourcc(CV_FOURCC_IYUV)
isColor = true           # color
device = CAP_ANY         # default device
videoWrite(cam, filename, fps, nframes, frameSize, codec, isColor)

Interactive image processing

Videoprocessor is a basic example of a custom C++ class I wrote to support interactive image processing and display with OpenCV in Julia. It may be useful for testing custom C++ image processing algorithms. It accepts single image files, video files and video streams. The basic concept is to create a class for each image processing operation in Videoprocessor (e.g., Thresholding). Currently it supports brightness, contrast and simple thresholding filters. You can retrieve the final values for each of the filter operations as shown below. For more details, see src/Videoprocessor.jl.

# single image file
filename = joinpath(Pkg.dir("OpenCV"), "./test/images/lena.png")
processes = stdvec(cint(0),cint(0))
stdpush!(processes, BRIGHTNESS)# or CONTRAST/THRESHOLD
params = videoprocessor(processes, "Demo", filename,-1, 0, 30, 30, 120, 255, THRESH_BINARY, false)
at(params,0)  # BRIGHTNESS

# video stream
processes = stdvec(cint(0),cint(0))
stdpush!(processes, BRIGHTNESS)
stdpush!(processes, THRESHOLD)
params = videoprocessor(processes, "Videoprocessor")
at(params,0)  # BRIGHTNESS
at(params,1)  # THRESHOLD

Text and drawing functions

Put text on image

filename = joinpath(Pkg.dir("OpenCV"), "./test/images/lena.png")
img = imread(filename)
putText(img, "Hello Lena!", cvPoint(40,40), FONT_HERSHEY_COMPLEX_SMALL, 1.0, cvScalar(255,0,0), 1, LINE_AA, false)
imdisplay(img, "Text")
closeWindows(0,27,"")

Draw geometric shapes (circles, rectangles, etc)

center = cvPoint(260,275)
radius = 30
color = cvScalar(0,0,255)  #red
thickness=4
lineType=LINE_AA
shift = 0
circle(img, center, radius, color, thickness,lineType, shift)
rectangle(img, cvPoint(30,30), cvPoint(150,150), cvScalar(255,0,0), thickness, lineType, shift)
imdisplay(img, "Drawing")
closeWindows(0,27,"")

Advanced interfaces

GPU processing with OpenCL

OpenCV.jl can be accelerated several-fold by processing on the GPU with the OpenCL transparent API (T-API). The only requirement is to declare the image/array as cv::UMat (universal Mat) instead of cv::Mat. For example, a simple RGB to gray image conversion can run about 10 times faster on the GPU than on the CPU (here I used an NVIDIA GeForce GT 330M 512MB, CC 1.2) in OpenCV.jl:

Declare the Mat and UMat (1000x1000 RGB) source and initialize target images
julia> srcMat = Mat(1000, 1000, CV_8UC3, cvScalar(0, 255, 0));
julia> srcUMat = UMat(1000, 1000, CV_8UC3, cvScalar(0, 255, 0));
julia> dstMat = Mat()
julia> dstUMat = UMat()

CPU
julia> @time(cvtColor(srcMat, dstMat, COLOR_BGR2GRAY))
elapsed time: 0.00164426 seconds (80 bytes allocated)

GPU
julia> @time(cvtColor(srcUMat, dstUMat, COLOR_BGR2GRAY))
elapsed time: 0.000149589 seconds (80 bytes allocated)
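
The speed-up implied by the two timings above is simply their ratio, roughly 11x for this single run. The snippet below only redoes that arithmetic with the printed values; a single @time call is a rough measurement, so repeated runs would be needed for a reliable figure.

```julia
# Ratio of the CPU and GPU timings printed above (single-run values).
cpu_seconds = 0.00164426
gpu_seconds = 0.000149589
speedup = cpu_seconds / gpu_seconds
```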

Demos

The scripts in test/jl/tests.jl illustrate how to use basic OpenCV functions directly in Julia. Demos in test/cxx/demos.jl contain both basic and advanced C++ scripts wrapped with Cxx. You can execute run_tests() to check these examples, including basic image creation, conversion, thresholding, live video, trackbars, histograms, drawing, and object tracking.

Applications in computer vision

There is a rich collection of advanced algorithms/modules for computer vision implemented in OpenCV that are likely to be added in the future. A number of them are found in opencv-contrib e.g.,

Extended documentation

Feel free to send questions, comments or file issues here. Extending the documentation is planned in the context of more specialized applications.

Known issues