Image preprocessing pipeline

QiJune commented 4 years ago

Image preprocessing pipeline in GoTorch

a jpg image file ---> image.YCbCr format --> image.NRGBA format ---> Go array --> GoTorch Tensor

Use the image.Decode method to read a jpg image file into YCbCr format
Use the imaging.scan method to convert the YCbCr format to NRGBA format (high cost)
Do some transformations, such as ResizeCrop
Copy the NRGBA image into a contiguous local Go array (high cost)
Create a tensor view of the Go array using the FromBlob method
Make a deep copy of the tensor, because the Go array may be freed (high cost)

Image preprocessing pipeline in PyTorch

a jpg image file ---> PIL image of RGB format ---> PyTorch Tensor

Read a jpg image file into PIL RGB format
Do some transformations, such as ResizeCrop
Create a tensor view of the PIL image, using as_tensor method

Problems

We need to read a jpg image file into RGB format directly
We cannot use the disintegration/imaging library, since it always converts an image to NRGBA format before doing any preprocessing. The RGB data in an NRGBA image is not contiguous, because every pixel carries an extra alpha channel that we do not use.
We need a preprocessing library that handles RGB images directly, so that the memory layout stays contiguous all the time.

Conclusions

It seems that we cannot use the Go image library or the disintegration/imaging library at all.

We need an independent, efficient image preprocessing library.

gocv may be an option.

QiJune commented 4 years ago

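I find that OpenCV (through gocv) is more than 2 times faster: roughly 2.4 to 2.8 times over 100 iterations in the three runs below. The first number printed in each run is the Go image pipeline, the second is gocv.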

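package transforms

import (
    "fmt"
    "image"
    "image/jpeg"
    "os"
    "testing"
    "time"
    "unsafe"

    torch "github.com/wangkuiyi/gotorch"
    "gocv.io/x/gocv"
)

// TestJPG times 100 runs of two jpg-to-tensor pipelines on the same file:
// first the Go image package with the existing transforms, then gocv.
func TestJPG(t *testing.T) {
    fileName := "188242.jpg"
    size := 224

    // Pipeline 1: image/jpeg decode -> Resize -> ToTensor -> deep copy.
    startTime := time.Now()
    for i := 0; i < 100; i++ {
        file, err := os.Open(fileName)
        if err != nil {
            t.Fatal(err)
        }
        img, err := jpeg.Decode(file)
        // Close right away: a defer inside the loop would keep all 100
        // files open until the test returns.
        file.Close()
        if err != nil {
            t.Fatal(err)
        }
        trans1 := Resize(size, size)
        o1 := trans1.Run(img)

        trans2 := ToTensor()
        _ = trans2.Run(o1).Clone()
    }
    fmt.Println(time.Since(startTime).Seconds())

    // Pipeline 2: gocv decode -> RGB -> resize -> float32 in [0, 1] -> tensor view.
    startTime = time.Now()
    for i := 0; i < 100; i++ {
        imgCv := gocv.IMRead(fileName, gocv.IMReadColor)
        if imgCv.Empty() {
            t.Fatalf("cannot read %s", fileName)
        }
        gocv.CvtColor(imgCv, &imgCv, gocv.ColorBGRToRGB)
        gocv.Resize(imgCv, &imgCv, image.Pt(size, size), 0, 0, gocv.InterpolationLinear)
        imgCv.ConvertTo(&imgCv, gocv.MatTypeCV32FC3)
        imgCv.MultiplyFloat(1.0 / 255.0)
        view, err := imgCv.DataPtrFloat32()
        if err != nil {
            t.Fatal(err)
        }
        // The tensor is only a view of the Mat's memory (HWC layout);
        // no deep copy is made in this pipeline.
        tensor := torch.FromBlob(unsafe.Pointer(&view[0]),
            torch.Float, []int64{int64(size), int64(size), 3})
        tensor.Permute([]int64{2, 0, 1}) // HWC -> CHW; result unused here
        // Release the Mat once the view is no longer needed, instead of
        // deferring 100 Close calls to the end of the test.
        imgCv.Close()
    }

    fmt.Println(time.Since(startTime).Seconds())
}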

go test github.com/wangkuiyi/gotorch/vision/transforms -v -run JPG -count=1
=== RUN   TestJPG
0.399983358
0.166660883
--- PASS: TestJPG (0.58s)
PASS
ok      github.com/wangkuiyi/gotorch/vision/transforms  1.318s

go test github.com/wangkuiyi/gotorch/vision/transforms -v -run JPG -count=1
=== RUN   TestJPG
0.42601828
0.149734561
--- PASS: TestJPG (0.59s)
PASS
ok      github.com/wangkuiyi/gotorch/vision/transforms  0.876s

go test github.com/wangkuiyi/gotorch/vision/transforms -v -run JPG -count=1
=== RUN   TestJPG
0.384203016
0.156132424
--- PASS: TestJPG (0.55s)
PASS
ok      github.com/wangkuiyi/gotorch/vision/transforms  0.786s

QiJune commented 4 years ago

We have decided to use gocv for the transforms in GoTorch. gocv provides the following basic operations:

Resize: gocv.Resize
Crop: img.Region
Flip: gocv.Flip

wangkuiyi / gotorch

Image preprocessing pipeline #323
