wangkuiyi / gotorch

A Go idiomatic binding to the C++ core of PyTorch
MIT License

Random errors in mnist #380

Open XiongUp opened 2 years ago

XiongUp commented 2 years ago

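When I run the MNIST example with the command below, training sometimes crashes with a segmentation fault. The crash is intermittent: some runs finish without it, and in the run shown here it happened after epoch 18. How can I resolve this problem?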

go run mnist.go train -epoch 30
2022/04/07 22:28:37 CUDA is valid
2022/04/07 22:28:52 Train Epoch: 0, Loss: 0.0039, throughput: 7698.638501 samples/sec
2022/04/07 22:28:52 Test average loss: 0.0078, Accuracy: 83.32%
2022/04/07 22:28:59 Train Epoch: 1, Loss: 0.0116, throughput: 9216.783733 samples/sec
2022/04/07 22:29:00 Test average loss: 0.0064, Accuracy: 86.60%
2022/04/07 22:29:06 Train Epoch: 2, Loss: 0.0541, throughput: 9619.654450 samples/sec
2022/04/07 22:29:07 Test average loss: 0.0053, Accuracy: 88.97%
......
2022/04/07 22:32:24 Train Epoch: 16, Loss: 0.0003, throughput: 12521.215569 samples/sec
2022/04/07 22:32:24 Test average loss: 0.0016, Accuracy: 96.81%
2022/04/07 22:32:29 Train Epoch: 17, Loss: 0.0015, throughput: 12425.405242 samples/sec
2022/04/07 22:32:30 Test average loss: 0.0016, Accuracy: 96.92%
2022/04/07 22:32:34 Train Epoch: 18, Loss: 0.0002, throughput: 13145.095818 samples/sec
2022/04/07 22:32:35 Test average loss: 0.0015, Accuracy: 97.01%
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x7efcd98f2d58]

runtime stack:
runtime.throw({0x5b77ff?, 0xbed93271bed93271?})
        /usr/local/go/src/runtime/panic.go:992 +0x71
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_unix.go:802 +0x3a9

goroutine 38 [syscall, locked to thread]:
runtime.cgocall(0x5568b0, 0xc00014f528)
        /usr/local/go/src/runtime/cgocall.go:157 +0x5c fp=0xc00014f500 sp=0xc00014f4c8 pc=0x42265c
github.com/wangkuiyi/gotorch._Cfunc_Div(0x7efc501303a0, 0x70226360, 0xc000010610)
        _cgo_gotypes.go:423 +0x4d fp=0xc00014f528 sp=0xc00014f500 pc=0x513bed
github.com/wangkuiyi/gotorch.Div.func1({0xc03f800000?}, {0x2?}, 0x3f80000000000002?)
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/tensor_ops.go:97 +0x9b fp=0xc00014f570 sp=0xc00014f528 pc=0x519d7b
github.com/wangkuiyi/gotorch.Div({0x0?}, {0xc00001c210?})
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/tensor_ops.go:97 +0x45 fp=0xc00014f5b0 sp=0xc00014f570 pc=0x519ca5
github.com/wangkuiyi/gotorch.(*Tensor).Div(...)
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/tensor_ops.go:104
github.com/wangkuiyi/gotorch/vision/transforms.(*NormalizeTransformer).Run(0xc0000d0000, {0xc0000105e0})
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/transforms/normalize.go:40 +0x45 fp=0xc00014f5d8 sp=0xc00014f5b0 pc=0x54e605
runtime.call16(0xc000100690, 0xc0000105f0, 0x0, 0x0, 0x0, 0x10, 0xc00014fb08)
        /usr/local/go/src/runtime/asm_amd64.s:701 +0x49 fp=0xc00014f5f8 sp=0xc00014f5d8 pc=0x47d529
runtime.reflectcall(0x5a8a80?, 0xc0000105e0?, 0x2?, 0x5b12c3?, 0x0?, 0x12?, 0x5a8a80?)
        <autogenerated>:1 +0x3c fp=0xc00014f638 sp=0xc00014f5f8 pc=0x481a3c
reflect.Value.call({0x58c8a0?, 0xc0000d0000?, 0x0?}, {0x5ae89c, 0x4}, {0xc0002520a8, 0x1, 0x0?})
        /usr/local/go/src/reflect/value.go:556 +0x845 fp=0xc00014fc28 sp=0xc00014f638 pc=0x49fb65
reflect.Value.Call({0x58c8a0?, 0xc0000d0000?, 0x0?}, {0xc0002520a8, 0x1, 0x1})
        /usr/local/go/src/reflect/value.go:339 +0xbf fp=0xc00014fca0 sp=0xc00014fc28 pc=0x49f0df
github.com/wangkuiyi/gotorch/vision/transforms.(*ComposeTransformer).Run(0xc000144200?, {0xc0005c9e60?, 0xc0005c9e38?, 0x4326b1?})
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/transforms/transforms.go:30 +0x1d8 fp=0xc00014fdb0 sp=0xc00014fca0 pc=0x54ec58
github.com/wangkuiyi/gotorch/vision/imageloader.(*ImageLoader).collateMiniBatch(0xc000676000, {0xc000144200?, 0x40, 0x40}, {0xc000146000, 0x40, 0x40})
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:230 +0x1c8 fp=0xc00014feb0 sp=0xc00014fdb0 pc=0x550848
github.com/wangkuiyi/gotorch/vision/imageloader.(*ImageLoader).samplesToMinibatches(0xc000676000)
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:182 +0x225 fp=0xc00014ff98 sp=0xc00014feb0 pc=0x550025
github.com/wangkuiyi/gotorch/vision/imageloader.New.func2()
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:98 +0x1d fp=0xc00014ffb0 sp=0xc00014ff98 pc=0x54f83d
github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup.func2()
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:308 +0x43 fp=0xc00014ffe0 sp=0xc00014ffb0 pc=0x550e63
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc00014ffe8 sp=0xc00014ffe0 pc=0x47f201
created by github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:305 +0x31

goroutine 1 [runnable, locked to thread]:
github.com/wangkuiyi/gotorch._Cfunc_ItemFloat64(0x70235d70, 0xc000198000)
        _cgo_gotypes.go:654 +0x4d
github.com/wangkuiyi/gotorch.Tensor.Item.func2({0xc000056db0?}, 0xc000056dd8?)
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/tensor_ops.go:214 +0x4c
github.com/wangkuiyi/gotorch.Tensor.Item({0xc000012040?})
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/tensor_ops.go:214 +0x198
main.train({0x5b805f, 0x2e}, {0x5b7e42, 0x2d}, 0x320, {0x5b36c9, 0x1b})
        /dev/shm/gotorch_projects/test1/mnist.go:84 +0x432
main.main()
        /dev/shm/gotorch_projects/test1/mnist.go:52 +0x428

goroutine 37 [runnable, locked to thread]:
bufio.(*Reader).ReadByte(0xc0005eab40)
        /usr/local/go/src/bufio/bufio.go:262 +0x7a
compress/flate.(*decompressor).huffSym(0xc000662000, 0xc000662028)
        /usr/local/go/src/compress/flate/inflate.go:719 +0x102
compress/flate.(*decompressor).huffmanBlock(0x8c7040?)
        /usr/local/go/src/compress/flate/inflate.go:494 +0x45
compress/flate.(*decompressor).Read(0xc000662000, {0xc000114928, 0x200, 0x4af737?})
        /usr/local/go/src/compress/flate/inflate.go:347 +0x7b
compress/gzip.(*Reader).Read(0xc00011e580, {0xc000114928, 0x200, 0x200})
        /usr/local/go/src/compress/gzip/gunzip.go:251 +0x7a
io.ReadAtLeast({0x5e5d78, 0xc00011e580}, {0xc000114928, 0x200, 0x200}, 0x200)
        /usr/local/go/src/io/io.go:331 +0x9a
io.ReadFull(...)
        /usr/local/go/src/io/io.go:350
archive/tar.(*Reader).readHeader(0xc000114900)
        /usr/local/go/src/archive/tar/reader.go:344 +0x51
archive/tar.(*Reader).next(0xc000114900)
        /usr/local/go/src/archive/tar/reader.go:76 +0x106
archive/tar.(*Reader).Next(0xc000114900)
        /usr/local/go/src/archive/tar/reader.go:51 +0x31
github.com/wangkuiyi/gotorch/vision/imageloader.(*ImageLoader).readSamples(0xc000676000)
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:140 +0x8c
github.com/wangkuiyi/gotorch/vision/imageloader.New.func1()
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:97 +0x1d
github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup.func1()
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:302 +0x43
created by github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:299 +0x25

goroutine 98 [chan send]:
github.com/wangkuiyi/gotorch/vision/imageloader.(*ImageLoader).shuffleSamples(0xc000676120)
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:209 +0x245
created by github.com/wangkuiyi/gotorch/vision/imageloader.New
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:88 +0x3e5

goroutine 40 [chan send, locked to thread]:
github.com/wangkuiyi/gotorch/vision/imageloader.(*ImageLoader).readSamples(0xc000676120)
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:162 +0x24c
github.com/wangkuiyi/gotorch/vision/imageloader.New.func1()
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:97 +0x1d
github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup.func1()
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:302 +0x43
created by github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:299 +0x25

goroutine 41 [chan send, locked to thread]:
github.com/wangkuiyi/gotorch/vision/imageloader.(*ImageLoader).samplesToMinibatches(0xc000676120)
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:182 +0x24e
github.com/wangkuiyi/gotorch/vision/imageloader.New.func2()
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:98 +0x1d
github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup.func2()
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:308 +0x43
created by github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:305 +0x31

goroutine 21 [chan receive, locked to thread]:
github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup.func1()
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:301 +0x52
created by github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:299 +0x25

goroutine 22 [chan receive, locked to thread]:
github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup.func2()
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:307 +0x52
created by github.com/wangkuiyi/gotorch/vision/imageloader.newWorkingThreadGroup
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:305 +0x31

goroutine 33 [chan send]:
github.com/wangkuiyi/gotorch/vision/imageloader.(*ImageLoader).shuffleSamples(0xc000676000)
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:209 +0x245
created by github.com/wangkuiyi/gotorch/vision/imageloader.New
        /home/xjun/GOPATH/pkg/mod/github.com/wangkuiyi/gotorch@v0.0.0-20201028015551-9afed2f3ad7b/vision/imageloader/imageloader.go:88 +0x3e5
exit status 2
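
The dump above is long, so a per-goroutine summary can make it easier to read. In it, goroutine 38 (the one that received the SIGSEGV) is inside a cgo call to `Div` reached from `NormalizeTransformer.Run` in the image loader, while goroutine 1 is inside `ItemFloat64` called from `main.train`. The following is only a triage sketch, not a fix: `ParseGoroutines` and `Goroutine` are hypothetical names that are not part of gotorch, and the helper assumes the standard `goroutine N [state]:` header format that the Go runtime prints.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// goroutineHeader matches header lines such as
// "goroutine 38 [syscall, locked to thread]:".
var goroutineHeader = regexp.MustCompile(`^goroutine (\d+) \[([^\]]+)\]:$`)

// Goroutine holds one goroutine header and the first frame printed below it.
// (Hypothetical helper type, used only for this illustration.)
type Goroutine struct {
	ID       string
	State    string
	TopFrame string
}

// ParseGoroutines scans a Go crash dump and returns one entry per
// goroutine header, in the order the headers appear.
func ParseGoroutines(dump string) []Goroutine {
	var out []Goroutine
	needFrame := false
	sc := bufio.NewScanner(strings.NewReader(dump))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if m := goroutineHeader.FindStringSubmatch(line); m != nil {
			out = append(out, Goroutine{ID: m[1], State: m[2]})
			needFrame = true
			continue
		}
		// The first non-empty line after a header is the innermost frame.
		if needFrame && line != "" {
			out[len(out)-1].TopFrame = line
			needFrame = false
		}
	}
	return out
}

func main() {
	// Two headers copied from the dump above, each with its first frame.
	dump := `goroutine 38 [syscall, locked to thread]:
runtime.cgocall(0x5568b0, 0xc00014f528)

goroutine 1 [runnable, locked to thread]:
github.com/wangkuiyi/gotorch._Cfunc_ItemFloat64(0x70235d70, 0xc000198000)
`
	for _, g := range ParseGoroutines(dump) {
		fmt.Printf("goroutine %s [%s] top frame: %s\n", g.ID, g.State, g.TopFrame)
	}
}
```

Feeding the full dump to `ParseGoroutines` would list every goroutine with its state, which helps when comparing crashes from several runs to see whether the faulting goroutine is always in the same cgo call.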