DLR-RM / AugmentedAutoencoder

Official Code: Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
MIT License

Segmentation Fault (core dumped) #10

Closed eyildiz-ugoe closed 5 years ago

eyildiz-ugoe commented 5 years ago

I keep getting a segmentation fault no matter which model I feed in. I have tried five so far, none of them scaled, and I have no idea what is going on. I think this can be registered as a bug.

Instructions for updating:
Use the retry module or similar alternatives.
128 128 3
[[8, 8], [16, 16], [32, 32], [64, 64]]
(?, 128, 128, 3)
(?, 128, 128, 3)
Segmentation fault (core dumped)  

Here are some models I tried: model3, model4, model5

Here is my hw setup:

Kernel: 4.15.0-45-generic x86_64 bits: 64 gcc: 7.3.0
Desktop: Gnome 3.28.3 (Gtk 2.24.32) 
Distro: Ubuntu 18.04.1 LTS
Machine:   Device: desktop 
Mobo: ASUSTeK 
model: P9X79 PRO v: Rev 1.xx serial: N/A
BIOS: American Megatrends v: 2002 date: 06/18/2012
CPU:       6 core Intel Core i7-3930K (-MT-MCP-) arch: Sandy Bridge rev.7 cache: 12288 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 38425
           clock speeds: max: 3800 MHz 1: 1406 MHz 2: 1313 MHz 3: 1401 MHz 4: 1240 MHz 5: 1232 MHz
           6: 1405 MHz 7: 1327 MHz 8: 1524 MHz 9: 1503 MHz 10: 1377 MHz 11: 1935 MHz 12: 1285 MHz
Graphics:  Card: NVIDIA GP106 [GeForce GTX 1060 6GB] bus-ID: 01:00.0
           Display Server: x11 (X.Org 1.19.6 )
           drivers: nvidia (unloaded: modesetting,fbdev,vesa,nouveau)
           Resolution: 1600x1200@60.00hz, 1920x1200@59.95hz
           OpenGL: renderer: GeForce GTX 1060 6GB/PCIe/SSE2
           version: 4.6.0 NVIDIA 390.77 Direct Render: Yes
Audio:     Card-1 NVIDIA GP106 High Def. Audio Controller driver: snd_hda_intel bus-ID: 01:00.1
           Card-2 Intel C600/X79 series High Def. Audio Controller
           driver: snd_hda_intel bus-ID: 00:1b.0
           Card-3 Logitech Headset H340 driver: USB Audio usb-ID: 002-005
           Card-4 Logitech Webcam C270 driver: USB Audio usb-ID: 002-006
           Sound: Advanced Linux Sound Architecture v: k4.15.0-45-generic
Network:   Card: Intel 82579V Gigabit Network Connection
           driver: e1000e v: 3.2.6-k port: f040 bus-ID: 00:19.0
           IF: eno1 state: up speed: 1000 Mbps duplex: full mac: 30:85:a9:98:15:7f
Drives:    HDD Total Size: 250.1GB (47.3% used)
           ID-1: /dev/sda model: Samsung_SSD_850 size: 250.1GB temp: 0C
Partition: ID-1: / size: 229G used: 111G (51%) fs: ext4 dev: /dev/sda1
RAID:      No RAID devices: /proc/mdstat, md_mod kernel module present
Sensors:   System Temperatures: cpu: 30.0C mobo: N/A gpu: 1.0:35C
           Fan Speeds (in rpm): cpu: 0
Info:      Processes: 418 Uptime: 2:59 Memory: 4648.7/15982.4MB
           Init: systemd runlevel: 5 Gcc sys: 6.5.0 Client: Shell (bash 4.4.191) inxi: 2.3.56

MartinSmeyer commented 5 years ago

Can you please list your hardware setup?

eyildiz-ugoe commented 5 years ago

Edited

MartinSmeyer commented 5 years ago

Sorry, but I can't reproduce your error. Training with MODEL: cad works fine on your model (even without scaling, although the training views are all black in that case). I just tried this on five different computers with the following graphics cards:

GeForce GTX 1080 Ti (11GB), GeForce GTX 1080 (8GB), Quadro K620 (2GB), Tesla K20c (4.7GB), GeForce GTX Titan X (12GB)

Driver Version: 390.77

Only the Quadro gave me an OutOfMemory error or a Segmentation Fault.

Which TensorFlow version are you using? Does the memory of your graphics card or your RAM fill up before the segmentation fault? Can you get a backtrace of the segmentation fault?

Thanks for your help.
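One way to answer the memory question is to poll `nvidia-smi` from a second terminal while training runs. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it assumes `nvidia-smi` is on the PATH. Note that TensorFlow 1.x reserves most of the GPU memory up front by default, so a high reading alone is not conclusive.

```python
import subprocess
import time


def parse_gpu_memory(csv_text):
    """Turn nvidia-smi CSV lines like '5912, 6078' into (used_mib, total_mib) tuples."""
    readings = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        readings.append((used, total))
    return readings


def query_gpu_memory():
    """Ask nvidia-smi for used/total memory of every GPU (requires the NVIDIA driver)."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        universal_newlines=True,
    )
    return parse_gpu_memory(out)


def watch_gpu_memory(interval_s=1.0, samples=60):
    """Print one reading per interval; hypothetical helper for manual monitoring."""
    for _ in range(samples):
        for index, (used, total) in enumerate(query_gpu_memory()):
            print("GPU %d: %d / %d MiB" % (index, used, total))
        time.sleep(interval_s)
```

Calling `watch_gpu_memory()` just before starting the training command shows whether usage climbs toward the card's limit in the seconds before the crash.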

eyildiz-ugoe commented 5 years ago

I have Tensorflow 1.7.0 with CUDA 9.

I don't know how to debug this error, to be honest. Which file is triggering it? What should I check? I only run the command as instructed: ae_train exp_group/my_autoencoder -d
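For finding out which file is involved, Python's standard `faulthandler` module can dump the Python-level traceback at the moment of a segmentation fault. The sketch below assumes a Python 3 interpreter (the thread does not say which Python version is in use), and the helper name and log file name are hypothetical. It would have to be called early in the entry script, before the crashing code runs.

```python
import faulthandler

# Kept at module level so the file stays open for the lifetime of the process;
# faulthandler writes to its file descriptor when the crash happens.
_crash_log = None


def enable_crash_traceback(log_path="segfault_traceback.log"):
    """Write a Python traceback of all threads to log_path on SIGSEGV and similar signals."""
    global _crash_log
    _crash_log = open(log_path, "w")
    faulthandler.enable(file=_crash_log, all_threads=True)
    return faulthandler.is_enabled()
```

If the command is a Python 3 script, setting the environment variable `PYTHONFAULTHANDLER=1` before running it should have a similar effect without editing any code, with the traceback going to stderr. The last Python frame in the dump indicates which call entered the native code that crashed.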

I could try this on another computer (remote connection, no GUI), but your code seems to open windows, so it doesn't run on remote machines. If there is a way to disable the GUI, such as commenting out the lines of code that cause popups or anything else GUI-related, I could do that and run it on that computer.
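As an alternative to commenting lines out by hand, GUI calls can be guarded by a check for an available display. This is a generic sketch with hypothetical helper names; the thread does not identify which lines in the repository open windows, and the guard only covers window popups, not any rendering step that may itself need a graphics context.

```python
import os


def display_available(env=None):
    """Return True if an X11 or Wayland display is advertised in the environment."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


def maybe_show(show_fn, *args, **kwargs):
    """Call a GUI function only when a display exists; otherwise skip it silently."""
    if display_available():
        return show_fn(*args, **kwargs)
    return None
```

A call that opens a window would then be wrapped as `maybe_show(some_window_function, ...)`, where `some_window_function` stands for whatever call produces the popup.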

beingkartik commented 5 years ago

Hey, were you able to solve this problem?

Update: #9

flugenheimer commented 5 years ago

I get this when using CAD files with the model set to cad, but when running with reconst it works. I have tried it on the T-LESS models: http://cmp.felk.cvut.cz/t-less/download.html