wenbowen123 / BundleTrack

[IROS 2021] BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models

illegal instruction (core dumped) #43

Closed: eneserdo closed this issue 2 years ago

eneserdo commented 2 years ago

Thanks for the great work. I was trying to run predictions on YCBInEOAT by following your guide, and I ran the command:

python scripts/run_ycbineoat.py --data_dir ycb_dir/bleach0 --port 5555 --model_name 021_bleach_cleanser

It gave this error:

/home/airlab/enes/bundle/BundleTrack/scripts/../build/bundle_track_ycbineoat /tmp/config_ycb_dir.yml illegal instruction (core dumped)

At first I suspected the TensorFlow version, because older CPUs do not support the AVX instructions that newer TensorFlow builds use. However, the lfnet container runs without errors, and the crash happens in the main container, which, as far as I know, does not contain TensorFlow.

Here is my lscpu | grep Flags output to compare with yours:

flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt aes lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d

What could cause this, and how can I solve it? Thanks in advance.

wenbowen123 commented 2 years ago

I'm not sure what causes this. Most people don't seem to hit this problem; they were able to run the code, and nobody mentioned it in previous issues. Here are my machine's flags for reference anyway:

fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti retpoline intel_ppin tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts

Perhaps you can first identify whether the error occurs in my code or in a third-party library, e.g. OpenCV/PCL.
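Comparing the two flag lists above by hand is error-prone, so here is a minimal sketch that diffs them. The two lists are copied from the lscpu outputs posted in this thread. The COMPILER_EXTENSIONS set is an assumption: a hand-picked list of instruction-set extensions that an optimized build might emit. The thread does not confirm which instruction actually triggered the crash.

```python
# Flags copied verbatim from the two lscpu outputs in this thread.
REPORTER_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm "
    "constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc "
    "cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 "
    "cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt aes lahf_lm pti ssbd ibrs ibpb "
    "stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d"
).split()

MAINTAINER_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp "
    "lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc "
    "cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 "
    "sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt "
    "tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb "
    "invpcid_single pti retpoline intel_ppin tpr_shadow vnmi flexpriority ept "
    "vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt "
    "cqm_llc cqm_occup_llc dtherm ida arat pln pts"
).split()

# ASSUMPTION: extensions an optimized build might use. This list is
# illustrative and is not taken from the BundleTrack build files.
COMPILER_EXTENSIONS = {"avx", "avx2", "fma", "f16c", "bmi1", "bmi2"}

# Flags present on the maintainer's machine but absent on the reporter's.
only_on_maintainer = sorted(set(MAINTAINER_FLAGS) - set(REPORTER_FLAGS))

# Narrow the difference down to the extensions of interest.
missing_extensions = sorted(set(only_on_maintainer) & COMPILER_EXTENSIONS)

print("Only on maintainer's CPU:", " ".join(only_on_maintainer))
print("Missing extensions of interest:", " ".join(missing_extensions))
```

The reporter's CPU lacks avx, avx2, fma, f16c, bmi1, and bmi2, all of which appear in the maintainer's list. If the binary or one of its dependencies was compiled with any of these enabled, an illegal-instruction crash on the reporter's machine would be the expected result, though that remains a hypothesis here.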

ChenxMa commented 8 months ago

Hi, it may have been too long ago to remember, but I ran into the same problem. Have you found a solution?

eneserdo commented 8 months ago

@ChenxMa I don't really remember; I may not have used it afterward. But it was probably due to the CPU. Did you try it on a different machine?

ChenxMa commented 8 months ago

Yes, I switched to a different machine, and the code runs fine.
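Before moving to another machine, it can help to check whether that machine's CPU advertises the extensions in question. The following is a minimal sketch, assuming a Linux x86 host where /proc/cpuinfo contains a "flags" line. The ASSUMED_REQUIRED set is a guess for illustration; the thread does not establish which extensions the binary actually needs. The function names are hypothetical helpers, not part of BundleTrack.

```python
from pathlib import Path

# ASSUMPTION: extensions the binary might need. Not confirmed by the thread.
ASSUMED_REQUIRED = {"avx", "avx2", "fma"}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. non-x86 host)


def missing_flags(cpuinfo_text, required=ASSUMED_REQUIRED):
    """Return the required flags that the CPU does not advertise, sorted."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        missing = missing_flags(cpuinfo.read_text())
        if missing:
            print("CPU does not advertise:", " ".join(missing))
        else:
            print("CPU advertises all assumed-required extensions.")
    else:
        print("/proc/cpuinfo not found; this check only works on Linux.")
```

An empty result only means the assumed extensions are present; it does not guarantee the binary will run, since the actual requirement is unknown.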