CentOS / sig-cloud-instance-build

CentOS Cloud Instance SIG: Metadata to build & release instances
kernel panic when shutting down qemu #93

Open asantos82 opened 7 years ago

asantos82 commented 7 years ago

Hi,

I was trying to build a tar rootfs for use with Docker, but the build fails when QEMU shuts down, with what looks like a kernel panic. I cannot figure out what is wrong.

Can you help me?

Attaching the virt-install.log

virt-install.log.txt

Thanks

lpancescu commented 7 years ago

This looks like a memory corruption issue that the kernel detects. It looks like more than just a single flipped bit:

>>> "{:#016x}".format(0xffff88002e857e08 ^ 0xffff88002d16e368)
'0x00000003939d60'

You could check whether other panics affect similar bits. I would suspect bad memory on the host - perhaps running a memory checking tool like memtest86+ will find something.
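To compare several panics, the XOR check can be wrapped in a small helper that lists which bit positions differ. This is only a minimal sketch: the name `flipped_bits` is made up for illustration, and the only inputs used are the two values from the log compared above.

```python
def flipped_bits(x, y):
    """Return the bit positions (0 = least significant) where x and y differ."""
    diff = x ^ y
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# The two values from the log that were XORed above.
a = 0xffff88002e857e08
b = 0xffff88002d16e368

positions = flipped_bits(a, b)
print("{:#018x}".format(a ^ b))
print(len(positions), "bits differ:", positions)
```

If another panic showed differences in the same positions, that would point more strongly at a specific faulty memory region; widely scattered differences would point elsewhere.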

asantos82 commented 7 years ago

Hi @lpancescu ,

I ran memtest86+ on the host and it found no problems (photo attached).

This happens when building the image on "bare metal" running Fedora.

lpancescu commented 7 years ago

Hi @asantos82,

I found something similar to your log messages in the upstream changelog for kernel 4.9.15. Please search for commit 48e2181b0b8d1a1e226b2932a11d6f94aef28fb8 in the changelog (the stack trace looks different from yours, though). Fedora 25 is currently on 4.9.14, but I see 4.10.5 available in Bodhi, if you want to test with that.
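To check where a host kernel sits relative to 4.9.15, the release string printed by `uname -r` can be compared numerically. This is a rough sketch: `parse_release` and `is_at_least` are hypothetical helper names, and the comparison says nothing about whether a given kernel actually carries the commit, since distributions may backport fixes to older version numbers.

```python
import re


def parse_release(release):
    """Extract (major, minor, patch) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: {!r}".format(release))
    return tuple(int(part) for part in match.groups())


def is_at_least(release, minimum):
    """True if the leading version of `release` is >= the `minimum` tuple."""
    return parse_release(release) >= minimum


# Compare the versions mentioned above against 4.9.15.
for release in ("4.9.14", "4.10.5"):
    print(release, is_at_least(release, (4, 9, 15)))
```

Comparing tuples of integers avoids the usual string-comparison trap where "4.10" would sort before "4.9".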

In any case, I don't think this is an issue with our images or build process; if I were you, I would try to file a bug with Fedora against the kernel package, or ask on the Fedora support channels. Another possibility would be to switch to an enterprise distro like CentOS or Debian (stable); they tend to be less affected by such bugs than bleeding-edge distros like Fedora.

asantos82 commented 7 years ago

Hi @lpancescu

I cannot try 4.10.5 right now, but I will try it once it is released in the Fedora repo.

I tried building the image on another "bare metal" host with

cat /etc/redhat-release 
CentOS Linux release 7.3.1611 (Core) 
uname -r
3.10.0-514.6.1.el7.x86_64

with success!!

2017-03-24 14:27:58,908: Disk Image install successful
2017-03-24 14:27:58,909: SUMMARY
2017-03-24 14:27:58,909: -------
2017-03-24 14:27:58,909: Logs are in /tmp
2017-03-24 14:27:58,909: Disk image is at /var/tmp/centos-7-docker.tar.xz
2017-03-24 14:27:58,909: Results are in /var/tmp

real    5m2.894s
user    1m21.512s
sys     0m1.996s
Disk image is at /root/asantos_tst/centos-7-docker.tar.xz

Thanks for your help

asantos82 commented 7 years ago

Hi,

I tried again on the Fedora host (4.10.8-200.fc25.x86_64) to build the CentOS tar rootfs, and it keeps failing.

Best Regards, André

lpancescu commented 7 years ago

Hi André,

That's unfortunate. Perhaps you can file a kernel bug with Fedora, if there isn't one already.

Best regards, Laurențiu

lpancescu commented 7 years ago

@asantos82 @kbsingh Since this seems to be a regression in the Fedora kernel, not a problem with our Vagrant images, and there has been no further feedback since April, I would propose closing this issue.

asantos82 commented 7 years ago

Hi,

I am on 4.11.11-200.fc25.x86_64 and am still having this issue.