Unicamp-OpenPower / minicloud

Minicloud website
https://openpower.ic.unicamp.br/minicloud/
MIT License

power9 machines are very sluggish #26

Open kraj opened 4 years ago

kraj commented 4 years ago

When a ppc9 machine is chosen (from the Availability Zone), it cannot boot. I tried Ubuntu 19.04/18.04 and Debian 9; Fedora 29 seems to boot OK. Debian-like systems get stuck in the console as shown below. power8 boots OK on Debian-like systems.

[Screenshot: Screen Shot 2020-01-30 at 4 24 53 PM]

rpsene commented 4 years ago

@sitio-couto could you pair with @lcnzg today and check what is going on?

sitio-couto commented 4 years ago

@kraj, I was unable to reproduce the issue; all Debian derivatives seem to work. Are you still having this issue? If so, is there any custom configuration you used when creating the instance?

kraj commented 4 years ago

I just open the SSH port and launch the largest instance, nothing special.

rpsene commented 4 years ago

@sitio-couto take a look at the placement, maybe one of the P9s is the issue.

kraj commented 4 years ago

> @sitio-couto take a look at the placement, maybe one of the P9s is the issue.

I created the instance 4-5 times and hit the same issue every time. Fedora 29 booted but was very, very slow on p9. I have now provisioned a p8 VM and it's chugging along well with Ubuntu 19.04; I can build Yocto images.

sitio-couto commented 4 years ago

> @sitio-couto take a look at the placement, maybe one of the P9s is the issue.

Both P9s are scheduling the Debian derivatives, although the load is unbalanced and launch times are a bit high (about 5-8 minutes). I'm doing more tests to see if I can get an instance stuck during boot.

rpsene commented 4 years ago

@kraj I was able to boot Ubuntu 18.04 and 19.04 on P9.

[Screenshot: Screen Shot 2020-01-31 at 16 36 42]

kraj commented 4 years ago

@rpsene OK. I have exhausted my compute quota provisioning my p8 VM for now. Once I have done some build tests, I will tear it down and reinstall with Ubuntu 19.04 on p9. How long should I wait for it to boot?

kraj commented 4 years ago

OK, I launched the instance and let it boot. The next day, the console showed the login prompt, and SSH also works, as seen below. But the instance is visibly slow. Perhaps that's another issue?

ubuntu@yocto-power9:~$ cat /proc/cpuinfo
processor       : 0
cpu             : POWER9 (architected), altivec supported
clock           : 2250.000000MHz
revision        : 2.2 (pvr 004e 1202)
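
For anyone who wants to confirm from inside a guest that it really landed on a POWER9 host, the output above can be checked programmatically. This is only a minimal sketch, assuming the `/proc/cpuinfo` layout shown in this comment; the helper names `parse_cpuinfo` and `is_power9` are hypothetical and not part of Minicloud or any library.

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into one dict per block.

    Blocks are separated by blank lines or by a new "processor" key.
    """
    blocks = []
    current = {}
    for line in text.splitlines():
        if ":" not in line:
            # A blank line closes the current block; other lines are skipped.
            if current and not line.strip():
                blocks.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "processor" and current:
            blocks.append(current)
            current = {}
        current[key] = value.strip()
    if current:
        blocks.append(current)
    return blocks


def is_power9(block):
    """Return True if the block's "cpu" field starts with POWER9."""
    return block.get("cpu", "").upper().startswith("POWER9")


# Sample taken from the output posted in this comment.
SAMPLE = """processor       : 0
cpu             : POWER9 (architected), altivec supported
clock           : 2250.000000MHz
revision        : 2.2 (pvr 004e 1202)
"""

cpus = parse_cpuinfo(SAMPLE)
print(len(cpus), is_power9(cpus[0]), cpus[0]["clock"])
```

On a real guest, the text would come from reading `/proc/cpuinfo` instead of `SAMPLE`.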

sitio-couto commented 4 years ago

Hey kraj. You're right, the POWER9 machines are quite sluggish. We're trying to pinpoint where the problem is, and as soon as we find and fix it we'll let you know. For now, it might be better to stick with the Power8 availability zone. Apologies for the inconvenience.

kraj commented 4 years ago

@sitio-couto thanks for the confirmation. No issues, I am happy to test it out. Let me know when you have it fixed. I started an update on the p9 VM about 6 hours ago and it has not finished. It should usually take less than 5 minutes.

sitio-couto commented 4 years ago

Hey @kraj. We've finished updating and testing the P9 machines. They seem to be running normally now. Give them a try later and let us know if you have any issues.

kraj commented 4 years ago

I launched a max instance and it boots, but I can't SSH into it. I see:

ssh: connect to host minicloud1 port XXXXX: No route to host
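
When debugging an error like the one above, it can help to tell apart "no route to host", "connection refused", and a plain timeout, since they usually point at different layers (routing or instance networking, nothing listening on the port, or silently dropped packets). The following is a minimal sketch using only the Python standard library; the function names `classify_connect_error` and `probe` are hypothetical, and the host and port passed to `probe` are placeholders for whatever the dashboard assigns.

```python
import errno
import socket


def classify_connect_error(exc):
    """Map a connection exception to a coarse category string."""
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, ConnectionRefusedError):
        return "refused"
    if isinstance(exc, OSError) and exc.errno in (
        errno.EHOSTUNREACH,
        errno.ENETUNREACH,
    ):
        # EHOSTUNREACH is what ssh reports as "No route to host".
        return "no-route"
    return "other"


def probe(host, port, timeout=5.0):
    """Try a TCP connection and return "open" or an error category."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except OSError as exc:
        return classify_connect_error(exc)
```

For example, `probe("minicloud.parqtec.unicamp.br", 22)` would return one of `"open"`, `"timeout"`, `"refused"`, `"no-route"`, or `"other"`; port 22 here is only a stand-in for the forwarded port of the instance.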

rpsene commented 4 years ago

@kraj did you enable ssh in the security group? ssh @minicloud.parqtec.unicamp.br -p xxxxx

kraj commented 4 years ago

> @kraj did you enable ssh in the security group? ssh @minicloud.parqtec.unicamp.br -p xxxxx

Yes, I did. The same works OK when I launch Ubuntu 19.04 on a ppc8 box.

rpsene commented 4 years ago

@kraj I deployed many VMs on P9 yesterday without any issues. I need to understand the steps you are using and whether or not you are customizing something.

kraj commented 4 years ago

The fact that I launch a power8 machine with the exact same process kind of validates my approach. I just forcibly chose power9 when selecting the machine type.

sitio-couto commented 4 years ago

Hey @kraj. Besides adding the SSH rule to the default security group, could you double-check whether the following configurations were correct when creating and accessing the instance:

kraj commented 4 years ago

@sitio-couto right. I followed the exact same steps. Booting power9 does take a bit longer. I also tried deleting and re-adding the SSH rule after the system was fully booted, but to no avail. I launch a max instance and choose power9 while building the VM.

sitio-couto commented 4 years ago

In this case, do you remember the VM's IP, or have you tried launching another VM with a different IP?

kraj commented 4 years ago

I remember my current VM's IP, but not the old one, sorry. Unfortunately, I have created a max instance, spending all my quota. I can tear it down and try again, perhaps on the weekend, and pass along the IP.

bencz commented 4 years ago

I'm getting the same error on a P9 machine...

The log:

Build Date = Dec 13 2017 13:46:58
FW Version = buildd@ release 20170724
Press "s" to enter Open Firmware.

C0000C0100C0120C0140C0200C0240C0260C02E0C0300C0320C0340C0360C0370C0380C0371C0372C0373C0374C03F0C0400C0480C04C0C04D0C0500Populating /vdevice methods
Populating /vdevice/vty@30000000
Populating /vdevice/nvram@71000000
C05A0Populating /pci@800000020000000
 00 0800 (D) : 1af4 1000 virtio [ net ]
 00 1000 (D) : 1b36 000d serial bus [ usb-xhci ]
 00 1800 (D) : 1af4 1001 virtio [ block ]
 00 2000 (D) : 1af4 1002 unknown-legacy-device*
 00 2800 (D) : 1234 1111 qemu vga
C0600C06C0C0700C0800C0880No NVRAM common partition, re-initializing...
C0890C08A0C08A8Installing QEMU fb

C08B0Scanning USB
 XHCI: Initializing
 USB Keyboard
C08C0C08D0No console specified using screen & keyboard
C08E0C08E8C08FF
Welcome to Open Firmware

Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php

Trying to load: from: /pci@800000020000000/scsi@3 ... Successfully loaded
HV-RTAS-UPDATE error: -4
Linux ppc64le
1 SMP Debian 4.

And the system stays stuck at this step.

martin-frbg commented 4 years ago

Are there any updates on expected POWER9 availability? It seems one can only choose POWER8 as the availability zone right now, no matter which of the provided images one selects.

rpsene commented 4 years ago

@martin-frbg not yet. We need to get a dependency fixed, since the issue is happening with the KVM version set for the version of OpenStack we have configured. We hope to get the P9s back soon.

kraj commented 4 years ago

My account is disabled, so I have no way to validate.

gregorkistler commented 2 years ago

@rpsene do you have any news about POWER9 availability? :) As Ubuntu 22.04 only runs on POWER9+, I'm curious what your roadmap currently looks like.

jr-santos98 commented 2 years ago

@gregorkistler We've already started working on the Minicloud update. In about two months, we expect to have Power9 back on Minicloud.

gregorkistler commented 2 years ago

Great news, thank you :)