erigones / esdc-ce

Danube Cloud :: Community Edition
https://danubecloud.org
Apache License 2.0

Increase number of possible disks for VM #445

Open klebed opened 5 years ago

klebed commented 5 years ago

Introduce global and local domain settings that allow creating more than two disks in one VM. Bind DISK_ID_MAX to these settings.

At the moment, only two disks can be created in one VM in v4.1, which is a strange and not very reasonable limit.

dn0 commented 5 years ago

Blocked by https://github.com/joyent/smartos-live/issues/331

klebed commented 5 years ago

Is it still there? Five years have passed and a lot of QEMU updates have been made since... and I've never seen that bug on Linux QEMU-KVM hosts...

Also, if you allow this limit to be managed via settings, the owner of the system would be in charge of its behaviour, keeping in mind that more than 3 disks cause CD boot to fail. It would also make it easier to check whether the issue is still there.

YanChii commented 5 years ago

@klebed

This looks quite dangerous to me. If we allow a user to have 3+ disks and the VM is deployed from an image, it will probably work, because the cdrom is not involved.

BUT: if something happens and your VM refuses to boot, the rescue CD will fail to help you, and you end up with a choice between deleting a disk (data loss) or hardcore hacking using zfs cloning or similar things that can mess up system consistency.

Jan

PS: you can still get 3+ disks by a manual vmadm update or by hacking the startvm.zone script. The difference is that this way you know what you have done and it is your responsibility to keep it in mind (in contrast to a regular user who happily adds 3 disks because the system allows it and then becomes trapped without having chosen it voluntarily).

klebed commented 5 years ago

BUT: if something happens and your VM refuses to boot, the rescue CD will fail to help you, and you end up with a choice between deleting a disk (data loss) or hardcore hacking using zfs cloning or similar things that can mess up system consistency.

Well... this could be worked around with an option to disable a disk temporarily (it would be removed from the VM config, but still tracked by Danube as belonging to the VM with status disabled). If you want to boot with the rescue CD, just disable the unnecessary disks.
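The "disable a disk" idea above could be sketched roughly like this. This is a hypothetical illustration, not actual esdc-ce code: the function name, field names, and the `status` flag are all made up for the example. The management layer would keep the full disk list (so the zvol and its data survive), but emit only the enabled entries into the VM config:

```python
# Hypothetical sketch: track all disks in the management DB, but emit only
# the enabled ones into the VM's JSON config. A disabled disk keeps its
# zvol and can be re-enabled later without data loss.

def active_disk_payload(tracked_disks):
    """Return the disk list for the VM config, skipping disabled disks."""
    return [
        {k: v for k, v in d.items() if k != "status"}
        for d in tracked_disks
        if d.get("status", "enabled") == "enabled"
    ]

disks = [
    {"zfs_filesystem": "zones/vm1-disk0", "size": 10240, "status": "enabled"},
    {"zfs_filesystem": "zones/vm1-disk1", "size": 20480, "status": "disabled"},
    {"zfs_filesystem": "zones/vm1-disk2", "size": 20480, "status": "enabled"},
]
print(active_disk_payload(disks))  # only disk0 and disk2 appear
```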

On the other hand, isn't it just a messy bug that undermines the advantages of qemu on SmartOS? Maybe we could also make some noise around it and try to locate its root cause?

YanChii commented 5 years ago

Sure, we can make some fuss around it together. Or fix it. Maybe it's not that difficult. And definitely less difficult than the workaround.

But there's another thing: this feature request is bigger than it looks at first sight... at least in DC. It also requires changing the migration, replication and backup scripts, as well as some internal handling of multiple disks. And all this effort brings a questionable advantage: all disks have to be on the same zpool anyway (SmartOS design), so there's not much difference between 3x 100GB disks and 1x 300GB disk that is partitioned inside the VM.

I don't want to make excuses. It is possible. I just want to be practical here, because time is not infinite and no one else has requested this feature (so far). Personally, I'd rather see DC bhyve support than a third disk in KVM (comparable effort).

But we definitely welcome any help and if you are willing to invest some time in this multi-disk feature, we will help you to make it happen.

Jan

klebed commented 5 years ago

Ok, I hear the reasons.

It seems someone already found a solution in the thread mentioned by dn0. It's all about the index order in the generated qemu command. The cdrom appears as an IDE disk, and it must have index=1 to be bootable. So the straightforward solution seems to be to locate the command generation code and adjust it so the cdrom appears as the second device with index=1, no matter how many devices follow it.
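The index ordering described above can be illustrated with a small sketch. This is not the actual VM.js logic (which is JavaScript); it is a Python illustration of the idea that the cdrom must be emitted as the second `-drive` so it gets index=1, regardless of how many hard disks follow:

```python
# Illustrative sketch of the qemu -drive index ordering discussed above:
# force the cdrom into the second slot (index=1) so it remains bootable
# even when the VM has three or more hard disks.

def qemu_drive_args(hard_disks, cdrom_path=None):
    """Build -drive arguments, squeezing the cdrom in at index=1."""
    devices = list(hard_disks)
    if cdrom_path is not None:
        devices.insert(1, {"path": cdrom_path, "media": "cdrom"})
    args = []
    for index, dev in enumerate(devices):
        media = dev.get("media", "disk")
        args.append("-drive")
        args.append("file=%s,media=%s,index=%d" % (dev["path"], media, index))
    return args

disks = [{"path": "/dev/zvol/rdsk/zones/vm1-disk0"},
         {"path": "/dev/zvol/rdsk/zones/vm1-disk1"},
         {"path": "/dev/zvol/rdsk/zones/vm1-disk2"}]
for a in qemu_drive_args(disks, "/rescue.iso"):
    print(a)
```

With three hard disks plus the ISO, the cdrom still lands at index=1 and the remaining disks shift to indexes 2 and 3.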

klebed commented 5 years ago

Found the part of the code to adjust, but it's designed in such a way that it's hard to manipulate the output properly to achieve the right result without the code becoming messy. https://github.com/joyent/smartos-live/blob/master/src/vm/node_modules/VM.js In the startVM function, we are interested in the for (disk in vmobj.disks) loop, and also in the following extra-payload loop for (disk in extra.disks).

The trick is that we should squeeze the extra disk (the bootable CD) between the first and second entries from the first loop, giving our bootable CD index=1. The problem is that we have to determine that the extra disk is a bootable CD and only then insert it. I'm afraid this could be a very dirty hack.

The other way is to change how DC generates the CreateVM input file, putting the bootable CD inside the first set of disks as the second entry. This would be cleaner.
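The second option above could look roughly like this on the DC side. This is a hypothetical sketch, not working esdc-ce code: the function name is invented, and the disk fields only loosely follow the vmadm disk schema. The idea is simply to splice the rescue ISO in as the second element of the `disks` array before the payload is sent:

```python
import json

# Hypothetical sketch: instead of patching VM.js, have DC place the rescue
# ISO as the second entry of the "disks" array in the vmadm payload, so it
# naturally receives index=1 when the qemu command line is generated.

def inject_rescue_cd(payload, iso_path):
    """Return a copy of the vmadm payload with the CD as the second disk."""
    new = json.loads(json.dumps(payload))  # cheap deep copy
    cd_entry = {"path": iso_path, "media": "cdrom", "boot": True, "model": "ide"}
    disks = new.setdefault("disks", [])
    disks.insert(min(1, len(disks)), cd_entry)
    return new

vm = {"brand": "kvm", "disks": [{"size": 10240}, {"size": 20480}, {"size": 20480}]}
patched = inject_rescue_cd(vm, "/iso/rescue.iso")
print([d.get("media", "disk") for d in patched["disks"]])
```

The original payload is left untouched, which matters if the caller needs to restore the config after the rescue boot.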

YanChii commented 5 years ago

I've found the same code :) https://github.com/joyent/smartos-live/blob/master/src/vm/node_modules/VM.js#L12646

Are you sure that inserting the bootable CD into the disks section will work out of the box? ISO files are loop-mounted. I'm also not sure about the once= flag if the cdrom is defined in the JSON. But it would be a clean solution indeed.

The fix of startVM in VM.js looks quite tricky. Maybe some check before this line that looks into the extra.boot params would do the job. If the cdrom and order=d or once=d are found, you can insert the disk from extra.disks and delete it from that list so it will not be inserted twice. But it seems more complicated than necessary to me.
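The check proposed above can be rendered as a small sketch. The real code path is JavaScript in VM.js; this is only a Python illustration of the control flow, with invented names. If the boot order requests the cdrom first, the bootable CD is pulled out of the extra disks and spliced in at position 1 so it is emitted as the second drive, and removed from the extras so it is not added twice:

```python
# Python rendering of the proposed startVM check (the real code is
# JavaScript in VM.js; names here are illustrative only).

def merge_boot_cd(vm_disks, extra_disks, boot_order):
    """boot_order is the qemu boot value, e.g. "d" (cdrom first) or "c"."""
    disks = list(vm_disks)
    extras = list(extra_disks)
    if "d" in boot_order:
        for i, dev in enumerate(extras):
            if dev.get("media") == "cdrom":
                # remove from extras so it is not appended a second time
                disks.insert(min(1, len(disks)), extras.pop(i))
                break
    return disks + extras

merged = merge_boot_cd(
    [{"path": "disk0"}, {"path": "disk1"}],
    [{"path": "rescue.iso", "media": "cdrom"}],
    "d",
)
print([d["path"] for d in merged])  # cdrom lands between disk0 and disk1
```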

klebed commented 5 years ago

I have concerns that hacking VM.js would make its logic complicated and unpredictable. One may find such disk reshuffling very surprising when using vmadm manually or from another management console.

I guess the cleaner way would be to do it in the esdc-ce code, but I've failed to find where the VM config is generated... I find the Django project structure very tangled... =( My suggestion is to put the cdrom device inside the main disks array after the first entry.

YanChii commented 5 years ago

The code is here https://github.com/erigones/esdc-ce/blob/master/api/vm/status/vm_status.py#L120

Maybe @dn0 will have something to say to this.

YanChii commented 5 years ago

@klebed I'm not aware that it is possible to put the cdrom inside the disks. It needs to be handled in a completely different way. Maybe it is possible to implement it in vmadm, and I agree it would be cleaner than my previous proposal.

Please let us know if you find a way to call vmadm boot differently so the cdrom gets to the right place. Temporarily modifying the JSON just to boot from the cdrom is impractical and possibly even dangerous.

Jan

klebed commented 5 years ago

IMHO there is nothing to do in vmadm... It should work straightforwardly, and if we tell it to shoot itself in the foot, it should... On the other hand, there is definitely a bug in qemu which makes the machine unstable with a certain configuration of disks. So we have two options:

1) Don't tell vmadm to shoot itself in the foot, and place the bootable cdrom as the second entry in the disks with index=1 (to avoid hitting the qemu bug).
2) Locate the code and the reasons causing the qemu bug.

My opinion is that moving the CD entry is much easier. And there is no need to handle it in a different way, since there is barely any difference between entries coming from disks and extra.disks.

Please take a look at @dn0's comment: [screenshot]

YanChii commented 5 years ago

@klebed Anyway, if you manage to modify it in the platform, we can accept it into our DC platform regardless of SmartOS upstream. Just create a PR against erigones/smartos-live. The only requirement is that it must be backward compatible so as not to break the older use case.

Jan

YanChii commented 4 years ago

FYI, this will not be an issue with bhyve VMs. Considering the complications, we'll invest time into bhyve integration rather than into fixing the obsolete KVM code.