TritonDataCenter / sdc-imgapi

SDC internal API for managing OS images
Mozilla Public License 2.0

Potential issue importing custom KVM images to local storage on 20151001-20151005T134459Z-g022a17a #10

Closed by ghost 8 years ago

ghost commented 8 years ago

Sorry if this is in the wrong place/repo; it seemed like the most reasonable place to report this. Feel free to flame me if I'm wrong ;)

I ran into an issue after importing a custom KVM image to local storage: KVM deployment became inoperable.

Brief summary:

I had also noticed that after doing the manual import with sdc-imgadm, AdminUI could no longer autocomplete image UUIDs from names (or the first few characters of the UUID). This issue went away once I'd deleted the custom image, and AdminUI autocomplete worked again. This is what pointed me towards a potential issue with IMGAPI.

When I start a provision job that eventually times out, there is network and disk activity on the compute node, but it is short-lived and the box becomes idle again almost immediately. Looking through the logs, the provision job does make it to CNAPI and to the compute node, but no dataset is created for the VM disk, and nothing ever makes it to QEMU.

After hitting this issue, I tried a complete update from the release and support channels, but the issue persisted. I then ran a complete upgrade from the dev channel (at 15:30 UTC on 14/10/2015; I don't have any build numbers to hand, but I could get them), and the problem went away entirely: custom images could be deployed, and AdminUI image autocomplete worked as expected.

In the process of trying to fix this, I must have reinstalled four or five times on fresh disks, thinking I'd messed something up. Every time, the same issue.

Although the servers that were having the issues are now working, I do have another set of servers (same model/specifications) if you cannot reproduce the problem and need logs. Some guidance on which logs you need would be useful.

I can also provide the full list of commands I used to set up the servers (including the USB config edits and NAPI calls I made to set up networking, etc.), and the commands I used to upgrade to a release that worked. Let me know if that is needed.

askfongjojo commented 8 years ago

@themisanthrope - Sorry, the search in AdminUI was broken while we implemented https://smartos.org/bugview/ADMINUI-2117. The fix is in 29a2254, which is already available on the dev channel and will also be available in release-20151015, going out tomorrow.

If you provisioned your KVM instances through AdminUI, it would have failed to set the correct "brand", which resulted in the CNAPI error you've seen. (It was working for SmartOS because brand was set to "joyent" by default.) If this doesn't explain the issue, or the issue still persists in the newer release, please feel free to reopen this for further investigation. Thanks.
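
To illustrate the brand mismatch described above, here is a minimal Python sketch. It is not AdminUI or CNAPI code: the function names, the payload fields, and the check itself are hypothetical, and the only details taken from this thread are that the brand defaulted to "joyent" and that a KVM instance needs a different brand (assumed here to be "kvm").

```python
# Illustrative only: hypothetical names, not the AdminUI/CNAPI implementation.
DEFAULT_BRAND = "joyent"  # default brand, as stated in the comment above
KVM_BRAND = "kvm"         # assumed brand value for KVM instances


def build_payload(image_uuid, brand=None):
    """Return a minimal, hypothetical provision payload.

    If the caller does not set a brand, the default is used, which is
    the behaviour the comment above describes for the broken build.
    """
    return {"image_uuid": image_uuid, "brand": brand or DEFAULT_BRAND}


def brand_matches_image(payload, image_is_kvm):
    """Check whether the payload brand suits the image being provisioned."""
    if image_is_kvm:
        return payload["brand"] == KVM_BRAND
    return payload["brand"] != KVM_BRAND


# A KVM image provisioned without an explicit brand falls back to "joyent".
broken = build_payload("hypothetical-image-uuid")
fixed = build_payload("hypothetical-image-uuid", brand=KVM_BRAND)
print(broken["brand"], brand_matches_image(broken, image_is_kvm=True))
print(fixed["brand"], brand_matches_image(fixed, image_is_kvm=True))
```

Under these assumptions, a SmartOS provision passes the check with the default brand, while a KVM provision passes only when the brand is set explicitly.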