asb / spindle

http://asbradbury.org/projects/spindle/

How To Build A Distribution Image With Spindle? #101

Closed stevedee closed 11 years ago

stevedee commented 12 years ago

I want to try to build my own image with Spindle, but know very little about the process. I'm hoping this post will result in some smart people telling me how this should be done.

Following the outline here: http://asbradbury.org/projects/spindle/ ...this is what I've done so far:-

sudo ./setup_spindle_environment my_spindle_chroot

...created user & password, then ran:

sudo modprobe nbd max_part=16
schroot -c spindle
./wheezy-stage0

I then get to: "Creating journal (32768 blocks):" ...and the computer sits there for hours. As I write this, my latest attempt on my most powerful computer (which is not saying much) has been stuck at this point for over 6 hours.

Assuming it's not supposed to take this long, I guess it's not going to work.

Using "top" I can see that about 1.4GB RAM is in use, and cpu %wa is sitting around 98%.

Now I could bore you with machine specification & OS info, but suspect at this stage that my method is just completely wrong.

Can anyone advise?

bairdy commented 12 years ago

Hi, I have tried doing this as well, with Debian squeeze and wheezy as the main OS on the machine used to make the image. I've been trying for the past 2 days but no luck, using the method described and the latest source on GitHub.

Got any tips asb?

asb commented 11 years ago

Sorry for the delayed response. Are you seeing the same issue as stevedee, bairdy? I have seen this on another machine (not mine) but haven't been able to reproduce it on any of mine. Seems to be a qemu-nbd issue.

stevedee commented 11 years ago

{Sorry, didn't mean to close this issue, I must have touched the wrong button!!!}

Hi asb, I assume from your comment that I'm not doing anything too crazy.

I didn't understand (from the original instructions) whether I was supposed to do anything about qemu. The version installed on this laptop (Lubuntu 11.04) for qemu & qemu-common is 0.14.0+noroms-0ubuntu4.5

I'm happy to install another Linux distro on a spare computer if that helps.

bairdy commented 11 years ago

Hi asb, yeah, I'm having the same issue. I've tried various OSes but had no luck.

lookshe commented 11 years ago

Hi, same problem on my machine with Debian Squeeze installed. The wheezy-stage0 script starts and hangs on the line "dotask sudo mkfs.ext4 -O ^huge_file $NBD_DEV" (line 46). mkfs runs for hours and nothing really happens, with no noticeable CPU or RAM usage. Usage goes up if I try to kill the process, but hours later it's still running. Only rebooting my machine helps. Over the next few days I'll try all the steps up to this line without the chroot and report what I find.

lookshe commented 11 years ago

Seems to be a problem with the format of the qemu images. I changed everything from qed to raw and now I got past this point.
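For anyone wanting to try the same format change, it could be scripted roughly as below. This is only a sketch: `switch_image_format` is a hypothetical helper name, and it assumes the format appears in the build script as the plain string `qed`. It replaces every occurrence of that string, so the result should be checked with `diff` against the backup before running it.

```shell
# Hypothetical helper (not part of spindle): swap one image format name for
# another in a build script, keeping the original as <script>.orig.
# Assumption: the format is written as a plain string such as "qed".
switch_image_format() {
    script=$1
    from=$2
    to=$3
    cp "$script" "$script.orig"                     # backup of the untouched script
    sed "s/$from/$to/g" "$script.orig" > "$script"  # rewrite in place, same file mode
}
```

A possible invocation would be `switch_image_format wheezy-stage0 qed raw`, followed by `diff wheezy-stage0.orig wheezy-stage0` to review what changed.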

asb commented 11 years ago

Thanks lookshe, looks like a qed bug we may need to chase. How about qcow2? Seems odd that it only seems to be a problem for some people though.

lookshe commented 11 years ago

Looks good, stage1 running and now on do_second_stage_debootstrap.

asb commented 11 years ago

Well, at least that gives a good temporary fix and something to investigate in order to report a bug upstream (would help if I could reproduce the issue...).

ghost commented 11 years ago

Same issue on an Ubuntu 12.04 64-bit host. Hangs forever at "Creating journal (32768 blocks):"

stbuehler commented 11 years ago

the qemu-nbd process segfaults (have a look at dmesg output), and the nbd device completely locks up. can't kill qemu-nbd nor mkfs, needs a reboot...
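A quick way to check the kernel log for this could look like the sketch below. `find_qemu_nbd_segfaults` is a hypothetical helper name, and the pattern assumes the kernel reports the crash in the usual `qemu-nbd[PID]: segfault at ...` form.

```shell
# Hypothetical helper: print kernel log lines that record a qemu-nbd segfault.
# Reads log text on stdin, so it works with live dmesg output or a saved log.
# Assumption: the kernel logs the crash as "qemu-nbd[PID]: segfault ...".
find_qemu_nbd_segfaults() {
    grep 'qemu-nbd\[[0-9]*]: segfault'
}
```

It would typically be fed from the kernel log, for example `dmesg | find_qemu_nbd_segfaults`. No output (and a non-zero exit status) means no matching line was found.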

JayNaire commented 11 years ago

On Debian Squeeze (64-bit) with spindle cloned today (13/Aug/12): I changed all qed references to raw in wheezy-stage0 to stop mkfs hanging, then schrooted and ran wheezy-stage0 again. mkfs.ext4 hung with:

ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether /dev/nbd0 is mounted

Reason: no /etc/mtab (so I just touched one to get it going). Running wheezy-stage0 again goes well until the end, where I got:

Initiating cleanup
umount: /dev/nbd0: not mounted
/dev/nbd0 disconnected
/build/buildd-qemu_1.1.0+dfsg-1-amd64-qCg31j/qemu-1.1.0+dfsg/nbd.c:nbd_trip():L836: From: 18446744073709551104, Len: 0, Size: 1939865600, Offset: 0
/build/buildd-qemu_1.1.0+dfsg-1-amd64-qCg31j/qemu-1.1.0+dfsg/nbd.c:nbd_trip():L837: requested operation past EOF--bad client?
Completed script successfully

accompanied by dmesg errors from nbd (loads of them, including):

[ 5055.304828] FAT: utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
[ 5055.394082] EXT4-fs (nbd0p2): mounted filesystem with ordered data mode
[ 5060.462180] JBD: barrier-based sync failed on nbd0p2-8 - disabling barriers
[ 5203.835135] nbd0: NBD_DISCONNECT
[ 5203.835543] nbd0: Unexpected reply (ffff88003cc67b60)
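The missing /etc/mtab workaround mentioned above can be written so that it never overwrites an existing file. This is a minimal sketch: `ensure_mtab` is a hypothetical helper name, and creating the file inside the chroot will normally need root privileges.

```shell
# Hypothetical helper: create an empty mtab file only if none exists yet.
# The path argument is optional and defaults to /etc/mtab.
ensure_mtab() {
    mtab=${1:-/etc/mtab}
    [ -e "$mtab" ] || touch "$mtab"   # an existing file is left untouched
}
```

Inside the chroot this would be run before wheezy-stage0, for example as root with `ensure_mtab`.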

asb commented 11 years ago

How about qcow2 instead?

JayNaire commented 11 years ago

asb: Same kind of errors with qcow2. BUT with either image type I do get out/stage0.raw (or .qcow2) and I get the feeling that it may actually be OK (my gut feeling: there is possibly something fishy in the common script regarding TRAP and universal_cleanup which is detaching/unmounting /dev/nbd0 and then nbd tries to access it again somewhere ). Will adapt wheezy-stage1 and press on later.

asb commented 11 years ago

The frustrating thing is I am completely unable to reproduce any of these issues on my machine. Some people say qcow2 works while qed doesn't, while for you neither seems to work... I could just give up on qemu-nbd and do everything (fs creation etc) inside the qemulated environment.

JayNaire commented 11 years ago

Alex: I sympathise with your frustration 100%! On a positive note though, I have now managed to run everything successfully - albeit with some knife and forking (mainly changing formats to raw, or qcow2 in export_image_for_release in the common script). Whether it will boot or not I shall find out tomorrow. Incidentally, on my system it seems that qemu-nbd is segfaulting when an nbd device is being unmounted; this causes general mayhem but it doesn't matter as the image has been written correctly by then anyway. Also there appear to be some serious delays in nbd devices being registered as mounted (I notice some sleeps around, so I guess this has been an issue before).

asb commented 11 years ago

What distro, kernel version etc are you using?

JayNaire commented 11 years ago

Linux debian64bit-vm 2.6.32-5-amd64 #1 SMP Squeeze Stable GNU bash, version 4.1.5(1)-release (x86_64-pc-linux-gnu)

JayNaire commented 11 years ago

Alex: I got there in the end. Pretty much every problem was caused when unmounting nbd's. There were other inexplicable problems like "cp $ROOT_DEV/boot/* $BOOT_DEV" failing silently; catting polkit setup stuff failing for no good reason and fifos not being created (although actually they had). The default root and pi passwd setup failed somewhere (impossible to login - very strange). Anyway I have a reasonable method now to create an img - so thank you for your support and hard work.

super-nathan commented 11 years ago

I have finally got to the point where I can run ./wheezy-stage0. I have the same problem listed above: it hangs on journal creation and I get a segfault. It can only be fixed with a hard reboot.

Debian=Sid AMD 64 Kernel=3.6.9 (siduction)

mik3y commented 11 years ago

FYI, I experience the same segfault in qemu-nbd, during the first parted access to /dev/nbd0:

In parted:

(spindle) $ sudo parted /dev/nbd0
GNU Parted 2.3
Using /dev/nbd0
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel msdos                                                    
Error: end of file while reading Success                                  
Retry/Ignore/Cancel

Meanwhile in dmesg:

[617102.202066]  nbd0: unknown partition table
[617130.843756] show_signal_msg: 11 callbacks suppressed
[617130.843767] qemu-nbd[2584]: segfault at 0 ip 00007f358cd1ac98 sp 00007f358ccb7cc0 error 4 in qemu-nbd[7f358ccd0000+6b000]

Host OS is Ubuntu 12.04 on amd64 arch.

Edit: Setting IMGFORMAT=raw worked around the issue, but caused problems later:

(spindle) $ ./wheezy-stage1
Unknown option 'backing_file'
qemu-img: Backing file not supported for file format 'raw'
'branch_image ../out/stage0.raw stage1.raw' failed (returned 1) - Quit? [Y/n] 

It looks like the -b option is only supported by some qemu image types (see the man page). IMGFORMAT=qcow2 worked around the segfault and this issue.
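The constraint can be captured in a small check run before a stage that branches from an earlier image. This is only a sketch: `supports_backing_file` is a hypothetical helper name, and it classifies only the three formats discussed in this thread (qcow2 and qed accept a backing file, raw does not). Any other value is conservatively treated as unsupported.

```shell
# Hypothetical helper: succeed only for image formats that accept a backing
# file. Covers just the formats discussed here; anything else is rejected.
supports_backing_file() {
    case $1 in
        qcow2|qed) return 0 ;;   # both can be created on top of a base image
        *)         return 1 ;;   # raw has no backing-file support
    esac
}
```

One possible use is a guard such as `supports_backing_file "$IMGFORMAT" || echo "IMGFORMAT=$IMGFORMAT cannot be branched from a previous stage"`, which would flag IMGFORMAT=raw before wheezy-stage1 fails.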

asb commented 11 years ago

Multiple people have had success when following the current README instructions including downgrading qemu, closing for now.