danpedron / snake-os

Automatically exported from code.google.com/p/snake-os

Borked firmware install - have ssh access #220

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. upgraded to latest experimental
2. after 1 hour seemed to have locked up
3. rebooted
4. could log in, but corrupt settings??
5. unable to flash via web interface
6. can login via ssh

What is the expected output? What do you see instead?

can mount usb disk, web interface seems to think it's not mounted. Unable to 
create swapspace, create shares or enable services. see attached.

unable to re-flash via web interface due to...
*****************
Validating file...
File md5sum is: cf74726efa1b3f75f081aeffee38514e , and expected md5sum is:
ERROR: Wrong md5sum inside file, try to download the file again.
*****************

What version of the product are you using? On what operating system?

NS-K330, snake os 1.3.2-20111017

Please provide any additional information below.

I can access via ssh, so I would like to be able to re-flash via command-line 
if possible. I have an old nokia usb cable if it comes to that.

Original issue reported on code.google.com by sa...@webber.co.nz on 21 Oct 2011 at 11:41

GoogleCodeExporter commented 8 years ago
scratch the part about having the nokia cable, it's a DKU-2, which looks like 
it won't work

Original comment by sa...@webber.co.nz on 22 Oct 2011 at 12:10

GoogleCodeExporter commented 8 years ago
I guess the root cause is that it's somehow not seeing your disk anymore. I'm 
surprised that the web updater even gets this far if that's the case though. 
It should complain if it can't find a disk.

Have you tried to do a config reset yet? I don't think it will help if the disk 
really isn't recognized but it's the easiest thing to try.

Could you post the system log?

If the system log shows that the disk is recognized could you try to mount it 
manually to somewhere .. like:
mkdir /usb/mydisk
mount /dev/sda1 /usb/mydisk/
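
To confirm from the shell whether the manual mount took effect, independent of what the web interface reports, something like the following could be used. This is only a sketch: the helper name is_mounted is made up for illustration, and it assumes the kernel's mount table is readable at /proc/mounts, as on most Linux systems.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]   (hypothetical helper, not part of snake-os)
# Prints the mount point and succeeds if DEVICE is the first field of a line
# in the mounts table. MOUNTS_FILE defaults to /proc/mounts; the second
# argument only exists so the function can be tried against a sample file.
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    [ -r "$table" ] || return 1
    while read -r source target rest; do
        if [ "$source" = "$dev" ]; then
            echo "$target"
            return 0
        fi
    done < "$table"
    return 1
}

if mp=$(is_mounted /dev/sda1); then
    echo "/dev/sda1 is mounted on $mp"
else
    echo "/dev/sda1 is not mounted"
fi
```

If this says the disk is mounted while the web interface says it is not, the problem is more likely in the web interface or its config than in the disk itself.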

If you really want to flash via ssh try this, but it's somewhat risky and I'd 
like to figure out what broke first. 

first.. reset your config and reboot.

the web updater would stop samba and opkg but I don't think it's strictly needed
/etc/init.d/opkg stop
/etc/init.d/samba stop

# check if you have at least ~10mb free memory
free

# increase upper limit for ram filesystem at /tmp to 8mb
mount -o remount,size=8m /tmp/ 

from another pc:
# copy image over. 
# replace 192.168.0.240 with ip address of the box 
# replace snakeos-V1.3.2-20111019-from-snake.bin with the name of the
# -from-snake image that you want to use.
# DON'T use a -from-original image
# winscp will probably work too for this
scp snakeos-V1.3.2-20111019-from-snake.bin root@192.168.0.240:/tmp/ 

back to snake:
# change into /tmp
cd /tmp

# check the md5sum of the image and compare it with the download
# it should return one of these depending on the image you want to flash:
# 4dd962f1d38a1e0839c658ff3b5825b1  snakeos-V1.3.2-20101130-from-snake.bin
# 3c793e1c0e19f103beca5764895cb215  snakeos-V1.3.2-20111007-from-snake.bin
# 3476bc9ea0c7b844ff938771939d9312  snakeos-V1.3.2-20111019-from-snake.bin
# you can skip this if you want but I wouldn't recommend it.
md5sum snakeos-V1.3.2-20111019-from-snake.bin 

# copy busybox to tmp
cp /bin/busybox  ./

# write the image to flash
# be careful.. if you get this wrong the system is hosed
/tmp/busybox dd if=snakeos-V1.3.2-20111019-from-snake.bin of=/dev/mtdblock1 bs=65536 count=2
/tmp/busybox dd if=snakeos-V1.3.2-20111019-from-snake.bin of=/dev/mtdblock2 bs=65536 skip=2 count=15
/tmp/busybox dd if=snakeos-V1.3.2-20111019-from-snake.bin of=/dev/mtdblock3 bs=65536 skip=17 count=42

# reboot the device
reboot
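
For reference, the skip and count values in the three dd commands above split the image into consecutive 64 KiB-block ranges. The sketch below only does that arithmetic and writes nothing to flash; the function name seg_range is made up for illustration, and the mapping of ranges to mtdblock devices is taken from the commands above rather than from any partition table.

```shell
#!/bin/sh
# Block size used by all three dd commands.
BS=65536

# seg_range SKIP COUNT   (hypothetical helper)
# Prints "start_byte length_bytes" for the part of the image that a dd call
# with the given skip and count would read.
seg_range() {
    skip=$1
    count=$2
    echo "$((skip * BS)) $((count * BS))"
}

seg_range 0 2     # mtdblock1 -> 0 131072
seg_range 2 15    # mtdblock2 -> 131072 983040
seg_range 17 42   # mtdblock3 -> 1114112 2752512
```

Each range starts where the previous one ends (2 = 0 + 2, 17 = 2 + 15), so a wrong skip or count would overlap two segments or leave a gap between them.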

Original comment by stefansc...@googlemail.com on 22 Oct 2011 at 10:35

GoogleCodeExporter commented 8 years ago
thanks for that information:) However I will see about finding the root cause 
first.

I can see through command-line that the disk is actually mounted - and the 
partition section of the web interface says that it is too. I can mount and 
unmount just fine.

Tried resetting the config and no dice.

The thing dies whenever I try to get the logs through the web interface and I 
have to hard reset it (have to wait at least 15 seconds before I power it back 
on !). I am wondering if it is overheating.

It's not only fetching the logs that crashes it. Many tasks, including 
formatting and attempting a firmware upgrade, cause it to lock up (most of the 
time).

When I unboxed the device, I immediately installed snake 1.3.2-20111007 and the 
install went fine. However I couldn't format my 2TB drive, as it would keep 
locking up and I would have to reset it. I eventually used gparted to format it 
to ext3 and everything looked ok. I created a share and tested a copy via 
samba. After I hit cancel on the transfer, it locked up and wouldn't respond to 
pings - this triggered me to look for upgrades.

From what I read, other people are not having these issues, and there are a few 
people who have added a heatsink. Do you think it is possible that this is 
getting too hot?

If I can sustain a terminal session long enough, I will get some logs. WinSCP 
moans about sftp and scp not being accessible.

cheers.

Original comment by sa...@webber.co.nz on 22 Oct 2011 at 7:03

GoogleCodeExporter commented 8 years ago

I completed the above, using smb to copy to usb - although the web interface 
was unable to configure smb, i was able to edit smb.conf manually. I tried to 
flash 1.3.2-20111019-from-snake.bin, and it appeared to be successful, so I 
rebooted. It came up, still saying 20111007 as the version, and still had the 
same problem. I thought I would repeat the procedure again, using the 2010 
version, and on the second dd command above, it hung. I could still ping the 
device and could spawn another ssh session, however no command would work, it 
would just sit there with a flashing cursor. I left it for a couple of hours 
and power cycled it. Now it is dead :( Never mind, my new serial cable as well 
as 2 new units are on their way from DX.

Would be interested in your opinion as to what you think could cause this. I 
believe the unit is faulty due to the problems I had in the beginning before 
all this.

/var/messages - http://pastebin.com/80x1tsG4

Cheers.

Original comment by sa...@webber.co.nz on 22 Oct 2011 at 11:06

GoogleCodeExporter commented 8 years ago
Sorry, sftp was broken in 20111007. I thought you were running 20111019. (P)scp 
would have probably still worked.

I don't really know what's wrong. Could be a faulty unit, that the usb 
controller/driver doesn't like your disks or maybe even that some bits got 
flipped on the initial installation of snake.. but that's unlikely.

There's nothing unusual in that log. It's odd that accessing the log page would 
lock up the device. The log is stored in ram and the page is served from the 
flash. There should be no disk accesses at all.

Was that usb stick always connected?

Did it even lock up with no devices attached?

Original comment by stefansc...@googlemail.com on 23 Oct 2011 at 12:29

GoogleCodeExporter commented 8 years ago
I was trying to run 20111119 that's all.

The USB stick was an afterthought; it was locking up without that too.

It's good to know that this isn't the norm, I have 2 new units on the way. 

Do you think it is necessary to add a heatsink?

Cheers

Original comment by sa...@webber.co.nz on 23 Oct 2011 at 5:16

GoogleCodeExporter commented 8 years ago
Might be enough to run it with an opened case for a bit.

I tried to measure the chip temperature today. Without the case mine idles at 
about 44°C and goes up to 48°C under load. At the bootloader prompt it can be 
clocked down to 175 MHz; at that frequency it idles at 40°C. I just taped the 
temperature probe of my multimeter to the chip, so the values might not be 
totally accurate, but it doesn't seem that high.

Original comment by stefansc...@googlemail.com on 23 Oct 2011 at 4:40