Open GoogleCodeExporter opened 9 years ago
It looks like an arithmetic issue (big numbers with 3TB disks; the awk %d should
probably be replaced with %f).
The problem must be in /usr/www/cgi-bin/status.cgi, at around line 295 (where
$mdev is md0 in your case):
compl=$(drawbargraph $(awk '{printf "%d", $1 * 100 / $3}' /sys/block/$mdev/md/sync_completed))
speed=$(cat /sys/block/$mdev/md/sync_speed)
exp=$(awk '{printf "%.1fmin", ($3 - $1) * 512 / 1000 / '$speed' / 60}' /sys/block/$mdev/md/sync_completed 2> /dev/null)
If it is still resyncing, can you please post the output of
cat /sys/block/md0/md/sync_completed
cat /sys/block/md0/md/sync_speed
Thanks
Original comment by whoami.j...@gmail.com
on 1 May 2013 at 2:46
Still going. The web interface currently says
md0 2794.0 GB raid1 active OK resync 20% 142.2min
which I would think was ok except /proc/mdstat sadly disagrees :-(
$ cat /proc/mdstat
Personalities : [linear] [raid1]
md0 : active raid1 sda2[1] sdb2[0]
2929740112 blocks super 1.2 [2/2] [UU]
[===============>.....] resync = 78.6% (2303036672/2929740112) finish=108.9min speed=95859K/sec
bitmap: 6/22 pages [24KB], 65536KB chunk
unused devices: <none>
$ cat /sys/block/md0/md/sync_completed
310542592 / 1564512928
$ cat /sys/block/md0/md/sync_speed
89338
awk saying 20% is about right for those sync_completed numbers.
And I just manually tried some big numbers in awk, and it doesn't seem to
overflow, so I think awk must be using fp or longs for those calculations
already.
So it looks like we have a kernel overflow issue here...
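Both observations can be checked with the numbers posted above. This is only a rough sketch: the helper name wrap32 is made up for illustration, and it assumes that /proc/mdstat counts 1 KiB blocks while sync_completed counts 512-byte sectors.

```sh
# 2^32, the range of a 32-bit unsigned long
MOD=4294967296

# Progress and total as shown by /proc/mdstat above (assumed to be 1 KiB blocks)
done_blocks=2303036672
total_blocks=2929740112

# awk computes in floating point, so the percentage itself does not overflow
awk -v d="$done_blocks" -v t="$total_blocks" 'BEGIN { printf "%d%%\n", d * 100 / t }'

# Convert blocks to sectors and reduce modulo 2^32, as a 32-bit counter would
wrap32() {
    awk -v blocks="$1" -v m="$MOD" 'BEGIN { printf "%d\n", (blocks * 2) % m }'
}

wrap32 "$total_blocks"
wrap32 "$done_blocks"
```

This prints 78%, then 1564512928, then 311106048. The wrapped total is exactly the total reported by sync_completed, and the wrapped progress is close to the 310542592 read from sysfs (presumably the two readings were taken moments apart), which fits a 32-bit overflow in the kernel rather than in awk.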
Yeah, I just had a look at kernel source md.c, sync_completed_show function.
It uses unsigned long in 2.6.25, and has been fixed to long long sometime since.
Might be wise to change the web script to parse it out of /proc/mdstat instead!
Original comment by brian.br...@gmail.com
on 1 May 2013 at 6:12
/proc/mdstat contains very differently typed and formatted information, so it
is difficult to parse.
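For what it is worth, a minimal sketch of pulling the progress out of /proc/mdstat with awk might look like the following. The function name mdstat_progress is hypothetical, it is not what status.cgi does, and it only assumes the layout shown in the output posted above (a "resync = NN%" line with a "finish=" field); other states and formats are not handled.

```sh
# Hypothetical helper: print "<percent> <eta>" for one md device.
# $1 is the device name (e.g. md0); the mdstat text is read from stdin.
mdstat_progress() {
    awk -v dev="$1" '
        # A line such as "md0 : active raid1 ..." starts a device section
        $2 == ":" { current = $1 }

        # Within the wanted section, pick the progress line apart field by field
        current == dev && /(resync|recovery) *=/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) pct = $i
                if ($i ~ /^finish=/) { eta = $i; sub(/^finish=/, "", eta) }
            }
            print pct, eta
            exit
        }
    '
}
```

It would be called as `mdstat_progress md0 < /proc/mdstat`. With the output posted above it should print `78.6% 108.9min`, and it prints nothing when the device has no resync or recovery line.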
I'm trying to port Alt-F to a more recent kernel, 3.8.11, and perhaps that will
solve the issue.
Original comment by whoami.j...@gmail.com
on 24 May 2013 at 11:48
From what I saw in the kernel source, it was definitely fixed by that
version.
Original comment by brian.br...@gmail.com
on 25 May 2013 at 11:31
I can confirm this on my recently flashed DLINK DNS-323 running Alt-F 0.1RC3. I
built a 2x3TB RAID1 array and am seeing the same here. Currently:
RAID
Dev.  Capacity   Level  State   Status  Action  Done  ETA
md0   2794.0 GB  raid1  active  OK      resync  210%  -152.4min
Original comment by crazymac...@gmail.com
on 28 Jun 2013 at 10:47
Original comment by whoami.j...@gmail.com
on 29 Jun 2013 at 4:27
Same here, see my ticket on sourceforge for details:
https://sourceforge.net/p/alt-f/tickets/10/
RAID
Dev.  Capacity   Level  State   Status  Action  Done  ETA
md0   2794.0 GB  raid1  active  OK      resync  158%  -6517.8min
How can I make sure this is a false positive and that the resyncing is actually
done?
Stephane
Original comment by stephane...@gmail.com
on 2 Sep 2013 at 6:29
[deleted comment]
I tried the same commands on my box. Our problem looks similar:
$ cat /sys/block/md0/md/sync_completed
2564345088 / 1564512928
$ cat /sys/block/md0/md/sync_speed
150809
$ cat /proc/mdstat
Personalities : [linear] [raid1]
md0 : active raid1 sda2[1] sdb2[0]
2929740112 blocks super 1.2 [2/2] [UU]
[========>............] resync = 43.8% (1285270528/2929740112) finish=189.7min speed=144439K/sec
bitmap: 13/22 pages [52KB], 65536KB chunk
unused devices: <none>
Original comment by stephane...@gmail.com
on 2 Sep 2013 at 6:47
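Feeding those sysfs values through the same formulas as the status.cgi excerpt quoted earlier shows where the impossible web figures come from. This is a sketch only: the function names sync_percent and sync_eta_min are made up for illustration, and the real script may differ around the quoted lines.

```sh
# Percentage done, same expression as the quoted status.cgi line.
# Reads one "completed / total" line, as found in sync_completed, from stdin.
sync_percent() {
    awk '{ printf "%d\n", $1 * 100 / $3 }'
}

# Estimated minutes remaining, same expression as the quoted status.cgi line.
# $1 is the value read from sync_speed.
sync_eta_min() {
    awk -v speed="$1" '{ printf "%.1f\n", ($3 - $1) * 512 / 1000 / speed / 60 }'
}

echo "2564345088 / 1564512928" | sync_percent
echo "2564345088 / 1564512928" | sync_eta_min 150809
```

This prints 163 and -56.6. Because the total has wrapped while the completed count has not, completed exceeds total, so the percentage goes above 100 and the ETA turns negative, the same symptom as in the web interface (the exact figures differ because the readings were taken at different times).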
Don't worry, cat /proc/mdstat is telling the truth about what's happening; it's
only the other numbers used by the web interface that are overflowing. (We
found it was a kernel bug that has since been fixed.)
Original comment by brian.br...@gmail.com
on 2 Sep 2013 at 7:32
Yep I realized that. Thanks Brian!
Original comment by stephane...@gmail.com
on 2 Sep 2013 at 9:46
Original issue reported on code.google.com by
brian.br...@gmail.com
on 1 May 2013 at 1:53