Status: Closed (closed by GoogleCodeExporter 8 years ago)
Can you do an "ls -al" for the volume server's directory, where the *.dat and
*.idx files are stored?
And I suppose the disk has enough space left, right?
Original comment by chris...@gmail.com
on 4 Jul 2013 at 3:55
Yes, there is 4.4 TB of free space.
Please see the uploaded screenshot.
Original comment by hieu.hcmus@gmail.com
on 4 Jul 2013 at 4:05
Attachments:
Looks like some .dat file is exceeding the size limit: 32*1024*1024*1024 =
34359738368 bytes.
I will need to add one additional check at the volume server level to prevent
this.
Original comment by chris...@gmail.com
on 4 Jul 2013 at 5:01
Checked in a fix just now. I have not tried your test suite; please run it to
confirm.
Original comment by chris...@gmail.com
on 4 Jul 2013 at 5:16
I just tested and errors still occurred.
There is no message indicating that the .dat file is exceeding the size limit.
Original comment by hieu.hcmus@gmail.com
on 4 Jul 2013 at 6:44
I just checked, and only two volumes, 18 and 21, can be downloaded.
All the other volumes cannot be downloaded because the wrong header value is
read.
Original comment by hieu.hcmus@gmail.com
on 4 Jul 2013 at 8:28
There is an error when converting between uint32 and uint64:
[2013/07/04 16:17:46.242178] [TRAC] (storage.(*Volume).write:156) Append offset
uint32: %!(EXTRA uint32=1565170750)
[2013/07/04 16:17:46.242178] [TRAC] (storage.(*Volume).write:157) Append offset
uint64: %!(EXTRA int64=12521366002)
[2013/07/04 16:17:46.242178] [TRAC] (storage.(*Needle).Append:71) Appended
header: %!(EXTRA []uint8=[101 136 31 154 0 0 0 0 0 80 66 228 0 0 122 221])
[2013/07/04 16:17:46.242178] [TRAC] (storage.(*Volume).write:166) Write n.Size:
31453, Needle id: 5260004, Needle cookie%!(EXTRA uint32=1703419802)
[2013/07/04 16:17:46.242178] [TRAC] (main.PostHandler:222) Uploaded file size:
%!(EXTRA uint32=31391)
[2013/07/04 16:17:46.242178] [TRAC] (main.PostHandler:226) Upload completed
[2013/07/04 16:17:46.242178] [TRAC] (main.GetOrHeadHandler:114) Download:
/13,5042e465881f9a
[2013/07/04 16:17:46.242178] [TRAC] (storage.(*Volume).read:197) Volume Id: 13
[2013/07/04 16:17:46.242178] [TRAC] (storage.(*Volume).read:198) Append offset
uint32: %!(EXTRA uint32=1565170750)
[2013/07/04 16:17:46.242178] [TRAC] (storage.(*Volume).read:199) Read offset
uint64: %!(EXTRA int64=12521366000)
[2013/07/04 16:17:46.242178] [TRAC] (storage.(*Needle).Read:139) Read header:
%!(EXTRA []uint8=[0 0 101 136 31 154 0 0 0 0 0 80 66 228 0 0])
Write: The value in uint32 = 1565170750, uint64 = 12521366002
Read: The value in uint32 = 1565170750, uint64 = 12521366000
Original comment by hieu.hcmus@gmail.com
on 4 Jul 2013 at 9:22
There is an error when computing the padding size, because 12521366002 % 8 != 0.
Original comment by hieu.hcmus@gmail.com
on 4 Jul 2013 at 9:31
We should check the padding value when writing files. I added the code below
and it works fine:
func (v *Volume) write(n *Needle) (size uint32, err error) {
	if v.readOnly {
		err = fmt.Errorf("%s is read-only", v.dataFile)
		return
	}
	v.accessLock.Lock()
	defer v.accessLock.Unlock()
	var offset int64
	if offset, err = v.dataFile.Seek(0, 2); err != nil {
		return
	}
	// check padding: realign to the next NeedlePaddingSize boundary
	if offset%NeedlePaddingSize != 0 {
		offset = offset + (NeedlePaddingSize - offset%NeedlePaddingSize)
		if offset, err = v.dataFile.Seek(offset, 0); err != nil {
			return
		}
	}
	// end
	if size, err = n.Append(v.dataFile, v.Version()); err != nil {
		if e := v.dataFile.Truncate(offset); e != nil {
			err = fmt.Errorf("%s\ncannot truncate %s: %s", err, v.dataFile, e)
		}
		return
	}
	nv, ok := v.nm.Get(n.Id)
	if !ok || int64(nv.Offset)*NeedlePaddingSize < offset {
		logger.LoggerVolume.Trace("Write n.Size: %d, Needle id: %d, Needle cookie: %d", n.Size, n.Id, n.Cookie)
		_, err = v.nm.Put(n.Id, uint32(offset/NeedlePaddingSize), n.Size)
	}
	return
}
Original comment by hieu.hcmus@gmail.com
on 4 Jul 2013 at 9:49
Can you please attach the whole volume.go file that was used to generate the
logs in comment #7?
Your fix seems to avoid the problem with 7/8 probability, because a random
offset has a 1/8 chance of passing your test.
Original comment by chris...@gmail.com
on 5 Jul 2013 at 7:03
Please find the attached volume.go file
Original comment by hieu.hcmus@gmail.com
on 5 Jul 2013 at 7:11
Attachments:
Thanks! Was your error output in comment #7 generated after my fix?
My fix applies at write time, so if you continue to read or write existing
volumes, you will still see errors.
To use my fix, you would need to clean everything and restart your test from an
empty system.
Original comment by chris...@gmail.com
on 5 Jul 2013 at 7:22
Hi Chris,
I tested yesterday and the files were not written to the full volumes; I don't
think your fix resolves this error.
Original comment by hieu.hcmus@gmail.com
on 5 Jul 2013 at 7:30
I re-thought your fix. It can ensure the current file is written more or less
correctly, but it will likely overwrite other existing files.
So we need to ensure that when the size limit is exceeded, we fail the write
attempt and ask the user to get another file id from the master.
Original comment by chris...@gmail.com
on 5 Jul 2013 at 7:32
[deleted comment]
There is no message "Volume Size Limit %d Exceeded! Current size is %d" in the
log file.
Original comment by hieu.hcmus@gmail.com
on 5 Jul 2013 at 7:38
Hi Chris,
Can you please explain:
"But it will likely over-write on other existing files."
If something goes wrong when writing or computing the padding value of a file
(an I/O interrupt...), every later file will be stored wrongly.
I added this code to make sure that if something goes wrong with one file, it
will not affect later files.
Original comment by hieu.hcmus@gmail.com
on 5 Jul 2013 at 7:50
I think your guess is right: my fix seems unrelated to the issue (though it
should be OK to leave it in).
We need to find out why the offset can differ from what we expected, and by
how much. There are several possibilities:
1. we have an error when writing previous file.
2. the offset returned from v.dataFile.Seek(0, 2) is wrong by a few bytes
3. the offset returned from v.dataFile.Seek(0, 2) is wrong randomly
If it is case 3, we will overwrite existing files.
Can you help to identify which case is causing your problem?
Original comment by chris...@gmail.com
on 5 Jul 2013 at 8:06
Hi, Hieu,
Your fix should be good. The actual disk writing is done in several write()
calls; if one of them fails, the offset becomes incorrect, making all the
following files wrong.
It would be helpful to find out what really went wrong in the first place, but
your fix should be a very good way to prevent all subsequent file read/write
errors.
Original comment by chris...@gmail.com
on 5 Jul 2013 at 11:01
Checked in the fix to HEAD. Thanks!
If possible, please let me know what was the error that caused the padding
alignment error.
Original comment by chris...@gmail.com
on 5 Jul 2013 at 11:07
Original issue reported on code.google.com by
hieu.hcmus@gmail.com
on 4 Jul 2013 at 2:22
Attachments: