mar-file-system / marfs

MarFS provides a scalable near-POSIX file system by using one or more POSIX file systems as a scalable metadata component and one or more data stores (object, file, etc) as a scalable data component.

Bad marfs configuration can break multi files smaller than max_pack_file_size in pftool #154

Closed cadejager closed 8 years ago

cadejager commented 8 years ago

Jeff reported this bug:

```
$ getfattr -d /gpfs/marfs-gpfs/jti/mdfs/temp/1x100G/f01
user.marfs_objid="proxy/repo2/ver.001_004/ns.jti/P___/inode.0000367118/md_ctime.20160817_141812-0600_1/obj_ctime.20160817_141812-0600_1/unq.0/chnksz.c00/chnkno.0"
user.marfs_post="ver.001_004/P/off.0/objs.832358004/bytes.39953184240/corr.0000000000000000/crypt.0000000000000000/flags.00/mdfs."
```
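The `user.marfs_objid` xattr above packs several fields into one path-like string: an object-type code (`P___` here), the inode, two ctimes, a chunk size, and a chunk number. A minimal parser sketch, assuming the components are `/`-separated and each labeled field is a `key.value` pair (the field names come from the string above, not from the MarFS source):

```python
def parse_objid(objid):
    """Split a MarFS object-ID string into labeled fields.

    Components containing a '.' are treated as key/value pairs;
    components without one (the proxy, the repo name, and the
    object-type code such as 'P___' or 'N___') are kept positionally.
    """
    fields = {}
    bare = []
    for comp in objid.split("/"):
        if "." in comp:
            key, _, val = comp.partition(".")
            fields[key] = val
        else:
            bare.append(comp)
    return bare, fields

objid = ("proxy/repo2/ver.001_004/ns.jti/P___/inode.0000367118/"
         "md_ctime.20160817_141812-0600_1/obj_ctime.20160817_141812-0600_1/"
         "unq.0/chnksz.c00/chnkno.0")
bare, fields = parse_objid(objid)
# bare[-1] is the object-type code ('P___'); fields["chnkno"] is the
# chunk number, which is what the curl probes below vary.
```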

--- chunk 0 is at the "packed" object-ID: (contains one chunk's-worth of data, plus recovery_

```
$ curl -I -u root --digest http://10.135.0.30:81/proxy/repo2/ver.001_004/ns.jti/P___/inode.0000367118/md_ctime.20160817_141812-0600_1/obj_ctime.20160817_141812-0600_1/unq.0/chnksz.c00/chnkno.0
HTTP/1.1 200 OK
```

--- chunk 1?

```
$ curl -I -u root --digest http://10.135.0.30:81/proxy/repo2/ver.001_004/ns.jti/P___/ ... /chnkno.1
HTTP/1.1 404 Not Found
```

--- chunks after 0 use "N":

```
$ curl -I -u root --digest http://10.135.0.30:81/proxy/repo2/ver.001_004/ns.jti/N___/ ... /chnkno.1
HTTP/1.1 200 OK
```
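Jeff's probes generalize: for each chunk number, try the object-ID under both the "P" (packed) and "N" type codes and see which URL exists. A sketch of just the URL construction (the host and object-ID are from the transcript above; the helper name `chunk_url` is hypothetical, and the actual HEAD request would still be done with `curl -I -u root --digest`):

```python
import re

def chunk_url(base, objid, chunkno, objtype):
    """Build the object URL for a given chunk number and type code.

    Rewrites the type-code component of the object-ID (e.g.
    'P___' -> 'N___') and its trailing 'chnkno.<n>' field.
    """
    oid = re.sub(r"/[A-Z]___/", "/%s___/" % objtype, objid)
    oid = re.sub(r"chnkno\.\d+$", "chnkno.%d" % chunkno, oid)
    return "%s/%s" % (base.rstrip("/"), oid)

base = "http://10.135.0.30:81"
objid = ("proxy/repo2/ver.001_004/ns.jti/P___/inode.0000367118/"
         "md_ctime.20160817_141812-0600_1/obj_ctime.20160817_141812-0600_1/"
         "unq.0/chnksz.c00/chnkno.0")
# Probing each URL with an HTTP HEAD would reproduce the transcript:
# chunk 0 answers 200 only under 'P', chunks >= 1 only under 'N'.
```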

Thanks, Jeff

I was not able to replicate this, but I was able to get a seg fault if I set the chunk_size below the max_pack_file_size and then copied a file whose size falls between chunk_size and max_pack_file_size.
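The misconfiguration described here (chunk_size below max_pack_file_size) creates an overlap: a file in that range is both large enough to need multiple chunks and small enough to qualify for packing. A hypothetical sketch of such a type decision, purely to illustrate the ambiguity — the threshold names come from this issue, but the selection logic is not taken from the MarFS source:

```python
def classify(file_size, chunk_size, max_pack_file_size):
    """Illustrative object-type choice; the overlap is the bug trigger."""
    packable = file_size <= max_pack_file_size   # small enough to pack
    multi = file_size > chunk_size               # too big for one chunk
    if packable and multi:
        # With chunk_size < max_pack_file_size, both conditions hold at
        # once for files in between. Per this issue, chunk 0 then ended
        # up under the 'P' object-ID while later chunks went under 'N',
        # so reads of chunk 1+ via the 'P' ID returned 404.
        return "AMBIGUOUS"
    if multi:
        return "MULTI"
    if packable:
        return "PACKED"
    return "UNI"
```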

cadejager commented 8 years ago

I believe I have fixed this issue in the branch issue-154. If Jeff could confirm the fix with his configuration, I would appreciate it; then we can merge it into master.

jti-lanl commented 8 years ago

Your patch does fix a core-dump in a case I saw, but the original problem was still occurring.

A simple tweak to your fix resolves this; your fix made that possible.

[This exercise revealed a different marfs bug, which I'll submit as a separate issue.]