mar-file-system / marfs

MarFS provides a scalable near-POSIX file system by using one or more POSIX file systems as a scalable metadata component and one or more data stores (object, file, etc) as a scalable data component.

Packed Files fail to update user.marfs_post offset after overwrite with FUSE #171

Closed gsparrow closed 6 years ago

gsparrow commented 7 years ago

When packed files are created with PFTool and one of them is later overwritten through a MarFS FUSE mount, the offset in the user.marfs_post extended attribute is not updated. This affects files whose offset within the packed object was not 0. The issue does not occur when the file is overwritten with PFTool, as evidenced by the test MarFS_issue_171_PFTool in Jenkins.

Below is a small reproducer, named "MarFS_issue_171_FUSE" in Jenkins:

export TEST_DIRECTORY=packed_test_POSIX_MarFS_using_PFTool_then_overwrite

export PATH=/usr/lib64/openmpi/bin:$PATH
export P=$WORKSPACE/$TEST_DIRECTORY
export C=$MOUNT_POINT

export LOCAL_FILE_1=$P/test_file1.txt
export LOCAL_FILE_2=$P/test_file2.txt

export MARFS_FILE_1=$C/$TEST_DIRECTORY/test_file1.txt
export MARFS_FILE_2=$C/$TEST_DIRECTORY/test_file2.txt

export GPFS_FILE_1=$GPFS_MOUNT_POINT/$TEST_DIRECTORY/test_file1.txt
export GPFS_FILE_2=$GPFS_MOUNT_POINT/$TEST_DIRECTORY/test_file2.txt

trap 'rm -rf $C/$TEST_DIRECTORY $P file.txt file2.txt' EXIT

mkdir $P
echo "Hello World" > $LOCAL_FILE_1
echo "Hello Worl" > $LOCAL_FILE_2

mpirun -np $NP $PFTOOL -w 0 -r -c $C -p $P
ls -l $MARFS_FILE_1 > file.txt
FILE_SIZE_0=`cut -d ' ' -f 5 file.txt`
echo $FILE_SIZE_0
ls -l $LOCAL_FILE_1 > file2.txt
FILE_SIZE_1=`cut -d ' ' -f 5 file2.txt`

if [ $FILE_SIZE_0 != $FILE_SIZE_1 ]
then
    exit 1
fi

ls -l $MARFS_FILE_2 > file.txt
FILE_SIZE_0=`cut -d ' ' -f 5 file.txt`
echo $FILE_SIZE_0
ls -l $LOCAL_FILE_2 > file2.txt
FILE_SIZE_1=`cut -d ' ' -f 5 file2.txt`

if [ $FILE_SIZE_0 != $FILE_SIZE_1 ]
then
    exit 1
fi

if [ $FILE_SIZE_0 != $FILE_SIZE_1 ]
then
    exit 1
fi

if ! mpirun -np $NP $PFTOOL -w 2 -r -M -c $C/$TEST_DIRECTORY -p $P ; then
    echo "The Files were different somehow"
    exit 1
fi

if (( 1 == $VERBOSE )); then
    echo "Copied directory successfully"
fi

#Now try to find out if the files are packed files in GPFS
FILE_TYPE=`getfattr -n user.marfs_post $GPFS_FILE_1 | cut -d '=' -f 2 |cut -d$'\n' -f 2 | cut -d '/' -f 2`
if [ "P" != "$FILE_TYPE" ]; then
    echo "The File1 Type was not a packed file!"
    exit 1
fi
FILE_TYPE=`getfattr -n user.marfs_post $GPFS_FILE_2 | cut -d '=' -f 2 |cut -d$'\n' -f 2 | cut -d '/' -f 2`
if [ "P" != "$FILE_TYPE" ]; then
    echo "The File2 Type was not a packed file!"
    exit 1
fi

if (( 1 == $VERBOSE )); then
    echo "Both files are packed files"
fi

#Now try to find out if the files share the same Object ID
FILE1_INODE=`getfattr -n user.marfs_objid $GPFS_FILE_1 | cut -d '=' -f 2 | cut -d$'\n' -f 2 | cut -d '/' -f 6 | cut -d '.' -f 2`
FILE2_INODE=`getfattr -n user.marfs_objid $GPFS_FILE_2 | cut -d '=' -f 2 | cut -d$'\n' -f 2 | cut -d '/' -f 6 | cut -d '.' -f 2`

if [ "$FILE1_INODE" != "$FILE2_INODE" ]; then
    echo "File1 and File2 inodes were different, meaning they are not a packed file."
    exit 1
fi

if ((1 == $VERBOSE)); then
    echo "They were all part of a single object"
fi

PRE_OFFSET=`getfattr -n user.marfs_post $GPFS_FILE_1 | cut -d '=' -f 2 |cut -d$'\n' -f 2 | cut -d '/' -f 3 | cut -d '.' -f 2`
echo "H" > $MARFS_FILE_1
echo "H" > $LOCAL_FILE_1
sleep $SLEEP_TIME
POST_OFFSET=`getfattr -n user.marfs_post $GPFS_FILE_1 | cut -d '=' -f 2 |cut -d$'\n' -f 2 | cut -d '/' -f 3 | cut -d '.' -f 2`
if (( $PRE_OFFSET == $POST_OFFSET)); then
    echo "The Pre-Offset is $PRE_OFFSET"
    echo "The Post-Offset is $POST_OFFSET"
    echo "These should not be the same, as the Post-Offset is for a new object."
    exit 1
fi

FILE_TYPE=`getfattr -n user.marfs_post $GPFS_FILE_1 | cut -d '=' -f 2 |cut -d$'\n' -f 2 | cut -d '/' -f 2`
if [ "P" == "$FILE_TYPE" ]; then
    echo "The File1 Type was still a packed file!"
    exit 1
fi

FILE1_INODE=`getfattr -n user.marfs_objid $GPFS_FILE_1 | cut -d '=' -f 2 | cut -d$'\n' -f 2 | cut -d '/' -f 6 | cut -d '.' -f 2`
if [ "$FILE1_INODE" == "$FILE2_INODE" ]; then
    echo "File1 and File2 inodes were the same, meaning they are a packed file."
    exit 1
fi

if ((1 == $VERBOSE)); then
    echo "They all remained part of a single object except the changed file"
fi

echo "SUCCESS"
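
The reproducer repeats the same `getfattr | cut` pipeline several times to pull the type and offset out of user.marfs_post. A minimal sketch of how that parsing could be factored into helpers is below. The function names (`xattr_field`, `post_type`, `post_offset`) are hypothetical, the field positions are only those implied by the `cut` calls in the script, and the sample value is made up for illustration and is not a real MarFS attribute.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; field positions mirror the cut pipelines in the
# reproducer above and are not taken from MarFS documentation.

# Print the Nth '/'-separated field of an xattr value.
xattr_field() {
    local value=$1 index=$2
    printf '%s\n' "$value" | cut -d '/' -f "$index"
}

# Object type: field 2 of user.marfs_post ("P" is what the script treats as packed).
post_type() {
    xattr_field "$1" 2
}

# Offset: the part after the first '.' in field 3 of user.marfs_post.
post_offset() {
    xattr_field "$1" 3 | cut -d '.' -f 2
}

# Made-up sample value, for illustration only.
sample='aaa.1/P/bbb.4096/ccc'
post_type "$sample"     # prints P
post_offset "$sample"   # prints 4096
```

On a real system the raw value could probably be read with `getfattr --only-values -n user.marfs_post "$GPFS_FILE_1"`, which avoids having to strip the `# file:` header line and the surrounding quotes.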

wfvining commented 7 years ago

I have a fix in the commit above. Needs integration testing.

wfvining commented 7 years ago

resolved by e510acb21

cadejager commented 6 years ago

Using master, this test now fails again. It actually breaks every item in the pack, so that none of them can be read. I rewrote a file with pftool and that did not cause the issue, so as far as I can tell this is not a production-critical bug.

dejager@stb-fta05:/marfs.dejager/mc/dejager/mHellos$ cat hello.1
goodbye
dejager@stb-fta05:/marfs.dejager/mc/dejager/mHellos$ cat hello.100
goodbye
dejager@stb-fta05:/marfs.dejager/mc/dejager/mHellos$ echo nuget > hello.100
dejager@stb-fta05:/marfs.dejager/mc/dejager/mHellos$ cat hello.1
cat: hello.1: Input/output error
dejager@stb-fta05:/marfs.dejager/mc/dejager/mHellos$ cat hello.101
cat: hello.101: Input/output error
dejager@stb-fta05:/marfs.dejager/mc/dejager/mHellos$ cat hello.100
cat: hello.100: Input/output error
dejager@stb-fta05:/marfs.dejager/mc/dejager/mHellos$
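
The manual check in the transcript above could be scripted along these lines. This is only a sketch, assuming the symptom is a failing read (as with the Input/output errors shown). `check_dir_readable` and `overwrite_and_check` are hypothetical helper names, and the directory is whichever one holds the packed files on the FUSE mount.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: overwrite one file, then confirm that every file in
# the same directory can still be read.

# Return 0 if every regular file in directory $1 can be read, 1 otherwise.
check_dir_readable() {
    local dir=$1 f status=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        if ! cat -- "$f" > /dev/null 2>&1; then
            echo "unreadable: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Overwrite file $1 with content $2, verify the new content, then check
# that the file and all of its siblings are still readable.
overwrite_and_check() {
    local target=$1 content=$2
    printf '%s\n' "$content" > "$target" || return 1
    [ "$(cat -- "$target")" = "$content" ] || return 1
    check_dir_readable "$(dirname -- "$target")"
}
```

Run from the directory in the transcript, `overwrite_and_check hello.100 nuget` would return non-zero if the overwrite left hello.100 or any other file in that directory unreadable.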

jti-lanl commented 6 years ago

I can't reproduce this in the current rdma branch (i.e. rdma branches of erasureUtils, marfs, and pftool).

I suspect this was fixed somewhere between d1b718a5 and 53cecca26 in the marfs rdma lineage.

jti-lanl commented 6 years ago

In conversation, I had mentioned concern about whether renames could also have problems, in whatever branch of code was being used to reproduce the problem. That was because of issue #200.

So, I've also tested renames of packed files (in the rdma branches), in the context of #171, and they seem problem-free, as well. A renamed packed file retains its identity in the packed file. If it is subsequently overwritten through fuse, the correct thing happens, acquiring new contents in a distinct object, and corresponding updates to MD.

cadejager commented 6 years ago

Fortunately, renaming does not cause the issue.

jti-lanl commented 6 years ago

Can't reproduce the problem in marfs 1.10.

jti-lanl commented 6 years ago

Can't reproduce this in 1.10