plfs / plfs-core

LANL no longer develops PLFS. Feel free to fork and develop as you wish.

Empty index files when using fopen() #171

Closed johnbent closed 11 years ago

johnbent commented 11 years ago

After looking into MILC's I/O code, I found that it uses fopen(). When it is time to save checkpoints, many ranks open the same file with fopen() and then write to it. With a simple program that does the same thing, I was able to reproduce the no-index-file error: no index dropping is created in the backends, and plfs_map shows that the file has 0 entries.

The reproducer is attached. The attached main.c is a simple program that uses fopen() to open a file and write some data to it.

#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    FILE *fp;

    if ( argc != 3 ) {
        printf("Usage: %s filepath openmode\n", argv[0]);
        printf("filepath and openmode will be passed to"
                " fopen() directly.\n");
        exit(1);
    }

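    /* Every rank opens the same path with the mode given on the command line. */
    fp = fopen(argv[1], argv[2]);
    if ( fp == NULL ) {
        perror("fopen");
        exit(1);
    }
    int i;
    for ( i = 0 ; i < 1000 ; i++ ) {
        fwrite("abc", 1, 3, fp);   /* 1000 x 3 bytes = 3000 bytes in total */
    }
    fclose(fp);
    return 0;
}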

Do the following to run the reproducer:

$ make
$ # modify plfs dir path in runtest.sh
$ ./runtest.sh

You will see something like this (plfs_map output of two files):

Application 221459 resources: utime ~0s, stime ~0s
# Index of /lustre/lscratch/.plfs_store_n1//jun/milcdumps/Feb04152457548257787.1.w.dat <------------------ generated with 1PE
# Data Droppings
#0 /lustre/lscratch/.plfs_store_n1//jun/milcdumps/Feb04152457548257787.1.w.dat/hostdir.10//dropping.data.1360016697.852671.nid00028.26304
# Entry Count: 1<------------------ have index
# ID Logical_offset Length Begin_timestamp End_timestamp  Logical_tail ID.Chunk_offset 
    0 w                0     3000 1360016697.8543920516967773 1360016697.8964900970458984             2999  [0.         0]
/plfs/lscratch_n1/jun/milcdumps/Feb04152501960829664.16.w.dat w
Application 221460 resources: utime ~0s, stime ~0s
# Index of /lustre/lscratch/.plfs_store_n1//jun/milcdumps/Feb04152501960829664.16.w.dat <------------------ generated with 16PE
# Data Droppings
# Entry Count: 0<------------------ no index
# ID Logical_offset Length Begin_timestamp End_timestamp  Logical_tail ID.Chunk_offset 

If you do ls on the backend of the 16PE file:

sm-login1 workdir/myfopens> ls /lustre/lscratch/.plfs_store_n1//jun/milcdumps/Feb04152501960829664.16.w.dat/* -l
-rw-r--r-- 1 jun jun    0 Feb  4 15:25 /lustre/lscratch/.plfs_store_n1//jun/milcdumps/Feb04152501960829664.16.w.dat/version-tag.2.2.3-svn.exported-dat.1.0-chk.3423

/lustre/lscratch/.plfs_store_n1//jun/milcdumps/Feb04152501960829664.16.w.dat/hostdir.10:
total 4
-rw-r--r-- 1 jun jun 3000 Feb  4 15:25 dropping.data.1360016702.289992.nid00028.26324  <------------------ no index dropping in this folder

/lustre/lscratch/.plfs_store_n1//jun/milcdumps/Feb04152501960829664.16.w.dat/meta:
total 0
-rw-r--r-- 1 jun jun 0 Feb  4 15:25 3000.48000.1360016702.989586.nid00028
sm-login1 workdir/myfopens> 

Some observations:

  1. Running with 1 PE was fine. Running with 2 or more PEs causes the no-index error.
  2. Opening with fopen(path, "a") was fine with 2+ PEs. Opening with fopen(path, "w") with 2+ PEs causes the no-index error.
  3. When running MILC, it sometimes failed and sometimes did not, so there may be a race condition.
  4. The same reproducer cannot reproduce the error on RRZ.
  5. Aaron pointed out that Smog has only one backend but RRZ has multiple. This might be related.
johnbent commented 11 years ago

One more observation: If you use

fd = open( argv[1], O_WRONLY|O_CREAT, 0600 ); ... write()

everything is fine (the index is present), even with 2+ PEs on Smog.

///////////////////////////// Using open() ////////////////////////////////////////

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
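    int fd;

    if ( argc < 2 ) {
        printf("Usage: %s filepath\n", argv[0]);
        exit(1);
    }
    /* No O_TRUNC and no O_APPEND here, unlike fopen() with "w" or "a". */
    fd = open( argv[1], O_WRONLY|O_CREAT, 0600 );
    if ( fd == -1 ) {
        printf("open error\n");
        exit(1);
    }
    int i;
    for ( i = 0 ; i < 1000 ; i++ ) {
        write(fd, "abc", 3);
    }
    close(fd);
    return 0;
}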
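
One difference worth isolating: POSIX defines fopen(path, "w") as opening with O_WRONLY|O_CREAT|O_TRUNC and fopen(path, "a") with O_WRONLY|O_CREAT|O_APPEND, while the open() call above uses neither O_TRUNC nor O_APPEND. Both working cases therefore skip truncation. Whether truncation is the actual trigger is not established in this thread. The sketch below (hypothetical helper names) opens a file with the flags that correspond to an fopen() mode, so the reproducer can be rerun with "w" semantics but without stdio.

```c
#include <fcntl.h>
#include <string.h>

/* Map an fopen() mode string to the open() flags POSIX lists for it.
 * A 'b' in the mode is ignored. Returns -1 for an unrecognized mode. */
int fopen_mode_to_flags(const char *mode)
{
    int plus = (strchr(mode, '+') != NULL);
    switch (mode[0]) {
    case 'r':
        return plus ? O_RDWR : O_RDONLY;
    case 'w':
        return (plus ? O_RDWR : O_WRONLY) | O_CREAT | O_TRUNC;
    case 'a':
        return (plus ? O_RDWR : O_WRONLY) | O_CREAT | O_APPEND;
    default:
        return -1;
    }
}

/* Open `path` with the flags fopen(path, mode) would use, without the
 * stdio buffering. 0666 is the creation mode fopen() uses before the
 * umask is applied. Returns a file descriptor, or -1 on error. */
int open_like_fopen(const char *path, const char *mode)
{
    int flags = fopen_mode_to_flags(mode);
    if (flags < 0) {
        return -1;
    }
    return open(path, flags, 0666);
}
```

Replacing the open() call in the program above with open_like_fopen(argv[1], "w") would show whether O_TRUNC alone is enough to lose the index, or whether stdio buffering also plays a part.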
johnbent commented 11 years ago

Link to reproducer:

https://dl.dropbox.com/u/3442222/github_attachments/zero-index.tar.gz

brettkettering commented 11 years ago

Try reproducing on Cielito while Smog's Lustre is unavailable.

atorrez commented 11 years ago

I have been unable to reproduce this on Cielito. I have used version 2.4 and master (as of 8/21) with various PE and backend counts, with the same outcome each time, as shown below:

Starting PLFS on ct-login1.localdomain:/users/atorrez/plfs.atorrez/lscratch1/n1
/users/atorrez/plfs.atorrez/lscratch1/n1/atorrez/Aug22075645545091855.1.w.dat w
# Index of /lustre/lscratch1/atorrez/.plfs_store_n1_1//atorrez/Aug22075645545091855.1.w.dat
# Data Droppings
#0 /lustre/lscratch1/atorrez/.plfs_store_n1_1//atorrez/Aug22075645545091855.1.w.dat/hostdir.33//dropping.data.1377179805.564607.ct-login1.localdomain.28671
# Entry Count: 1
# ID Logical_offset Length Begin_timestamp End_timestamp  Logical_tail ID.Chunk_offset
    0 w                0     3000 1377179805.5659279823303223 1377179805.5662109851837158             2999  [0.         0]
/users/atorrez/plfs.atorrez/lscratch1/n1/atorrez/Aug22075649597507779.16.w.dat w
# Index of /lustre/lscratch1/atorrez/.plfs_store_n1_1//atorrez/Aug22075649597507779.16.w.dat
# Data Droppings
#0 /lustre/lscratch1/atorrez/.plfs_store_n1_1//atorrez/Aug22075649597507779.16.w.dat/hostdir.33//dropping.data.1377179809.603314.ct-login1.localdomain.28675
# Entry Count: 1
# ID Logical_offset Length Begin_timestamp End_timestamp  Logical_tail ID.Chunk_offset
    0 w                0     3000 1377179809.6044900417327881 1377179809.6047160625457764             2999  [0.         0]

Note: I modified the script to handle mounting and unmounting of PLFS and removed the aprun. I then logged into an internal login node and aprun'd that script. The script is shown below:

#!/bin/bash

plfs /users/atorrez/plfs.atorrez/lscratch1/n1
filemodelist="w"
nplist="1 16"

lustredir=/lustre/lscratch/jun/dumps
plfsdir_n1=/plfs/lscratch_n1/jun/milcdumps
plfsdir_n1=/users/atorrez/plfs.atorrez/lscratch1/n1/atorrez
dirlist="$plfsdir_n1"
for filemode in $filemodelist
do
    for filedir in $dirlist
    do
        for np in $nplist
        do
            runid=$(date '+%b%d%H%M%S%N')
            filename=$runid.$np.$filemode.dat
            filepath=$filedir/$filename
            echo $filepath $filemode
            ./fopen.x $filepath $filemode
            plfs_map $filepath
            sleep 4
        done
    done
done
fusermount -u /users/atorrez/plfs.atorrez/lscratch1/n1

atorrez commented 11 years ago

Closing this because I cannot reproduce it.