anagainaru / HPC_IOpatterns

Extracting I/O patterns for different HPC application types.

Duplicate entries in the csv files #42

Open anagainaru opened 3 years ago

anagainaru commented 3 years ago

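Find a way to ignore duplicate entries and keep only the ensemble jobs that share the same job ID. The counts below show that Castro.csv contains both exact duplicate rows and distinct rows that share the same values in fields 15 and 16: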

$ cat Castro.csv | cut -d"," -f15,16 | sort -u | wc -l
    1049
$ cat Castro.csv | sort -u | wc -l
    1114
$ cat Castro.csv | wc -l
    1321
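One possible starting point, sketched under assumptions: if a "duplicate entry" means a row that is byte-for-byte identical to an earlier row, it can be dropped with awk while rows that merely share some fields (such as a job ID) are kept, because they differ elsewhere on the line. The function name `dedupe_rows` is hypothetical, and the issue does not say which column holds the job ID, so this sketch does not key on any particular field.

```shell
# Hypothetical helper (not part of the repository): print each distinct line
# once, in its original order. A line is dropped only if an identical line
# was already seen, so rows that share a job ID but differ in any other
# field are all kept.
# Assumption: one record per line (no quoted fields with embedded newlines).
dedupe_rows() {
    # seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++
    # is true only for the first occurrence, and awk prints that line.
    awk '!seen[$0]++' "$@"
}
```

A possible use would be `dedupe_rows Castro.csv > Castro.dedup.csv`, where the output file name is only an example. Unlike `sort -u`, this keeps the rows in their original order, so a header line, if the file has one, stays first.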