EUDAT-DPMT / eudat.accounting.client

Python-based command line client to interact with an accounting server
BSD 2-Clause "Simplified" License

accounting numbers are incorrect with compound resources #6

Open cookie33 opened 6 years ago

cookie33 commented 6 years ago

Hi,

We have a compound resource with files in both cache and archive resources. We want them counted only once per compound resource. The accounting script gives:

2017-12-20 07:00:16,217 - StorageAccounting                         - INFO - Storage space for collection: /vzSARA1/home/rods#replixRZG: 42
2017-12-20 07:00:16,390 - StorageAccounting                         - INFO - number of objects for collection /vzSARA1/home/rods#replixRZG: 14

The ils -r gives:

ils -r  /vzSARA1/home/rods#replixRZG
/vzSARA1/home/rods#replixRZG:
  1.txt
  2.txt
  3.txt
  4.txt
  5.txt
  6.txt
  7.txt
  8.txt
  t5.txt
  test00.txt
  test01.txt
  C- /vzSARA1/home/rods#replixRZG/1
/vzSARA1/home/rods#replixRZG/1:
  C- /vzSARA1/home/rods#replixRZG/1/11
/vzSARA1/home/rods#replixRZG/1/11:
  C- /vzSARA1/home/rods#replixRZG/1/11/111
/vzSARA1/home/rods#replixRZG/1/11/111:
  1.txt

So there are 12 files, not 14.

The ils -rl gives:

ils -rl  /vzSARA1/home/rods#replixRZG
/vzSARA1/home/rods#replixRZG:
  rods              1 eudat;eudatPnfs            2 2015-10-20.16:02 & 1.txt
  rods              1 eudat;eudatPnfs            3 2015-10-20.16:07 & 2.txt
  rods              1 eudat;eudatPnfs            3 2015-10-20.16:57 & 3.txt
  rods              1 eudat;eudatPnfs            3 2015-10-21.14:35 & 4.txt
  rods              0 eudat;eudatCache            3 2015-10-26.15:01 & 5.txt
  rods              1 eudat;eudatPnfs            3 2015-10-26.15:01 & 5.txt
  rods              1 eudat;eudatPnfs            3 2015-10-21.14:33 & 6.txt
  rods              1 eudat;eudatPnfs            5 2015-10-21.14:32 & 7.txt
  rods              1 eudat;eudatPnfs            3 2015-10-21.14:22 & 8.txt
  rods              0 demoResc            4 2015-12-04.17:45 & t5.txt
  rods              0 eudat;eudatCache            3 2015-10-20.12:33 & test00.txt
  rods              1 eudat;eudatPnfs            3 2015-10-20.16:01 & test01.txt
  C- /vzSARA1/home/rods#replixRZG/1  
/vzSARA1/home/rods#replixRZG/1:
  C- /vzSARA1/home/rods#replixRZG/1/11  
/vzSARA1/home/rods#replixRZG/1/11:
  C- /vzSARA1/home/rods#replixRZG/1/11/111  
/vzSARA1/home/rods#replixRZG/1/11/111:
  rods              0 eudat;eudatCache            2 2015-10-26.20:29 & 1.txt
  rods              1 eudat;eudatPnfs            2 2015-10-26.20:29 & 1.txt

There are 2 files with a replica in both the cache and the archive resource, both part of the eudat compound resource. So they are counted twice, for both bytes and file number.
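The double counting above can be reproduced without an iRODS server. The sketch below (illustrative only; the sample listing and file names mirror the `ils -rl` output in this issue) deduplicates replicas by file name so each logical file is counted once. In real use the key would also need the collection path, since the same file name can occur in different collections.

```shell
#!/bin/sh
# Simulate a few lines of `ils -rl` output: 5.txt and 1.txt each have
# two replicas (cache + archive), 6.txt has one.
cat > /tmp/ils_sample.txt <<'EOF'
  rods              0 eudat;eudatCache            3 2015-10-26.15:01 & 5.txt
  rods              1 eudat;eudatPnfs             3 2015-10-26.15:01 & 5.txt
  rods              1 eudat;eudatPnfs             3 2015-10-21.14:33 & 6.txt
  rods              0 eudat;eudatCache            2 2015-10-26.20:29 & 1.txt
  rods              1 eudat;eudatPnfs             2 2015-10-26.20:29 & 1.txt
EOF
# Fields: owner, replica#, resource, size, date, status, name ($7).
# Count the first replica of each name only, and sum its size once.
awk '!seen[$7]++ { files++; bytes += $4 }
     END { print files " files, " bytes " bytes" }' /tmp/ils_sample.txt
```

Counting every line instead would report 5 files and 13 bytes; deduplicated, the listing is 3 files and 8 bytes, which is the behaviour the accounting should have.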

We do not want this. Because of this double counting, our numbers for some accounting metrics are off by more than 20 TiB.

Please fix this.

raphael-ritz commented 6 years ago

Hi Robert, I'm not sure I fully understand the issue you are raising, but as far as I can tell this has been discussed before, and the agreement at the time was to do it like this. FWIW: I've only ported the accounting script, which had already existed for quite some time, to work with the new accounting service, without changing any logic.

I suggest bringing that up with Pavel and Johannes.

The discussion can continue here or elsewhere; I don't care. If there is consensus that something should change, let me know.

BTW: using 'addRecord' - or just cURL - you can submit whatever you deem appropriate.

Cheers,

Raphael

cookie33 commented 6 years ago

Hi,

This is why our own accounting script works as follows:

#!/bin/bash
#set -x
#

# source paths
for script in /etc/profile.d/*.sh ; do . $script ; done

# setup for user rods
HOME=/home/rods
cd ${HOME}

# parameters
curDateYMD=`date +%F`
curDate=`date +%s -d "$curDateYMD 23:00:00" `
zoneName=`/usr/bin/imiscsvrinfo | grep rodsZone | cut -f2 -d=`
# which filesystems to check
collectionNamelist=`ils /$zoneName | grep "C-" | grep -v -e trash -e replicate | sed 's/  C- //'`
# which resources to exclude
resourceList=`ilsresc | grep "compound" | sed 's/:compound//'`
if [ -n "${resourceList}"  ]
then 
    excludedResource=`ilsresc $resourceList | grep unix |  awk '{ print $2 }' | awk -F: -v a=$resourceList '{print a";"$1}'`
fi 

# where the data should be dumped to
logFile=${HOME}/resources/$curDate-$zoneName-usage.csv

# 
# run the actual query; also add the zone for each user.
#format: timestamp,zoneName,user#zone,collection,number of files,size(bytes)

rm -f $logFile 
for collectionName in $collectionNamelist
do
        if [ -n "${excludedResource}"  ]
        then
                # we have a resource group. So exclude the cache resource.
                iquest "$curDate,$zoneName,%s#%s,$collectionName,%s,%s" "SELECT DATA_OWNER_NAME,DATA_OWNER_ZONE,count(DATA_SIZE),sum(DATA_SIZE) WHERE COLL_NAME like '$collectionName/%' AND DATA_RESC_HIER NOT like '$excludedResource'" 2>/dev/null | sort >> $logFile 
        else
                # no resource group. All resources..
                iquest "$curDate,$zoneName,%s#%s,$collectionName,%s,%s" "SELECT DATA_OWNER_NAME,DATA_OWNER_ZONE,count(DATA_SIZE),sum(DATA_SIZE) WHERE COLL_NAME like '$collectionName/%'" 2>/dev/null | sort >> $logFile 
        fi
done

It writes the results to a CSV file. We later process the CSV file and send the output to the original EUDAT accounting website.
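As a sketch of that post-processing step (illustrative only; the sample rows, timestamps, and user names below are made up, but the columns follow the script's documented format `timestamp,zoneName,user#zone,collection,files,bytes`), per-user rows can be rolled up into one total per collection:

```shell
#!/bin/sh
# Sample of the CSV the script above writes: two users in one collection.
cat > /tmp/usage_sample.csv <<'EOF'
1513724400,vzSARA1,rods#replixRZG,/vzSARA1/home,12,40
1513724400,vzSARA1,alice#vzSARA1,/vzSARA1/home,3,10
EOF
# Sum file counts ($5) and byte counts ($6) per collection ($4).
awk -F, '{ files[$4] += $5; bytes[$4] += $6 }
     END { for (c in files) print c "," files[c] "," bytes[c] }' /tmp/usage_sample.csv
```

With the sample rows this prints one aggregated line per collection (`/vzSARA1/home,15,50`), which is the shape of record the accounting service expects per collection.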