tashrifbillah closed this 3 months ago
Keep in mind the depth factor. The data inside the two folders is many levels deep. If diskusage-logger is reporting lower sizes because we asked it to explore only up to a shallower depth, i.e. fewer files, I would like you to investigate and confirm.
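One way to test the depth question in isolation is a throwaway tree. The sketch below assumes GNU du; the directory names, file names, and the helper top_total are made up for the demo and are not part of diskusage-logger. As far as the GNU documentation describes it, --max-depth only limits which directories get a line of output, while each printed total still counts everything beneath it, so the top-level number should not change with depth.

```shell
#!/usr/bin/env bash
# Assumes GNU du. All paths and names here are hypothetical demo values.

# Print the byte total du reports for the top-level directory ($1)
# when called with --max-depth $2. du lists the argument itself last.
top_total() { du -b --max-depth "$2" "$1" | tail -n 1 | cut -f1; }

tmp=$(mktemp -d)
mkdir -p "$tmp/a/b/c"
head -c 1000 /dev/zero > "$tmp/a/b/c/deep.bin"   # file three levels down
head -c 500  /dev/zero > "$tmp/top.bin"          # file at the top level

# The total should be identical for every depth; only the number of
# printed lines differs.
for depth in 0 1 3; do
    echo "depth=$depth total=$(top_total "$tmp" "$depth")"
done
rm -rf "$tmp"
```

If the three totals match on the real folders as well, depth is unlikely to explain the mismatch, and the flags that change how sizes are measured are the next thing to compare.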
In logdirsizes, the du command is used as such:
du --time -b --max-depth $depth $dir
I ran this command and it produced output similar to the table inside report-data-20240629.html. I will read up on the du command to try to diagnose this discrepancy.
The difference is usually caused by powers of 1024 vs 1000. The command I used uses powers of 1000.
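To put a number on the 1024 vs 1000 effect, here is a small sketch. The byte count is an illustrative value, not a figure taken from the report, and the helpers to_tb and to_tib are hypothetical names.

```shell
#!/usr/bin/env bash
# Convert a raw byte count to terabytes (powers of 1000)
# and tebibytes (powers of 1024), rounded to two decimals.
to_tb()  { awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1000^4 }'; }
to_tib() { awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1024^4 }'; }

bytes=15000000000000            # illustrative value only
echo "$(to_tb "$bytes") TB"     # prints 15.00 TB
echo "$(to_tib "$bytes") TiB"   # prints 13.64 TiB
```

At the terabyte scale the two conventions differ by roughly 10%, so a gap much larger than that would need another explanation.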
Posting for record:
-b, --bytes
       equivalent to '--apparent-size --block-size=1'

--apparent-size
       print apparent sizes, rather than disk usage; although the apparent size is
       usually smaller, it may be larger due to holes in ('sparse') files, internal
       fragmentation, indirect blocks, and the like

--si   like -h, but use powers of 1000 not 1024

-h, --human-readable
       print sizes in human readable format (e.g., 1K 234M 2G)
I read the documentation more. It seems that --apparent-size is making the difference between my output and diskusage-logger's output. Also, I remember we adopted du --si -sh to match ERIS' billing long ago.
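For the record, a minimal way to see apparent size and disk usage diverge, assuming GNU coreutils (du, truncate, mktemp). The file name and the helpers apparent_bytes and allocated_bytes are hypothetical. A sparse file is the extreme case. Real data may differ in either direction, as the man page excerpt notes.

```shell
#!/usr/bin/env bash
# Assumes GNU coreutils. File and function names are demo values.

# Apparent size in bytes: -b is '--apparent-size --block-size=1'.
apparent_bytes()  { du -b "$1" | cut -f1; }
# Allocated disk usage in bytes (du's default measure, 1-byte blocks).
allocated_bytes() { du --block-size=1 "$1" | cut -f1; }

tmp=$(mktemp -d)
truncate -s 1M "$tmp/sparse.bin"   # 1 MiB long, no data written (a hole)

echo "apparent:  $(apparent_bytes "$tmp/sparse.bin")"
echo "allocated: $(allocated_bytes "$tmp/sparse.bin")"
rm -rf "$tmp"
```

Running both helpers over the same directory shows how much of a gap between two reports comes from -b alone, before unit conventions come into play.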
Hi @cjennings, in PR #13, we worked on appearance so far. Now, let's talk algorithm. Below is the report of the /data/predict1/data_from_nda/Pronet and /data/predict1/data_from_nda/Prescient folders. The 15 and 9 TB sizes appear in two places in report-data-20240629.html. But a manual
df -h --si
reports much bigger sizes. Given that the same command is used in the diskusage-logging program, can you look into the mismatch? Is your addition algorithm missing something?