Open calestyo opened 2 years ago
Hi Chris,
have you changed the billing format?
$ ./dcache/sbin/dcache-billing-indexer -all
Indexing /home/tigran/eProjects/dcache/packages/system-test/target/dcache/var/billing/2022/02/billing-2022.02.16
Indexing /home/tigran/eProjects/dcache/packages/system-test/target/dcache/var/billing/2022/02/billing-2022.02.17
$
Hey.
Yes I did,... but shouldn't it pick that up automatically?
Cheers, Chris
There are two places: billing format and indexer format:
billing.text.format.xxx=
billing.parser.format
Have you updated both?
I have set the latter to the former via:
billing.parser.format!door-request-info-message=${billing.text.format.door-request-info-message}
billing.parser.format!pool-hit-info-message=${billing.text.format.pool-hit-info-message}
billing.parser.format!storage-info-message=${billing.text.format.storage-info-message}
billing.parser.format!mover-info-message=${billing.text.format.mover-info-message}
billing.parser.format!remove-file-info-message=${billing.text.format.remove-file-info-message}
billing.parser.format!warning-pnfs-file-info-message=${billing.text.format.warning-pnfs-file-info-message}
I guess that way doesn't work :)
Uhm? So it's a bug, or do I misuse it? :D
I don't remember the details; however, if it were that simple, we wouldn't have defined it twice...
Just to clarify -- in case there's some confusion.
The configuration properties that start with billing.text.format. describe how new records are to be written. You should be able to update these configuration property values to customise what information is recorded, and how that information is represented.
FWIW, I think it's very unlikely we (dCache.org) will ever modify the default values. The risk is too great that we inadvertently break a parser written by some third-party.
Since dCache v3.0.0 (released ~November 2016), the billing files contain special comment lines that define the format used for the different record types. These are lines that start with a double-comment sequence (##). Parsers can use this to learn which format billing was configured to use when parsing lines.
The dcache-billing-indexer command should use these comments when parsing the file.
However, one might want to parse pre-3.0.0 billing files: those written before support for the double-comment sequence was introduced. The billing.parser.format family of configuration properties allows you to configure how the parser should understand the records if there are no lines starting with ##.
If you're parsing lines written by dCache v3.0.0 or later, the billing.parser.format family of configuration properties should have no effect, as the billing files should be self-describing.
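To make the precedence described above concrete, here is a minimal Python sketch of how a parser could choose its formats. It is not the dcache-billing-indexer code. The function name `formats_for_file` and the assumed header layout (`## <record-type> <format>`) are hypothetical and used only for illustration; the real header syntax may differ.

```python
HEADER_PREFIX = "##"


def formats_for_file(lines, fallback_formats):
    """Return the record formats to use for one billing file.

    Formats declared in '##' header lines take precedence. The fallback
    mapping (the role billing.parser.format plays in the explanation
    above) is only used for record types the file does not declare.
    """
    formats = dict(fallback_formats)
    for line in lines:
        if not line.startswith(HEADER_PREFIX):
            continue  # an ordinary billing record, not a format header
        # ASSUMPTION: the header body is "<record-type> <format>".
        body = line[len(HEADER_PREFIX):].strip()
        record_type, _, fmt = body.partition(" ")
        if record_type and fmt:
            formats[record_type] = fmt.strip()
    return formats
```

With this logic, a file that carries header lines is parsed with its own formats, and the configured fallback only matters for files (or record types) without a header.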
Hey.
Since dCache v3.0.0 (released ~November 2016)
All our currently available billing files range back to 2021-01-01, where we already ran something way beyond 3.x ... and the header lines you mention are in place.
So what you describe shouldn't apply to us anyway, and thus the bug is likely somewhere else?
Cheers, Chris.
Hi Chris,
My take: if changing the billing.parser.format family of configuration properties fixes a problem parsing billing files written post v3.0.0 then there's a bug somewhere.
So, did modifying billing.parser.format "fix" the problem?
It wasn't clear to me from your description.
Also, just to eliminate something: when you mentioned that you changed the format, you did this in dcache.conf or the layout file, right? You didn't edit the files in the /usr/share/dcache/defaults directory.
Cheers, Paul.
Uhm... I'm a bit confused now ^^
So what I did was the following: I changed our previously set and already custom:
billing.text.format.mover-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$mMsg:$\\t$[$cellType$:$cellName.cell$@$cellName.domain$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$[$protocol$]$\\t$[$initiator$]$\\t$$$$if(p2p)$p2p$else$no-p2p$endif$$$$\\t$$$$if(created)$upload$else$download$endif$$$$\\t$$$$transferred$B$\\t$$$$meanReadBandwidth$MiB/s$\\t$$$$meanWriteBandwidth$MiB/s$\\t$$$$connectionTime$ms$\\t$$$$readActive$ms$\\t$$$$readIdle$ms$\\t$$$$writeActive$ms$\\t$$$$writeIdle$ms$\\t$[$transferPath$]$\\t$[$rc$:"$message$"]
to:
billing.text.format.mover-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$mMsg:$\\t$[$cellType$:$cellName.cell$@$cellName.domain$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$[$protocol$]$\\t$[$initiator$]$\\t$$$$if(p2p)$p2p$else$no-p2p$endif$$$$\\t$$$$if(created)$upload$else$download$endif$$$$\\t$$$$transferred$B$\\t$$$$meanReadBandwidth$B/s$\\t$$$$meanWriteBandwidth$B/s$\\t$$$$connectionTime$ms$\\t$$$$readActive$ms$\\t$$$$readIdle$ms$\\t$$$$writeActive$ms$\\t$$$$writeIdle$ms$\\t$[$transferPath$]$\\t$[$rc$:"$message$"]
The only difference is the string literal MiB changed to B; there was a documentation error in dCache that was fixed a while ago, so I adapted that.
Admittedly, I only started checking the cron mails recently... so it may well be that even the old setting already gave errors.
So, did modifying billing.parser.format "fix" the problem?
What exactly do you mean with "modifying"? Literally setting the value of billing.parser.format!* instead of via variable assignment?
As in:
billing.parser.format!door-request-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$drMsg:$\\t$[$cellType$:$cellName.cell$@$cellName.domain$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$[$clientChain$]$\\t$$$$transactionTime$ms$\\t$[$transferPath$]$\\t$[$rc$:"$message$"]
billing.parser.format!pool-hit-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$phMsg:$\\t$[$cellType$:$cellName.cell$@$cellName.domain$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$[$protocol$]$\\t$$$$if(cached)$cached$else$not-cached$endif$$$$\\t$[$transferPath$]$\\t$[$rc$:"$message$"]
billing.parser.format!storage-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$sMsg:$\\t$[$cellType$:$cellName.cell$@$cellName.domain$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$$$$transferTime$ms$\\t$[$rc$:"$message$"]
billing.parser.format!mover-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$mMsg:$\\t$[$cellType$:$cellName.cell$@$cellName.domain$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$[$protocol$]$\\t$[$initiator$]$\\t$$$$if(p2p)$p2p$else$no-p2p$endif$$$$\\t$$$$if(created)$upload$else$download$endif$$$$\\t$$$$transferred$B$\\t$$$$meanReadBandwidth$B/s$\\t$$$$meanWriteBandwidth$B/s$\\t$$$$connectionTime$ms$\\t$$$$readActive$ms$\\t$$$$readIdle$ms$\\t$$$$writeActive$ms$\\t$$$$writeIdle$ms$\\t$[$transferPath$]$\\t$[$rc$:"$message$"]
billing.parser.format!remove-file-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$rfMsg:$\\t$[$cellType$:$cellName.cell$@$cellName.domain$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$[$rc$:"$message$"]
billing.parser.format!warning-pnfs-file-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$wpfMsg:$\\t$[$cellType$:$cellName.cell$@$cellName.domain$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$[$transferPath$]$\\t$[$rc$:"$message$"]
?
Just tried that, and it still leads to:
# /etc/cron.daily/dcache
capturing group name does not start with a Latin letter near index 15
(?<date>.*?)(?<\t>.*?)\QsMsg:\E(?<\t>.*?)\Q[\E(?<cellType>pool)\Q:\E(?<cellNameXcell>.*?)\Q@\E(?<cellNameXdomain>.*?)\Q:\E(?<type>(?:re)?store)\Q]\E(?<\t>.*?)\Q[\E(?<session>.*?)\Q]\E(?<\t>.*?)\Q[\E(?<pnfsid>[0-9A-F]{24}(?:[0-9A-F]{12})?)\Q:\E(?<path>.*?)\Q]\E(?<\t>.*?)(?<filesize>-?\d+)\QB\E(?<\t>.*?)\Q[\E(?:(?<storageXstorageClass>.*?)\Q@\E(?<storageXhsm>.*?)|\Q<unknown>\E)\Q]\E(?<\t>.*?)\Q[\E(?<subjectXloginName>.*?)\Q]\E(?<\t>.*?)\Q[\E(?<subjectXdn>.*?)\Q]\E(?<\t>.*?)\Q[[\E(?<subjectXprimaryFqan>.*?)\Q]:[\E(?<subjectXfqans>.*?)\Q]]\E(?<\t>.*?)\Q[\E(?<subjectXuserName>.*?)\Q]\E(?<\t>.*?)\Q[\E(?<subjectXuid>.*?)\Q]\E(?<\t>.*?)\Q[\E(?<subjectXprimaryGid>.*?)\Q:\E(?<subjectXgids>.*?)\Q]\E(?<\t>.*?)(?<queuingTime>-?\d+)\Qms\E(?<\t>.*?)(?<transferTime>-?\d+)\Qms\E(?<\t>.*?)\Q[\E(?<rc>-?\d+)\Q:"\E(?<message>.*?)\Q"]\E
^
COMMANDS:
   -all [-fpp=PROP] [-dir=BASE]
          (Re)index all billing files.
   -compress FILE...
          Compress FILE.
   -decompress FILE...
          Decompress FILE.
   -find [-files|-json|-yaml] [-dir=BASE] [-since=DATE] [-until=DATE] [-f=FILE] [SEARCHTERM]...
          Output billing entries that contain SEARCHTERM. Valid search terms are
          path, pnfsid, dn and path prefixes of those. Optionally output names
          of billing files that might contain the search term. If no search term
          is provided, all entries are output.
   -index [-fpp=PROP] FILE...
          Create index for FILE.
   -yesterday [-compress] [-fpp=PROP] [-dir=BASE] [-flat=BOOL]
          Index yesterday's billing file. Optionally compresses the billing file
          after indexing it.
OPTIONS:
   -dir=BASE
          Base directory for billing files. Default is taken from dCache
          configuration.
   -flat=BOOLEAN
          Chooses between flat or hierarchical directory layout. Default is
          taken from dCache configuration.
   -fpp=PROP
          The false positive probability expressed as a value in (0;1]. The
          default is 0.01.
And I did that in dcache.conf.
I do in fact modify some of the defaults files (because of the long-standing issue #3309), too, but that should be completely unrelated.
Cheers, Chris.
Anything new on this? It still fails in 9.1.2, and it's quite clearly an issue in how the regexp is generated: ?<\t> is not a valid capture group name.
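The failure mode can be reproduced in isolation. The sketch below is a deliberately naive template-to-regex converter written in Python, not dCache's actual (Java) code; `template_to_regex` and its `literal_escapes` parameter are hypothetical names. Python's `(?P<name>...)` stands in for Java's `(?<name>...)`. It shows that treating every `$...$` expression as an attribute name turns `$\t$` into an invalid group name, and that mapping such escapes to literal text first avoids the error. Whether the real fix should look like this is an assumption.

```python
import re


def template_to_regex(template, literal_escapes=None):
    """Convert a '$name$' style template into a regex with named groups.

    Expressions listed in literal_escapes are emitted as literal text
    instead of being turned into capture groups.
    """
    literal_escapes = literal_escapes or {}
    out = []
    # After splitting on '$', even-indexed parts are literal text and
    # odd-indexed parts are the expressions between two '$' signs.
    for i, part in enumerate(template.split("$")):
        if i % 2 == 0:
            out.append(re.escape(part))
        elif part in literal_escapes:
            out.append(re.escape(literal_escapes[part]))
        else:
            out.append("(?P<%s>.*?)" % part)
    return "".join(out)


template = r"$date$$\t$sMsg:"

# Naive conversion: '\t' becomes a group name, which the regex engine rejects.
try:
    re.compile(template_to_regex(template))
except re.error as exc:
    print("naive conversion failed:", exc)

# Treating '\t' as a literal tab yields a pattern that compiles and matches.
fixed = re.compile(template_to_regex(template, {r"\t": "\t"}))
print(fixed.fullmatch("2022.02.16\tsMsg:").group("date"))
```

The first call fails at compile time, before any billing line is read, which is consistent with the indexer aborting and printing its usage text.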
Hey.
In 7.2.10, /usr/sbin/dcache-billing-indexer, as invoked by /etc/cron.daily/dcache, fails like that:
Cheers, Chris