dCache / dcache

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogenous server nodes, under a single virtual filesystem tree with a variety of standard access methods
https://dcache.org

new post 2.13 options/changes not mentioned in the release notes #2321

Open calestyo opened 8 years ago

calestyo commented 8 years ago

Hi.

While diffing the 2.13 defaults files against the 2.15.3 defaults files (which unfortunately still proves necessary, as options are added or default values changed without this being documented in the release notes :-( at least in a few cases), I noticed a number of undocumented changes. The following have potentially bigger impact and should therefore be mentioned in the release notes of the upcoming release:

Most grave: the default value of billing.text.format.door-request-info-message changed, replacing $client$ with $clientChain$, which, AFAIU, actually leads to different output.

I'm actually in favour of changing the default when you add new fields; I also think you should have added p2p to the default when that got introduced. If sites really want to stick with the old format, then they should have to change their settings. But it should of course only happen at a new major version and(!) be documented.

gplazma.x509.use-policy-principals I've already mentioned in a separate ticket...

There are also webdav.enable.owncloud and webdav.owncloud.door, which do not really seem to be documented at all (neither in the release notes nor in the defaults file). Also, I'm a bit sceptical about this being enabled by default. How many production sites actually run owncloud? Any? More enabled code = more potential attack surface.

There were a few others, but since they didn't affect me/LMU I forgot to write them down :-(
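For anyone who wants to repeat this kind of comparison, here is a minimal sketch of diffing two defaults files by key instead of by line. It assumes a simplified `key = value` syntax with `#` comments and treats everything before the first `=` as the key; line continuations are not handled. The function names `load_properties` and `diff_defaults` are hypothetical, and the sample values are made up for illustration, not the real dCache defaults.

```python
def load_properties(text):
    """Parse simplified properties text ("key = value" lines) into a dict."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def diff_defaults(old_text, new_text):
    """Return (added, removed, changed) key lists between two defaults texts."""
    old, new = load_properties(old_text), load_properties(new_text)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return added, removed, changed


# Made-up sample values, only to show the shape of the result.
OLD_SAMPLE = "billing.text.format.door-request-info-message = $date$ $client$\n"
NEW_SAMPLE = (
    "billing.text.format.door-request-info-message = $date$ $clientChain$\n"
    "webdav.enable.owncloud = true\n"
)

added, removed, changed = diff_defaults(OLD_SAMPLE, NEW_SAMPLE)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

Comparing by key rather than by line keeps reordered or re-commented files from showing up as noise, which is the part that matters when checking against release notes.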

Cheers, Chris.

gbehrmann commented 8 years ago

I'm actually in favour of changing the default when you add new fields; I also think you should have added p2p to the default when that got introduced. If sites really want to stick with the old format, then they should have to change their settings. But it should of course only happen at a new major version and(!) be documented.

Once the indexer supports multiple generations of formats, we can start to modify the defaults. Until then it is important to maintain the current format, as otherwise the indexer breaks.

I wasn't aware of the above change (I knew about the feature to look at the forward header, but I hadn't noticed the patch changed the billing format) and this is clearly a policy violation. In this particular case the indexer however doesn't break as the old format is a "subset" of the new and thus old entries can still be parsed. I will ask for the release notes to be updated to list the incompatibility.

There are also webdav.enable.owncloud and webdav.owncloud.door, which do not really seem to be documented at all (neither in the release notes nor in the defaults file).

So in one ticket you complain about new code not being enabled by default and in another ticket you complain about new code being enabled by default?

It isn't documented yet because the Owncloud sync support isn't complete - upload doesn't work for large files yet. Should the change have waited until support is complete? Maybe. But the authors of this change use this feature and they add significant funding to the development of dCache.

calestyo commented 8 years ago

Once the indexer supports multiple generations of formats, we can start to modify the defaults. Until then it is important to maintain the current format, as otherwise the indexer breaks.

Would it be possible to make the indexer work generically, based on the configured format string? For example, I have massively changed the format strings to make parsing/grepping/etc. with shell tools easier (in the sense of: closer to how I'm used to doing it ^^), to this:

billing.text.format.door-request-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$drMsg:$\\t$[$cellType$:$cellName$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$$$$client$$$$\\t$$$$transactionTime$ms$\\t$[$transferPath$]$\\t$[$rc$:"$message$"]
billing.text.format.pool-hit-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$phMsg:$\\t$[$cellType$:$cellName$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$[$protocol$]$\\t$$$$if(cached)$cached$else$not-cached$endif$$$$\\t$[$transferPath$]$\\t$[$rc$:"$message$"]
billing.text.format.storage-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$sMsg:$\\t$[$cellType$:$cellName$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$$$$transferTime$ms$\\t$[$rc$:"$message$"]
billing.text.format.mover-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$mMsg:$\\t$[$cellType$:$cellName$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$[$protocol$]$\\t$[$initiator$]$\\t$$$$if(p2p)$p2p$else$no-p2p$endif$$$$\\t$$$$if(created)$upload$else$download$endif$$$$\\t$$$$transferred$B$\\t$$$$connectionTime$ms$\\t$[$transferPath$]$\\t$[$rc$:"$message$"]
billing.text.format.remove-file-info-message=$date; format="${lmu.miscellaneous.date-time-format}"$$$$\\t$rfMsg:$\\t$[$cellType$:$cellName$:$type$]$\\t$[$session$]$\\t$[$pnfsid$:$path$]$\\t$$$$filesize$B$\\t$[$if(storage)$$$$storage.storageClass$@$storage.hsm$$$$else$<unknown>$endif$]$\\t$[$subject.loginName$]$\\t$[$subject.dn$]$\\t$[[$subject.primaryFqan$]:[$subject.fqans; separator="|"$]]$\\t$[$subject.userName$]$\\t$[$subject.uid$]$\\t$[$subject.primaryGid$:$subject.gids; separator="|"$]$\\t$$$$queuingTime$ms$\\t$[$rc$:"$message$"]

I wasn't aware of the above change (I knew about the feature to look at the forward header, but I hadn't noticed the patch changed the billing format) and this is clearly a policy violation. In this particular case the indexer however doesn't break as the old format is a "subset" of the new and thus old entries can still be parsed. I will ask for the release notes to be updated to list the incompatibility.

Please make sure that these notes are added to the next new release (as well)... because people who have already updated likely won't pick up any changes to the release notes of past major or minor versions. :-)

So in one ticket you complain about new code not being enabled by default and in another ticket you complain about new code being enabled by default?

Well, I think there's a difference between the two cases... the gplazma case seems, AFAIU, to add new functionality, which is most likely useful to everyone. The owncloud thingy just adds functionality for those combining dcache with owncloud.

It isn't documented yet because the Owncloud sync support isn't complete - upload doesn't work for large files yet. Should the change have waited until support is complete? Maybe. But the authors of this change use this feature and they add significant funding to the development of dCache.

I don't think it's a problem to add a not yet fully finished feature... the "problem" (not that it would be a really big issue ^^) is that it's likely not used by the majority, so why enable it? We e.g. also don't enable XACML and other niche plugins by default in gplazma.conf.

Cheers.

gbehrmann commented 8 years ago

The indexer is already generic - it is based upon the format string (to the extent that you made the format strings sufficiently distinguishable to allow the parser to reverse the serialization).

What the indexer does not yet support is billing files in which the format changed at some point in time - i.e. where old billing files have one format and new billing files a different format (don't bother, we know how to implement that).
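To illustrate what "reversing the serialization" from a format string can look like, here is a minimal sketch. It is not the dCache indexer: it assumes a heavily simplified template syntax in which only plain `$name$` placeholders occur and everything else is literal text, so conditionals like `$if(...)$` and options like `; format=...` are not handled. The function name `template_to_regex` is hypothetical.

```python
import re

# Simplified placeholder syntax: $name$ or $dotted.name$ (an assumption).
PLACEHOLDER = re.compile(r"\$([A-Za-z][A-Za-z0-9.]*)\$")


def template_to_regex(template):
    """Build a regex that parses lines produced from the given template.

    Each placeholder becomes a non-greedy named group; literal text in
    between is matched verbatim. A placeholder that occurs twice would
    produce a duplicate group name and raise re.error.
    """
    parts = []
    pos = 0
    for m in PLACEHOLDER.finditer(template):
        parts.append(re.escape(template[pos:m.start()]))
        # Group names may not contain dots, so map them to underscores.
        group = m.group(1).replace(".", "_")
        parts.append(f"(?P<{group}>.*?)")
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(parts) + "$")


# Made-up template and line, only to show the round trip.
pattern = template_to_regex("[$cellType$:$cellName$] [$pnfsid$] $filesize$B")
match = pattern.match("[door:WebDAV-host] [0000ABCD] 1024B")
print(match.groupdict())
```

The "sufficiently distinguishable" condition shows up directly here: two placeholders with no literal text between them give the regex nothing to split on, so the field boundaries cannot be recovered reliably.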

Please make sure that these notes are added to the next new release (as well)... because people who have already updated likely won't pick up any changes to the release notes of past major or minor versions.

Which at this point in time is a very very limited number of people.

The owncloud thingy just adds functionality for those combining dcache with owncloud.

That's new functionality to support the Owncloud sync client - it has nothing to do with Owncloud the server product. As for changing the default to disabled, you are welcome to create a github issue for it. The present github issue lists several complaints, so it is difficult to assign this sub-issue to anybody specific.

calestyo commented 8 years ago

As for the indexer: ah, I see... And will that then also work for index files where the format string is no longer known, because it has changed in the past?

gbehrmann commented 8 years ago

That was the intention, yes. The idea is to log the format strings as # comment lines to the billing files (once at the top of each file and once per restart). Thus the format becomes self-describing.
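A rough sketch of how a reader could consume such self-describing files is below. The header syntax is purely an assumption for illustration (`# billing.text.format.<message-type>=<format string>`), since the idea above does not fix one, and the function name `annotate_with_formats` is hypothetical. The sketch only pairs each entry with the formats in effect when it was written; parsing the entry is left out.

```python
def annotate_with_formats(lines, prefix="# billing.text.format."):
    """Yield (formats, entry) pairs for every billing entry in `lines`.

    `formats` maps message type to the most recent format string seen in
    a header comment before the entry. The header syntax is an assumed,
    hypothetical one.
    """
    formats = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(prefix):
            key, sep, template = line[len(prefix):].partition("=")
            if sep:
                # Build a new dict so previously yielded snapshots stay intact.
                formats = {**formats, key: template}
            continue
        if not line or line.startswith("#"):
            continue  # blank line or unrelated comment
        yield formats, line


# Made-up file content: the format changes after a restart.
SAMPLE = [
    "# billing.text.format.mover-info-message=$date$ $pnfsid$",
    "2016-01-01 0000A",
    "# billing.text.format.mover-info-message=$date$ $pnfsid$ $p2p$",
    "2016-01-02 0000B true",
]

for formats, entry in annotate_with_formats(SAMPLE):
    print(formats["mover-info-message"], "->", entry)
```

Because headers are honoured wherever they appear, a header written once per restart in the middle of a file switches the format for the entries that follow it, which is the multiple-generations case discussed above.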

The only issue is what to do with all the existing billing files.