hamar closed 12 years ago
What do you want? We are not asking SEs to do checksum checking on every transfer. Could you evaluate the overhead that this would mean and guarantee that it will not overload any of our SEs?
Did you check what all these files have in common? Since they are all concentrated at a single site, this likely points to a problem related to that site.
Please assign me to this. All development for this is in my branch (https://github.com/KrzysztofCiba/DIRAC/tree/DEV-DMS-checksums-check) and will be ready for testing soon.
This has been in production for a while, so it can be closed.
In production already
Hi,
At CC in Lyon, we have some tickets about corrupted files copied from WNs to SEs, like:
/lhcb/LHCb/Collision11/CHARMCOMPLETEEVENT.DST/00016764/0001/00016764_00012227_1.CharmCompleteEvent.dst
/lhcb/LHCb/Collision11/CHARMCOMPLETEEVENT.DST/00016764/0001/00016764_00010674_1.CharmCompleteEvent.dst
/lhcb/LHCb/Collision11/CHARMCOMPLETEEVENT.DST/00016764/0001/00016764_00010644_1.CharmCompleteEvent.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016764/0001/00016764_00013474_1.Dimuon.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016764/0001/00016764_00016701_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016764/0001/00016764_00018376_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016764/0001/00016764_00011409_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016764/0001/00016764_00014025_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016764/0002/00016764_00025515_1.Radiative.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016764/0001/00016764_00010704_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016764/0001/00016764_00012248_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016764/0001/00016764_00010622_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016764/0001/00016764_00010602_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016764/0001/00016764_00010653_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016764/0001/00016764_00010593_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016764/0001/00016764_00014037_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016764/0002/00016764_00029737_1.Semileptonic.dst
/lhcb/LHCb/Collision11/BHADRON.DST/00016773/0001/00016773_00016759_1.Bhadron.dst
/lhcb/LHCb/Collision11/BHADRON.DST/00016773/0002/00016773_00020476_1.Bhadron.dst
/lhcb/LHCb/Collision11/CHARM.MDST/00016773/0004/00016773_00040644_1.Charm.mdst
/lhcb/LHCb/Collision11/CHARM.MDST/00016773/0003/00016773_00035207_1.Charm.mdst
/lhcb/LHCb/Collision11/CHARMCOMPLETEEVENT.DST/00016773/0001/00016773_00012899_1.CharmCompleteEvent.dst
/lhcb/LHCb/Collision11/CHARMCOMPLETEEVENT.DST/00016773/0001/00016773_00018334_1.CharmCompleteEvent.dst
/lhcb/LHCb/Collision11/CHARMCOMPLETEEVENT.DST/00016773/0001/00016773_00016786_1.CharmCompleteEvent.dst
/lhcb/LHCb/Collision11/CHARMCOMPLETEEVENT.DST/00016773/0004/00016773_00041015_1.CharmCompleteEvent.dst
/lhcb/LHCb/Collision11/CHARMCOMPLETEEVENT.DST/00016773/0003/00016773_00033543_1.CharmCompleteEvent.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016773/0001/00016773_00016844_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016773/0001/00016773_00016823_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016773/0001/00016773_00016718_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016773/0003/00016773_00035584_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016773/0003/00016773_00033633_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016773/0003/00016773_00035094_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016773/0004/00016773_00041993_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016773/0003/00016773_00039610_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016773/0003/00016773_00032521_1.Dimuon.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016773/0001/00016773_00016864_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016773/0001/00016773_00016836_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016773/0003/00016773_00035891_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016773/0003/00016773_00035158_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016773/0003/00016773_00035101_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016773/0003/00016773_00035749_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016773/0004/00016773_00042171_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016773/0003/00016773_00036531_1.Radiative.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016773/0001/00016773_00011716_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016773/0001/00016773_00016777_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016773/0002/00016773_00020826_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016773/0003/00016773_00036594_1.Semileptonic.dst
/lhcb/LHCb/Collision11/BHADRON.DST/00016992/0000/00016992_00002944_1.Bhadron.dst
/lhcb/LHCb/Collision11/BHADRON.DST/00016992/0000/00016992_00008252_1.Bhadron.dst
/lhcb/LHCb/Collision11/BHADRON.DST/00016992/0000/00016992_00006653_1.Bhadron.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016992/0000/00016992_00002953_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016992/0000/00016992_00003455_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016992/0000/00016992_00004044_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016992/0000/00016992_00005205_1.Dimuon.dst
/lhcb/LHCb/Collision11/DIMUON.DST/00016992/0000/00016992_00001059_1.Dimuon.dst
/lhcb/LHCb/Collision11/PID.MDST/00016992/0000/00016992_00008153_1.PID.mdst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016992/0000/00016992_00001222_1.Radiative.dst
/lhcb/LHCb/Collision11/RADIATIVE.DST/00016992/0000/00016992_00001838_1.Radiative.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016992/0000/00016992_00000817_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016992/0000/00016992_00000992_1.Semileptonic.dst
/lhcb/LHCb/Collision11/SEMILEPTONIC.DST/00016992/0000/00016992_00001214_1.Semileptonic.dst
I was looking into the SRM2Storage file and found:
if localSize == remoteSize:
  gLogger.debug( "SRM2Storage.__getFile: Post transfer check successful." )
errorMessage = "SRM2Storage.__getFile: Source and destination file sizes do not match."
no checksum :(
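For illustration, here is a minimal sketch of what a post-transfer check that also compares checksums could look like. It assumes Adler-32 as the checksum type and that the remote size and checksum are already known (for example from the catalogue or the SE). The names `adler32_of_file` and `post_transfer_check` are hypothetical and are not part of SRM2Storage.

```python
import os
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Return the Adler-32 checksum of a file as an 8-digit lowercase hex string."""
    checksum = 1  # Adler-32 starting value
    with open(path, "rb") as handle:
        # Read in chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            checksum = zlib.adler32(chunk, checksum)
    return "%08x" % (checksum & 0xFFFFFFFF)


def post_transfer_check(local_path, remote_size, remote_checksum):
    """Hypothetical helper: compare sizes first, then checksums.

    Returns a (success, message) tuple.
    """
    local_size = os.path.getsize(local_path)
    if local_size != remote_size:
        return False, "Source and destination file sizes do not match."
    # Normalise the remote value (assumed to be a hex string) before comparing.
    if adler32_of_file(local_path) != remote_checksum.lower().zfill(8):
        return False, "Source and destination checksums do not match."
    return True, "Post transfer check successful."
```

The size comparison stays first because it is cheap; the checksum is only computed when the sizes already agree, which limits the extra read to files that would otherwise pass the check.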
The portal server that they are using is:
https://lhcb-web-dirac.cern.ch/DIRAC/LHCb-Production/lhcb_prod/jobs/PilotMonitor/display
Thanks in advance,
Vanessa