dCache / dcache

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogenous server nodes, under a single virtual filesystem tree with a variety of standard access methods
https://dcache.org

Attribute is not defined: STORAGECLASS #7146

Open cfgamboa opened 1 year ago

cfgamboa commented 1 year ago

Hello

Alarms were raised on doors dcdoor13 and dcdoor15 for high space and CPU usage.

@vgaronne collected the following information:

04 May 2023 06:17:07 (WebDAV2-dcdoor13-externalipv6) [] Failed to fetch information for progress marker: failed to query pool dcdoor20_1: Failed to deliver String message <5940928842082369868:-9083128157705912248> to [>dcdoor20_1@dcdoor20oneDomain]: l-AAX295SbNWA-<unknown>-AAX295Ssrgg@core-chimera03Domain is busy (its estimated response time of 2265 ms is longer than the message TTL of 2250 ms).
04 May 2023 06:17:07 (WebDAV2-dcdoor13-externalipv6) [] Failed to fetch information for progress marker: failed to query pool dcdoor20_1: Failed to deliver String message <-5468401769476308638:-4857535659056344874> to [>dcdoor20_1@dcdoor20oneDomain]: l-AAX295SbNWA-<unknown>-AAX295Ssrgg@core-chimera03Domain is busy (its estimated response time of 2290 ms is longer than the message TTL of 2250 ms).
04 May 2023 03:30:48 (WebDAV2-dcdoor13-internal) [door:WebDAV2-dcdoor13-internal@webdav2-dcdoor13_httpsDomain:AAX62SYlyPg] Internal server error: org.dcache.webdav.WebDavException: Request to [>SpaceManager@local:dc214_6@dc214sixDomain] timed out.
04 May 2023 07:51:45 (WebDAV2-dcdoor14-externalipv6) [] Failed to fetch information for progress marker: failed to query pool dcdoor11_1: Request to [>dcdoor11_1@dcdoor11oneDomain] timed out.
04 May 2023 07:51:47 (WebDAV2-dcdoor14-external) [] Failed to fetch information for progress marker: failed to query pool dcdoor11_1: Request to [>dcdoor11_1@dcdoor11oneDomain] timed out.
04 May 2023 07:51:47 (WebDAV2-dcdoor14-externalipv6) [] Failed to fetch information for progress marker: failed to query pool dcdoor11_1: Request to [>dcdoor11_1@dcdoor11oneDomain] timed out.
04 May 2023 07:51:49 (WebDAV2-dcdoor14-external) [] Failed to fetch information for progress marker: failed to query pool dcdoor11_1: Request to [>dcdoor11_1@dcdoor11oneDomain] timed out.
04 May 2023 07:51:49 (WebDAV2-dcdoor14-externalipv6) [] Failed to fetch information for progress marker: failed to query pool dcdoor11_1: Request to [>dcdoor11_1@dcdoor11oneDomain] timed out.
04 May 2023 07:51:51 (WebDAV2-dcdoor14-external) [] Failed to fetch information for progress marker: failed to query pool dcdoor11_1: Request to [>dcdoor11_1@dcdoor11oneDomain] timed out.
04 May 2023 07:51:52 (WebDAV2-dcdoor14-externalipv6) [] Failed to fetch information for progress marker: failed to query pool dcdoor11_1: Request to [>dcdoor11_1@dcdoor11oneDomain] timed out.
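The door messages above fall into two kinds: deliveries rejected because a cell on the route is busy (estimated response time above the message TTL) and plain request timeouts. A minimal Python sketch for tallying them per target pool follows. The regular expression is an assumption derived only from the lines quoted here, and the sample lines are abbreviated copies of them.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpts in this thread (an assumption);
# other dCache versions may format these lines differently.
FAILED_QUERY = re.compile(
    r"\((?P<cell>[^)]+)\) \[\] Failed to fetch information for progress marker: "
    r"failed to query pool (?P<pool>\S+): (?P<reason>.*)$"
)


def classify(reason):
    """Map the free-text reason to a coarse failure kind."""
    if "is busy" in reason:
        return "busy"
    if "timed out" in reason:
        return "timeout"
    return "other"


def tally(lines):
    """Count failed pool queries per (pool, failure kind)."""
    counts = Counter()
    for line in lines:
        match = FAILED_QUERY.search(line)
        if match:
            counts[(match.group("pool"), classify(match.group("reason")))] += 1
    return counts


# Abbreviated copies of lines from this thread (message IDs shortened).
sample = [
    "04 May 2023 06:17:07 (WebDAV2-dcdoor13-externalipv6) [] Failed to fetch "
    "information for progress marker: failed to query pool dcdoor20_1: Failed to "
    "deliver String message <1:2> to [>dcdoor20_1@dcdoor20oneDomain]: "
    "l-X@core-chimera03Domain is busy (its estimated response time of 2265 ms is "
    "longer than the message TTL of 2250 ms).",
    "04 May 2023 07:51:45 (WebDAV2-dcdoor14-externalipv6) [] Failed to fetch "
    "information for progress marker: failed to query pool dcdoor11_1: Request to "
    "[>dcdoor11_1@dcdoor11oneDomain] timed out.",
    "04 May 2023 07:51:47 (WebDAV2-dcdoor14-external) [] Failed to fetch "
    "information for progress marker: failed to query pool dcdoor11_1: Request to "
    "[>dcdoor11_1@dcdoor11oneDomain] timed out.",
]

if __name__ == "__main__":
    for (pool, kind), n in sorted(tally(sample).items()):
        print(pool, kind, n)
```

Grouping by pool makes it easier to see whether the failures concentrate on a few DMZ pools or are spread across the system.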

On the DMZ pool

04 May 2023 05:22:00 (dcdoor20_1) [door:WebDAV2-dcdoor06-externalipv6@webdav2-dcdoor06_httpsDomain:AAX62rGH5Og RemoteTransferManager PoolDeliverFile 00005D30B210EA9142DEBE2394520310A007] Unexpected failure during state change notification
java.lang.IllegalStateException: Attribute is not defined: STORAGECLASS
    at org.dcache.vehicles.FileAttributes.guard(FileAttributes.java:314)
    at org.dcache.vehicles.FileAttributes.getStorageClass(FileAttributes.java:627)
    at org.dcache.pool.migration.StorageClassFilter.test(StorageClassFilter.java:19)
    at org.dcache.pool.migration.StorageClassFilter.test(StorageClassFilter.java:9)
    at java.base/java.util.function.Predicate.lambda$and$0(Predicate.java:69)
    at java.base/java.util.function.Predicate.lambda$and$0(Predicate.java:69)
    at org.dcache.pool.migration.Job.accept(Job.java:606)
    at org.dcache.pool.migration.Job.stateChanged(Job.java:655)
    at org.dcache.pool.repository.v5.StateChangeListeners.lambda$stateChanged$0(StateChangeListeners.java:60)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
[root@dcdoor01 ~]# grep 'Attribute is not defined: STORAGECLASS'  /var/log/dcache/dcdoor01oneDomain.log |wc
    191    1146   13752
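The first column of the `wc` output is the line count, so the exception text appears 191 times in that domain log. To see which cells produce it, the count can be grouped by the cell name on the preceding timestamped line. This is a sketch only: the header pattern is an assumption based on the log lines quoted in this thread, and the sample lines are abbreviated.

```python
import re
from collections import Counter

# Timestamped header line, e.g. "04 May 2023 05:22:00 (dcdoor20_1) [...] ...".
# The layout is assumed from the excerpts in this thread.
HEADER = re.compile(r"^\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \((?P<cell>[^)]+)\) ")
EXCEPTION = "Attribute is not defined: STORAGECLASS"


def count_by_cell(lines):
    """Attribute each exception line to the cell of the last header line seen."""
    counts = Counter()
    current_cell = None
    for line in lines:
        match = HEADER.match(line)
        if match:
            current_cell = match.group("cell")
        elif EXCEPTION in line and current_cell is not None:
            counts[current_cell] += 1
    return counts


def count_in_file(path):
    """Convenience wrapper for a log file on disk."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_by_cell(handle)


# Abbreviated lines from this thread, used only to exercise the function.
sample = [
    "04 May 2023 05:22:00 (dcdoor20_1) [door:... PoolDeliverFile ...] "
    "Unexpected failure during state change notification",
    "java.lang.IllegalStateException: Attribute is not defined: STORAGECLASS",
    "    at org.dcache.vehicles.FileAttributes.guard(FileAttributes.java:314)",
    "01 Oct 2023 04:54:02 (dcdoor26_1) [door:... PoolDeliverFile ...] "
    "Unexpected failure during state change notification",
    "java.lang.IllegalStateException: Attribute is not defined: STORAGECLASS",
]

if __name__ == "__main__":
    for cell, n in count_by_cell(sample).most_common():
        print(cell, n)
```

Against the real log, `count_in_file("/var/log/dcache/dcdoor01oneDomain.log")` would give the same total as the `grep | wc` above, split per cell.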

Thank you.

Carlos

DmitryLitvintsev commented 1 year ago

Let's concentrate on the exception. Does it occur in the course of normal running or during migration?

cfgamboa commented 1 year ago

Hello,

On the DMZ pools, migration is part of normal operations: a migration is triggered every time a new file is written to the system.

All the best, Carlos

lemora commented 1 year ago

Hi Carlos. Do you still observe these exceptions regularly? Which dCache version are you running currently?

cfgamboa commented 1 year ago

Hi Lea,

We are on 8.2.30 and still observe the exception:

01 Oct 2023 04:54:02 (dcdoor26_1) [door:WebDAV2-dcdoor24-externalipv6@webdav2-dcdoor24_httpsDomain:AAYGo75Kjmg dc212_5 PoolDeliverFile 00000C8739EAEE4C4E7D882624EF8CE85990] Unexpected failure during state change notification
java.lang.IllegalStateException: Attribute is not defined: STORAGECLASS

lemora commented 1 year ago

Hi Carlos. Could you please provide your migration job configuration? Also, are there different storage classes configured in source and target?

cfgamboa commented 1 year ago

Hello Lea,

migration move -storage=cgamboa:USERS -permanent -eager -replicas=1 -target=pgroup PRIMEDISKONLY

Each storage class has its own migration job.
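For readers following the stack traces: the exception is raised in `FileAttributes.guard` when `StorageClassFilter.test` asks for a storage class that the attribute set does not carry. The toy Python model below illustrates that pattern and one possible defensive variant. Every class and function name in it is hypothetical; it is not dCache code and not a confirmed cause or fix.

```python
class FileAttributesModel:
    """Toy stand-in for an attribute container; NOT dCache's FileAttributes."""

    def __init__(self, **defined):
        self._defined = dict(defined)

    def is_defined(self, name):
        return name in self._defined

    def get(self, name):
        # Mirrors the guard idea: reading an undefined attribute is an error.
        if name not in self._defined:
            raise RuntimeError(f"Attribute is not defined: {name}")
        return self._defined[name]


def strict_filter(wanted):
    """Hypothetical filter that reads the attribute unconditionally."""
    return lambda attrs: attrs.get("STORAGECLASS") == wanted


def tolerant_filter(wanted):
    """Hypothetical variant: treat a missing attribute as 'no match'."""
    return lambda attrs: (
        attrs.is_defined("STORAGECLASS") and attrs.get("STORAGECLASS") == wanted
    )


if __name__ == "__main__":
    with_class = FileAttributesModel(STORAGECLASS="cgamboa:USERS")
    without_class = FileAttributesModel()

    print(strict_filter("cgamboa:USERS")(with_class))
    print(tolerant_filter("cgamboa:USERS")(without_class))
    try:
        strict_filter("cgamboa:USERS")(without_class)
    except RuntimeError as err:
        print(err)
```

Whether skipping such entries silently is the right behaviour for a permanent migration job is a separate question; the model only shows why an unconditional read fails.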

All the best, Carlos

cfgamboa commented 9 months ago

Hello,

I see this issue again on 9.2.6.

Door log

08 Feb 2024 09:15:11 (dcdoor30_1) [door:WebDAV2-dcdoor33-externalipv6@webdav2-dcdoor33_httpsDomain:AAYQ32WnYuA dc246_7 PoolDeliverFile 0000A41D2EA94A984F59B816EB05D8F7FA85] Unexpected failure during state change notification
java.lang.IllegalStateException: Attribute is not defined: STORAGECLASS
    at org.dcache.vehicles.FileAttributes.guard(FileAttributes.java:335)
    at org.dcache.vehicles.FileAttributes.getStorageClass(FileAttributes.java:648)
    at org.dcache.pool.migration.StorageClassFilter.test(StorageClassFilter.java:19)
    at org.dcache.pool.migration.StorageClassFilter.test(StorageClassFilter.java:9)
    at java.base/java.util.function.Predicate.lambda$and$0(Predicate.java:69)
    at java.base/java.util.function.Predicate.lambda$and$0(Predicate.java:69)
    at org.dcache.pool.migration.Job.accept(Job.java:606)
    at org.dcache.pool.migration.Job.stateChanged(Job.java:655)
    at org.dcache.pool.repository.v5.StateChangeListeners.lambda$stateChanged$0(StateChangeListeners.java:60)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
08 Feb 2024 09:15:11 (dcdoor30_1) [door:WebDAV2-dcdoor33-externalipv6@webdav2-dcdoor33_httpsDomain:AAYQ32WnYuA dc246_7 PoolDeliverFile 0000A41D2EA94A984F59B816EB05D8F7FA85] Unexpected failure during state change notification
java.lang.IllegalStateException: Attribute is not defined: STORAGECLASS
    at org.dcache.vehicles.FileAttributes.guard(FileAttributes.java:335)
    at org.dcache.vehicles.FileAttributes.getStorageClass(FileAttributes.java:648)
    at org.dcache.pool.migration.StorageClassFilter.test(StorageClassFilter.java:19)
    at org.dcache.pool.migration.StorageClassFilter.test(StorageClassFilter.java:9)
    at java.base/java.util.function.Predicate.lambda$and$0(Predicate.java:69)
    at java.base/java.util.function.Predicate.lambda$and$0(Predicate.java:69)
    at org.dcache.pool.migration.Job.accept(Job.java:606)
    at org.dcache.pool.migration.Job.stateChanged(Job.java:655)
    at org.dcache.pool.repository.v5.StateChangeListeners.lambda$stateChanged$0(StateChangeListeners.java:60)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
08 Feb 2024 09:15:11 (dcdoor30_1) [door:WebDAV2-dcdoor33-externalipv6@webdav2-dcdoor33_httpsDomain:AAYQ32WnYuA dc246_7 PoolDeliverFile 0000A41D2EA94A984F59B816EB05D8F7FA85] Unexpected failure during state change notification
java.lang.IllegalStateException: Attribute is not defined: STORAGECLASS
    at org.dcache.vehicles.FileAttributes.guard(FileAttributes.java:335)
    at org.dcache.vehicles.FileAttributes.getStorageClass(FileAttributes.java:648)
    at org.dcache.pool.migration.StorageClassFilter.test(StorageClassFilter.java:19)
    at org.dcache.pool.migration.StorageClassFilter.test(StorageClassFilter.java:9)
    at java.base/java.util.function.Predicate.lambda$and$0(Predicate.java:69)
    at java.base/java.util.function.Predicate.lambda$and$0(Predicate.java:69)
    at org.dcache.pool.migration.Job.accept(Job.java:606)
    at org.dcache.pool.migration.Job.stateChanged(Job.java:655)
    at org.dcache.pool.repository.v5.StateChangeListeners.lambda$stateChanged$0(StateChangeListeners.java:60)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)

Billing log entry

02.08 09:15:10 [pool:dcdoor30_1:transfer] [0000A41D2EA94A984F59B816EB05D8F7FA85,0] [/pnfs/usatlas.bnl.gov/BNLT0D1/rucio/mc16_13TeV/12/27/AOD.37004481._001305.pool.root.1] bnlt0d1:BNLT0D1@osm 0 0 true {RemoteHttpsDataTransfer-1.1:https://lcgdpmse.dnp.fmph.uniba.sk:443/dpm/dnp.fmph.uniba.sk/home/atlas/atlasdatadisk/rucio/mc16_13TeV/12/27/AOD.37004481._001305.pool.root.1?copy_mode=pull} [door:RemoteTransferManager@dccore03Domain:1707401529498-56842] {10027:"rejected GET: 500 Server Error"}
02.08 09:15:10 [pool:dcdoor30_1@dcdoor30oneDomain:remove] [0000A41D2EA94A984F59B816EB05D8F7FA85,0] [Unknown] bnlt0d1:BNLT0D1@osm {0:"cleaner-disk@dccore03Domain [PoolRemoveFiles]"}
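In this report the billing entries show the transfer failing with code 10027 and the replica being removed at 09:15:10, one second before the exception is logged at 09:15:11 for the same PNFS ID. A small Python sketch for pulling the billing events of one PNFS ID is shown below, so the same correlation can be checked for other occurrences. The regular expression is an assumption based on the two billing lines quoted here, and the sample lines shorten the path and URL to placeholders.

```python
import re

# Billing line layout assumed from the two entries quoted in this thread.
BILLING = re.compile(
    r"^(?P<ts>\d{2}\.\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[pool:(?P<pool>[^:@\]]+)(?:@[^:\]]+)?:(?P<action>\w+)\] "
    r"\[(?P<pnfsid>[0-9A-F]+),\d+\] .*"
    r"\{(?P<code>\d+):\"(?P<msg>[^\"]*)\"\}$"
)


def events_for(pnfsid, lines):
    """Return (timestamp, action, code, message) for every entry of one file."""
    events = []
    for line in lines:
        match = BILLING.match(line.rstrip("\n"))
        if match and match.group("pnfsid") == pnfsid:
            events.append(
                (
                    match.group("ts"),
                    match.group("action"),
                    int(match.group("code")),
                    match.group("msg"),
                )
            )
    return events


PNFSID = "0000A41D2EA94A984F59B816EB05D8F7FA85"

# Abbreviated copies of the billing lines above; path and URL are placeholders.
sample = [
    '02.08 09:15:10 [pool:dcdoor30_1:transfer] [' + PNFSID + ',0] '
    '[/pnfs/placeholder/path] bnlt0d1:BNLT0D1@osm 0 0 true '
    '{RemoteHttpsDataTransfer-1.1:https://placeholder.invalid/file?copy_mode=pull} '
    '[door:RemoteTransferManager@dccore03Domain:1707401529498-56842] '
    '{10027:"rejected GET: 500 Server Error"}',
    '02.08 09:15:10 [pool:dcdoor30_1@dcdoor30oneDomain:remove] [' + PNFSID + ',0] '
    '[Unknown] bnlt0d1:BNLT0D1@osm '
    '{0:"cleaner-disk@dccore03Domain [PoolRemoveFiles]"}',
]

if __name__ == "__main__":
    for event in events_for(PNFSID, sample):
        print(event)
```

If other occurrences show the same failed-transfer-then-remove sequence, that would suggest the exception is tied to replicas of failed uploads, though this thread does not confirm it.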