Open ingridbr opened 3 years ago
That is surprising - we'll look into reproducing it here.
As a workaround in the meantime, you can run iadmin modresc default rebalance
to heal any missing/stale replicas.
I thought streaming (open/write/close) was the culprit, but istream
also appears to work as expected against a similar tree in 4.2.8 with pt
defined as the default resource:
$ ienv | grep version
irods_version - 4.2.8
$ ilsresc
demoResc:unixfilesystem
pt:passthru
└── r1:random
    ├── repl1:replication
    │   ├── ufs1:unixfilesystem
    │   └── ufs11:unixfilesystem
    └── repl2:replication
        ├── ufs2:unixfilesystem
        └── ufs22:unixfilesystem
$ iput rules.ninja withput
$ echo "perhaps" | istream write withstream
$ ils -l
/tempZone/home/rods:
rods 0 pt;r1;repl2;ufs22 51592 2020-12-09.21:10 & withput
rods 1 pt;r1;repl2;ufs2 51592 2020-12-09.21:10 & withput
rods 0 pt;r1;repl1;ufs11 8 2020-12-09.21:10 & withstream
rods 1 pt;r1;repl1;ufs1 8 2020-12-09.21:10 & withstream
This is likely caused by the openType not being set in the dataObjInp when calling rsDataObjOpen. fileModified is only triggered when the openType is set to CREATE_TYPE or OPEN_FOR_WRITE_TYPE:
The regParam passed to this API is constructed using the keywords found in the L1 descriptor, which is populated in open.
See replica_close API plugin:
...and rsDataObjClose API:
https://github.com/irods/irods/blob/9eb6c23df45cdedff8ee9c3af71a65d304635037/server/api/src/rsDataObjClose.cpp#L306
https://github.com/irods/irods/blob/9eb6c23df45cdedff8ee9c3af71a65d304635037/server/api/src/rsDataObjClose.cpp#L587
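To make the suspected cause concrete, here is a toy model of the gating described above. It is only a sketch and not iRODS code: the enum, its values, the UNSET member, and the function name are all hypothetical. Only the names CREATE_TYPE and OPEN_FOR_WRITE_TYPE come from the comment above.

```python
from enum import Enum, auto


class OpenType(Enum):
    """Illustrative stand-in for the openType values; numeric values are not the real ones."""
    UNSET = auto()                # hypothetical: the client never set openType
    CREATE_TYPE = auto()
    OPEN_FOR_WRITE_TYPE = auto()


# Per the comment above, only these two open types lead to fileModified.
WRITE_OPEN_TYPES = frozenset({OpenType.CREATE_TYPE, OpenType.OPEN_FOR_WRITE_TYPE})


def triggers_file_modified(open_type: OpenType) -> bool:
    """Return True if a close with this openType would trigger fileModified (toy model)."""
    return open_type in WRITE_OPEN_TYPES


for ot in OpenType:
    print(f"{ot.name}: fileModified triggered = {triggers_file_modified(ot)}")
```

If the openType is left unset, fileModified is never reached in this model, which would be consistent with the replication resource not being asked to create or update the sibling replica.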
The latest NFSRODS commits do not fix this issue (NFSRODS: 77b54c9, iRODS: https://github.com/irods/irods/commit/9c57ce9).
However, the results show that additional replicas are created. The good replica has the correct size while the stale replica does not.
$ ilsresc
demoResc:unixfilesystem
pt:passthru
└── repl:replication
    ├── ufs0:unixfilesystem
    └── ufs1:unixfilesystem
$ cp <file> /mnt/nfsrods/foo # NFSRODS is configured to target the "pt" resource.
$ ils -l
/tempZone/home/kory:
kory 0 pt;repl;ufs0 2001391 2022-01-26.13:55 & foo
kory 1 pt;repl;ufs1 0 2022-01-26.13:55 X foo
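When verifying this with various file sizes, it may help to check the listing mechanically. The following is a minimal sketch that parses ils -l text and flags replicas that are not marked good or whose good replicas disagree on size. It assumes the column layout shown in the listings in this thread (owner, replica number, hierarchy, size, timestamp, optional status, name) and that "&" means good and "X" means stale. The function and class names are made up for this example.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class Replica(NamedTuple):
    owner: str
    number: int
    hierarchy: str
    size: int
    status: str  # "&" (good), "X" (stale), or "" if no marker was printed
    name: str


# Assumed layout: owner replnum hierarchy size YYYY-MM-DD.HH:MM [status] name
_LINE = re.compile(
    r"^\s*(?P<owner>\S+)\s+(?P<number>\d+)\s+(?P<hier>\S+)\s+(?P<size>\d+)\s+"
    r"\d{4}-\d{2}-\d{2}\.\d{2}:\d{2}\s+(?:(?P<status>[&X])\s+)?(?P<name>.+?)\s*$"
)


def parse_ils_long(text):
    """Parse `ils -l` output into Replica tuples, skipping lines that do not match."""
    replicas = []
    for line in text.splitlines():
        m = _LINE.match(line)
        if m is None:
            continue  # e.g. the "/tempZone/home/kory:" collection header
        replicas.append(Replica(
            owner=m.group("owner"),
            number=int(m.group("number")),
            hierarchy=m.group("hier"),
            size=int(m.group("size")),
            status=m.group("status") or "",
            name=m.group("name"),
        ))
    return replicas


def suspect_replicas(replicas):
    """Return replicas that are not good, plus all replicas of objects whose good sizes differ."""
    by_name = defaultdict(list)
    for r in replicas:
        by_name[r.name].append(r)
    suspects = []
    for group in by_name.values():
        good_sizes = {r.size for r in group if r.status == "&"}
        for r in group:
            if r.status != "&" or len(good_sizes) > 1:
                suspects.append(r)
    return suspects


SAMPLE = """\
/tempZone/home/kory:
  kory 0 pt;repl;ufs0 2001391 2022-01-26.13:55 & foo
  kory 1 pt;repl;ufs1 0 2022-01-26.13:55 X foo
"""

for r in suspect_replicas(parse_ils_long(SAMPLE)):
    print(f"{r.name}: replica {r.number} on {r.hierarchy} status={r.status!r} size={r.size}")
```

In practice the text would come from running ils -l against the collection behind the NFS mount after each cp. The parser is deliberately strict, so any line that does not fit the assumed layout is silently skipped.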
We're close, but this still needs some work.
Kory, if we can chat tomorrow, we can issue a patch.
sure thing.
I believe this is now handled in NFSRODS 2.1.0 due to Jargon 4.3.2.5-SNAPSHOT.
We'll need to verify that using various file sizes.
Confirmed PR #202 does not resolve this issue.
The file is uploaded correctly. The first replica has the correct size and is marked good. The physical size is correct too.
The second replica has a size of 0 in the catalog and is marked stale. The second replica's physical size is 0.
We are testing NFSRODS (v1.0.0) with an iRODS 4.2.8 cluster that has the following composable resource defined:
and we are running the NFSRODS server with this configuration:
With this setup, when we copy a file to the NFS mount point using the cp command, we see that only one of the replicas is created:
When doing an iput of the same file into the same directory, both replicas are created (as expected):
In both cases the same resource is used (default), but it seems that with NFSRODS the second copy is not made. We do not see any errors in the iRODS log.