Closed brucellino closed 7 years ago
Now I can see a newer revision number (228)
[root@lcg1678 config.d]# cvmfs_config stat fastrepo.sagrid.ac.za
VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
2.3.2.0 3879612 1371 76312 228 5 1 73209597 81920001 0 65024 0 1 -200 89068 0 http://apprepo.sagrid.ac.za/cvmfs/fastrepo.sagrid.ac.za DIRECT 1
but with the same version file
[root@lcg1678 config.d]# cat /cvmfs/fastrepo.sagrid.ac.za/version
Build 217
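To compare the revision a client has mounted against what was published, the two-row `cvmfs_config stat` output above can be paired up by column. This is only a sketch: `parse_stat` is a hypothetical helper name, and it assumes the header and value rows are whitespace-separated with no spaces inside any value, which holds for the output quoted here.

```python
def parse_stat(output: str) -> dict:
    """Pair the header row of `cvmfs_config stat` with its value row.

    Assumes the last two non-empty lines are the header and the values,
    and that no value contains whitespace.
    """
    lines = [ln for ln in output.splitlines() if ln.strip()]
    header, values = lines[-2].split(), lines[-1].split()
    if len(header) != len(values):
        raise ValueError("header and value rows have different column counts")
    return dict(zip(header, values))


# Output copied from the client in this thread.
SAMPLE = """\
VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
2.3.2.0 3879612 1371 76312 228 5 1 73209597 81920001 0 65024 0 1 -200 89068 0 http://apprepo.sagrid.ac.za/cvmfs/fastrepo.sagrid.ac.za DIRECT 1
"""

stat = parse_stat(SAMPLE)
print(stat["REVISION"])  # prints 228
```

The resulting revision can then be compared with the `revision` attribute reported on the release manager machine.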
Nesting is not an issue, and the number of file descriptors is probably OK (just a warning). But why are you using the '-f' option when publishing?
Rather than re-creating the repo, you may want to ask the cvmfs-talk@cern.ch list for ideas.
Hi @CatalinCondurache thanks for the feedback. Yes, I've fixed both of those things (the nesting and the soft ulimit) and the issue remains.
I will bump this back to cvmfs-talk then :disappointed: Really very weird situation.
Hi @brucellino, hi @CatalinCondurache,
on the server, is there different content in /cvmfs/fastrepo.sagrid.ac.za/version? Can you send me the client configuration (server url, public key) so that I can have a look?
Cheers, Jakob
Hi @jblomer thanks for getting in touch.
There is indeed different content - it should be
This is FR3 Build 13
#174
The keys and configs are in the ansible role which configures the clients - it follows the usual directory structure.
I've also noticed something weird when using cvmfs_talk:
cvmfs_talk -i fastrepo.sagrid.ac.za pid
Seems like CernVM-FS is not running in /var/lib/cvmfs/shared (not found: /var/lib/cvmfs/shared/cvmfs_io.fastrepo.sagrid.ac.za)
[root@apprepo fastrepo.sagrid.ac.za]# locate cvmfs_io
/var/spool/cvmfs/devrepo.sagrid.ac.za/cache/devrepo.sagrid.ac.za/cvmfs_io.devrepo.sagrid.ac.za
/var/spool/cvmfs/fastrepo.sagrid.ac.za/cache/fastrepo.sagrid.ac.za/cvmfs_io.fastrepo.sagrid.ac.za
Should there be things under /var/spool?
@brucellino Regarding cvmfs_talk, this output is expected on the release manager machine. The release manager machine runs the cvmfs client in a special mode. Among other things, it stores files under /var/spool/cvmfs/... instead of under /var/lib/cvmfs/... The cvmfs_talk utility needs to be explicitly pointed to this special instance (and it needs to run as root), like
sudo cvmfs_talk -p /var/spool/cvmfs/fastrepo.sagrid.ac.za/cache/fastrepo.sagrid.ac.za/cvmfs_io.fastrepo.sagrid.ac.za pid
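The socket path in that command follows the same layout as the `locate cvmfs_io` results earlier in the thread. A minimal sketch of building it for any repository, assuming that layout holds for every repository on the release manager machine; `release_manager_socket` and `talk_command` are hypothetical helper names, not part of CernVM-FS:

```python
from pathlib import PurePosixPath


def release_manager_socket(repo: str, spool: str = "/var/spool/cvmfs") -> str:
    """Socket path of the release manager's client instance for `repo`.

    Layout inferred from the paths quoted in this thread:
    <spool>/<repo>/cache/<repo>/cvmfs_io.<repo>
    """
    return str(PurePosixPath(spool) / repo / "cache" / repo / f"cvmfs_io.{repo}")


def talk_command(repo: str, command: str = "pid") -> list:
    """Argument list for running cvmfs_talk against that socket as root."""
    return ["sudo", "cvmfs_talk", "-p", release_manager_socket(repo), command]


print(" ".join(talk_command("fastrepo.sagrid.ac.za")))
```

The argument list can be passed to `subprocess.run` on the release manager machine.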
From the release manager machine, could you let me know the output of
cvmfs_server check
attr -g revision /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly
cat /var/spool/cvmfs/fastrepo.sagrid.ac.za/client.local
@jblomer - here is the info you asked for :
cvmfs_server check fastrepo.sagrid.ac.za
Verifying Catalog Integrity of fastrepo.sagrid.ac.za...
Inspecting log of references
[inspecting catalog] 2b4396605f42b8538279e661335743b8a87bed3b at /
no problems found
attr -g revision /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly
Attribute "revision" had a 3 byte value for /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly:
234
cat /var/spool/cvmfs/fastrepo.sagrid.ac.za/client.local
CVMFS_ROOT_HASH=2b4396605f42b8538279e661335743b8a87bed3b
I know I sound crazy, but these checks show everything is correct. I have the feeling that I'm making changes in the wrong place....
I have my content in /cvmfs/fastrepo.sagrid.ac.za
which I edit while the repo is in transaction.
Then I publish the changes and everything seems fine, but I get the issue described initially.
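The consistency check above (the catalog hash reported by `cvmfs_server check` for `/` against `CVMFS_ROOT_HASH` in client.local) can be automated roughly as follows. This is a sketch that assumes the output formats are exactly as quoted in this thread; the function names are hypothetical.

```python
import re


def catalog_root_hash(check_output: str):
    """Hash of the catalog that `cvmfs_server check` reports at '/'."""
    match = re.search(
        r"\[inspecting catalog\]\s+([0-9a-f]+)\s+at\s+/\s*$",
        check_output,
        re.MULTILINE,
    )
    return match.group(1) if match else None


def client_root_hash(client_local: str):
    """Value of CVMFS_ROOT_HASH in the text of client.local, if present."""
    for line in client_local.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "CVMFS_ROOT_HASH":
            return value
    return None


# Output copied from the release manager machine in this thread.
CHECK_OUTPUT = """\
Verifying Catalog Integrity of fastrepo.sagrid.ac.za...
Inspecting log of references
[inspecting catalog] 2b4396605f42b8538279e661335743b8a87bed3b at /
no problems found
"""
CLIENT_LOCAL = "CVMFS_ROOT_HASH=2b4396605f42b8538279e661335743b8a87bed3b\n"

print(catalog_root_hash(CHECK_OUTPUT) == client_root_hash(CLIENT_LOCAL))
```

A match only shows that the read-only mount is pinned to the latest published catalog; it says nothing about where the writable branch lives.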
Quick update : I've found that the repos have somewhat inconsistent permissions on the files contained within them. I'm fixing this quickly.
That's indeed baffling. So, if I understand correctly, on the release manager machine, we have
> cat /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly/version
Build 217
but
> cat /cvmfs/fastrepo.sagrid.ac.za/version
This is FR3 Build 13
#174
Perhaps something is wrong with the mount tree. The /cvmfs/fastrepo.sagrid.ac.za mount point is supposed to be composed from the /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly (read-only) and /var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch/current directories. This we can check with
cat /proc/mounts
Otherwise, does the release manager machine use some sort of Docker setup?
yes, and yes - but I haven't excluded the permissions error that is still being fixed.
the mounts are :
/dev/fuse /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly fuse ro,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
aufs_fastrepo.sagrid.ac.za /cvmfs/fastrepo.sagrid.ac.za aufs rw,relatime,si=1b137f932e04587,udba=none 0 0
No docker is used on the release machine, no.
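The two mount entries above can also be checked mechanically. The sketch below looks for the union mount at /cvmfs/<repo> and the read-only mount at /var/spool/cvmfs/<repo>/rdonly in the text of /proc/mounts; `repo_mounts` is a hypothetical helper, and the path layout is assumed to be the one described in this thread. Note that the aufs line in /proc/mounts does not list its branches, so this confirms only that both mounts exist, not which writable directory is in use.

```python
def repo_mounts(proc_mounts: str, repo: str) -> dict:
    """Return the mount entries for a repository's union and rdonly mounts.

    Each /proc/mounts line has the form:
    device mountpoint fstype options dump pass
    """
    wanted = {f"/cvmfs/{repo}", f"/var/spool/cvmfs/{repo}/rdonly"}
    found = {}
    for line in proc_mounts.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] in wanted:
            found[fields[1]] = {
                "device": fields[0],
                "fstype": fields[2],
                "options": fields[3].split(","),
            }
    return found


# Lines copied from the release manager machine in this thread.
PROC_MOUNTS = """\
/dev/fuse /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly fuse ro,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
aufs_fastrepo.sagrid.ac.za /cvmfs/fastrepo.sagrid.ac.za aufs rw,relatime,si=1b137f932e04587,udba=none 0 0
"""

for mountpoint, entry in repo_mounts(PROC_MOUNTS, "fastrepo.sagrid.ac.za").items():
    print(mountpoint, entry["fstype"], ",".join(entry["options"]))
```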
Was the release manager machine recently upgraded? The writable path changed from 2.2 to 2.3, so perhaps all the changes to the repo sit uncommitted in the wrong writable path. The changes are described in the release notes, points 3 - 6. Normally, the migration should have been taken care of by cvmfs_server migrate.
The old writable branch was /var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch, the new one is /var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch/current. This should be reflected by the /etc/fstab entry for /cvmfs/fastrepo.sagrid.ac.za.
Yes, the machine was upgraded and I did perform a migration... but it doesn't seem to have shown up in /etc/fstab:
cvmfs2#fastrepo.sagrid.ac.za /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly fuse allow_other,config=/etc/cvmfs/repositories.d/fastrepo.sagrid.ac.za/client.conf:/var/spool/cvmfs/fastrepo.sagrid.ac.za/client.local,cvmfs_suid,noauto 0 0 # added by CernVM-FS for fastrepo.sagrid.ac.za
aufs_fastrepo.sagrid.ac.za /cvmfs/fastrepo.sagrid.ac.za aufs br=/var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch=rw:/var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly=rr,udba=none,ro,noauto 0 0 # added by CernVM-FS for fastrepo.sagrid.ac.za
which still shows /var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch. When the chown finishes in a while, I'll re-do the migration.
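The fstab check discussed above can be sketched as a small parser for the `br=` option of the aufs entry, which lists the branches as `path=mode` pairs separated by colons. The helper names are hypothetical, and the test for a migrated repository (writable branch ending in /scratch/current) is taken from the description in this thread rather than from the CernVM-FS sources.

```python
def aufs_branches(fstab_line: str) -> dict:
    """Map branch mode (e.g. 'rw', 'rr') to path for an aufs fstab entry."""
    fields = fstab_line.split()
    if len(fields) < 4:
        return {}
    for option in fields[3].split(","):
        if option.startswith("br="):
            branches = {}
            for branch in option[len("br="):].split(":"):
                path, _, mode = branch.rpartition("=")
                branches[mode] = path
            return branches
    return {}


def uses_new_scratch(fstab_line: str) -> bool:
    """True if the writable branch is the post-2.3 .../scratch/current path."""
    return aufs_branches(fstab_line).get("rw", "").endswith("/scratch/current")


# Entry copied from /etc/fstab in this thread.
FSTAB_LINE = (
    "aufs_fastrepo.sagrid.ac.za /cvmfs/fastrepo.sagrid.ac.za aufs "
    "br=/var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch=rw:"
    "/var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly=rr,udba=none,ro,noauto 0 0 "
    "# added by CernVM-FS for fastrepo.sagrid.ac.za"
)

print(aufs_branches(FSTAB_LINE)["rw"])
print(uses_new_scratch(FSTAB_LINE))
```

For the entry quoted in this thread the writable branch is still the old .../scratch directory, so the check reports that the migration has not reached /etc/fstab.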
So, @jblomer some notes.
The "old" repo (fastrepo.sagrid.ac.za) seems to have been migrated, but it still shows inconsistent behaviour between the rdonly and the writable branch.
I created a new repo (code-rade.africa-grid.org) which works fine. It also respects the naming convention that @CatalinCondurache and I discussed a while back. So... I would like to understand what's going on with fastrepo, but since the contents are built by our CI service, I don't mind breaking it and publishing a new repo with a correct FQRN. I'm going to close this for now; we can decide whether to re-open later.
New repository transactions are not being propagated to the clients (personal machine/laptop/WN).
The transaction/publish loop seems to be working fine -
there are no errors apart from
in syslog :