AAROC / CODE-RADE

Website, documentation and such for the CODE-RADE project
http://www.africa-grid.org/CODE-RADE
Apache License 2.0

Issue with new repo publications #174

Closed brucellino closed 7 years ago

brucellino commented 7 years ago

New repository transactions are not being propagated to the clients (personal machine/laptop/WN).

The transaction/publish loop seems to be working fine -

cvmfs_server publish -a foundation-release-3-3 -m "Foundation Release 3" -f fastrepo.sagrid.ac.za
Processing changes...
Waiting for upload of files before committing...
Committing file catalogs...
WARNING: catalog at / has more than 500000 entries (562688). Please consider to split it into nested catalogs.
Exporting repository manifest
Tagging fastrepo.sagrid.ac.za
Flushing file system buffers
Signing new manifest
Remounting newly created repository revision
Published changeset for fastrepo.sagrid.ac.za
ok

There are no errors apart from

#012CernVM-FS is likely to run out of file descriptors, set ulimit -n to at least 8192

in syslog :

Feb  7 12:36:01 apprepo cvmfs_server: (fastrepo.sagrid.ac.za) opened transaction
Feb  7 12:36:19 apprepo cvmfs_server: (fastrepo.sagrid.ac.za) started publishing
Feb  7 12:36:31 apprepo cvmfs2: (fastrepo.sagrid.ac.za) CernVM-FS: unmounted /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly (fastrepo.sagrid.ac.za)
Feb  7 12:36:31 apprepo cvmfs2: (fastrepo.sagrid.ac.za) Warning: current limits for number of open files are (1024/4096)#012CernVM-FS is likely to run out of file descriptors, set ulimit -n to at least 8192
Feb  7 12:36:32 apprepo cvmfs2: (fastrepo.sagrid.ac.za) CernVM-FS: linking /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly to repository fastrepo.sagrid.ac.za
Feb  7 12:36:32 apprepo kernel: aufs test_add:262:mount[26169]: uid/gid/perm /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly 0/0/00, 500/0/0755
Feb  7 12:36:32 apprepo cvmfs_server: (fastrepo.sagrid.ac.za) closed transaction  (asynchronous scratch cleanup)
Feb  7 12:36:33 apprepo cvmfs_server: (fastrepo.sagrid.ac.za) successfully published revision 228
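
The file-descriptor warning in the log compares the soft limit (1024 here) against a suggested minimum of 8192. A minimal sketch of that comparison follows; `nofile_ok` is a hypothetical helper name, not part of CernVM-FS, and the 8192 default is taken from the warning text.

```sh
# Hypothetical helper: succeed if a soft open-files limit meets a minimum.
# 8192 is the minimum named in the cvmfs2 warning quoted above.
nofile_ok() {
    soft=$1
    min=${2:-8192}
    # "unlimited" always satisfies the requirement.
    [ "$soft" = "unlimited" ] && return 0
    [ "$soft" -ge "$min" ]
}

# Typical use on the release manager machine, before publishing:
#   nofile_ok "$(ulimit -n)" || echo "soft nofile limit too low" >&2
```

This only reports the state of the current shell; raising the limit permanently is a separate configuration step that depends on the distribution.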

CatalinCondurache commented 7 years ago

Now I can see a newer revision number (228)

[root@lcg1678 config.d]# cvmfs_config stat fastrepo.sagrid.ac.za
VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
2.3.2.0 3879612 1371 76312 228 5 1 73209597 81920001 0 65024 0 1 -200 89068 0 http://apprepo.sagrid.ac.za/cvmfs/fastrepo.sagrid.ac.za DIRECT 1

but with the same version file

[root@lcg1678 config.d]# cat /cvmfs/fastrepo.sagrid.ac.za/version
Build 217
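
The client reports revision 228, which matches the revision in the server's syslog, yet the version file still shows the old build. To make that comparison repeatable, the revision can be pulled out of the `cvmfs_config stat` output by column name. This is a sketch only: `stat_field` is a hypothetical helper, and it assumes the output consists of a header line starting with `VERSION` followed by one line of values.

```sh
# Hypothetical helper: print the value of one named column from the
# header + values output of `cvmfs_config stat`, read on stdin.
stat_field() {
    awk -v want="$1" '
        # Header line: remember which column carries the wanted name.
        $1 == "VERSION" {
            for (i = 1; i <= NF; i++) if ($i == want) col = i
            hdr = NR
            next
        }
        # The line right after the header holds the values.
        hdr && NR == hdr + 1 && col { print $col }
    '
}

# usage: cvmfs_config stat fastrepo.sagrid.ac.za | stat_field REVISION
```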

Nesting is not the issue, and the number of file descriptors is probably OK (it is just a warning). But why are you using the '-f' option when publishing?

Rather than re-creating the repo, you may want to ask the cvmfs-talk@cern.ch list for ideas.

brucellino commented 7 years ago

Hi @CatalinCondurache thanks for the feedback. Yes, I've fixed both of those things (the nesting and the soft ulimit) and the issue remains.

I will bump this back to cvmfs-talk then :disappointed: Really very weird situation.

jblomer commented 7 years ago

Hi @brucellino, hi @CatalinCondurache,

on the server, is there a different content in /cvmfs/fastrepo.sagrid.ac.za/version? Can you send me the client configuration (server url, public key) so that I can have a look?

Cheers, Jakob

brucellino commented 7 years ago

Hi @jblomer thanks for getting in touch.

There is indeed different content - it should be

This is FR3 Build 13
#174

The keys and configs are in the ansible role which configures the clients - it follows the usual directory structure.

I've also noticed something weird when using cvmfs_talk :

cvmfs_talk -i fastrepo.sagrid.ac.za pid
Seems like CernVM-FS is not running in /var/lib/cvmfs/shared (not found: /var/lib/cvmfs/shared/cvmfs_io.fastrepo.sagrid.ac.za)
[root@apprepo fastrepo.sagrid.ac.za]# locate  cvmfs_io
/var/spool/cvmfs/devrepo.sagrid.ac.za/cache/devrepo.sagrid.ac.za/cvmfs_io.devrepo.sagrid.ac.za
/var/spool/cvmfs/fastrepo.sagrid.ac.za/cache/fastrepo.sagrid.ac.za/cvmfs_io.fastrepo.sagrid.ac.za

Should there be things under /var/spool ?

jblomer commented 7 years ago

@brucellino Regarding cvmfs_talk, this output is expected on the release manager machine. The release manager machine runs the cvmfs client in a special mode. Among other things, it stores files in /var/spool/cvmfs/... instead of /var/lib/cvmfs/.... The cvmfs_talk utility needs to be explicitly pointed to this special instance (and it needs to run as root), like

sudo cvmfs_talk -p /var/spool/cvmfs/fastrepo.sagrid.ac.za/cache/fastrepo.sagrid.ac.za/cvmfs_io.fastrepo.sagrid.ac.za pid
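
The socket path in that command repeats the repository name three times, so it is easy to mistype. A small sketch that builds it from the name is shown here; `rm_socket` is a hypothetical helper, and the layout is copied from the path quoted above, so treat it as an assumption for other CernVM-FS versions.

```sh
# Hypothetical helper: build the cvmfs_talk socket path of the release
# manager's client instance for a given repository name.
rm_socket() {
    repo=$1
    printf '/var/spool/cvmfs/%s/cache/%s/cvmfs_io.%s\n' "$repo" "$repo" "$repo"
}

# usage (as root): cvmfs_talk -p "$(rm_socket fastrepo.sagrid.ac.za)" pid
```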

From the release manager machine, could you let me know the output of

cvmfs_server check
attr -g revision /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly
cat /var/spool/cvmfs/fastrepo.sagrid.ac.za/client.local

brucellino commented 7 years ago

@jblomer - here is the info you asked for :

cvmfs_server check fastrepo.sagrid.ac.za
Verifying Catalog Integrity of fastrepo.sagrid.ac.za...
Inspecting log of references
[inspecting catalog] 2b4396605f42b8538279e661335743b8a87bed3b at /
no problems found

attr -g revision /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly
Attribute "revision" had a 3 byte value for /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly:
234

cat /var/spool/cvmfs/fastrepo.sagrid.ac.za/client.local
CVMFS_ROOT_HASH=2b4396605f42b8538279e661335743b8a87bed3b

I know I sound crazy, but these checks show everything is correct. I have the feeling that I'm making changes in the wrong place....

I have my content in /cvmfs/fastrepo.sagrid.ac.za which I edit while the repo is in transaction. Then I publish the changes and everything seems fine, but I get the issue described initially.

brucellino commented 7 years ago

Quick update : I've found that the repos have somewhat inconsistent permissions on the files contained within them. I'm fixing this quickly.

jblomer commented 7 years ago

That's indeed baffling. So, if I understand correctly, on the release manager machine, we have

> cat /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly/version
Build 217

but

> cat /cvmfs/fastrepo.sagrid.ac.za/version
This is FR3 Build 13
#174
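
The same comparison can be scripted so it is easy to repeat after each publish. The sketch below assumes that, outside a transaction and with nothing pending in the writable branch, the read-only branch and the union mount should serve identical bytes; `same_file` is a hypothetical helper name.

```sh
# Hypothetical helper: report whether two paths hold the same bytes.
# Prints "identical" or "differ"; returns non-zero when they differ.
same_file() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "differ"
        return 1
    fi
}

# usage on the release manager machine:
#   same_file /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly/version \
#             /cvmfs/fastrepo.sagrid.ac.za/version
```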

Perhaps something is wrong with the mount tree. The /cvmfs/fastrepo.sagrid.ac.za mount point is supposed to be composed from the /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly (read-only) and /var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch/current directories. This we can check with

cat /proc/mounts

Otherwise, does the release manager machine use some sort of Docker setup?

brucellino commented 7 years ago

yes, and yes - but I haven't excluded the permissions error that is still being fixed.

the mounts are :

/dev/fuse /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly fuse ro,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
aufs_fastrepo.sagrid.ac.za /cvmfs/fastrepo.sagrid.ac.za aufs rw,relatime,si=1b137f932e04587,udba=none 0 0

No docker is used on the release machine, no.
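
The aufs entry in /proc/mounts shows only an `si=` identifier, not the branches that make up the union, so it cannot confirm which writable directory is in use. As a sketch, the identifier can be extracted and used to look the branches up in sysfs. `aufs_si` is a hypothetical helper, and the `/sys/fs/aufs/si_<id>/br*` layout is an assumption about how aufs was built on this machine (it requires aufs sysfs support).

```sh
# Hypothetical helper: print the aufs "si" identifier for a mount point,
# reading a file in /proc/mounts format (second argument, default /proc/mounts).
aufs_si() {
    awk -v mp="$1" '
        $2 == mp && $3 == "aufs" {
            n = split($4, opt, ",")
            for (i = 1; i <= n; i++)
                if (opt[i] ~ /^si=/) print substr(opt[i], 4)
        }
    ' "${2:-/proc/mounts}"
}

# Assumption: aufs lists each branch under /sys/fs/aufs/si_<id>/br0, br1, ...
# usage: cat /sys/fs/aufs/si_"$(aufs_si /cvmfs/fastrepo.sagrid.ac.za)"/br[0-9]*
```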

jblomer commented 7 years ago

Was the release manager machine recently upgraded? The writable path changed from 2.2 to 2.3, so perhaps all the changes to the repo sit uncommitted in the wrong writable path. The changes are described in the release notes, points 3 - 6. Normally, the migration should have been taken care of by cvmfs_server migrate.

The old writable branch was /var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch, the new one is /var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch/current. This should be reflected by the /etc/fstab entry for /cvmfs/fastrepo.sagrid.ac.za.

brucellino commented 7 years ago

Yes, the machine was upgraded, and I did perform a migration... but it doesn't seem to be reflected in /etc/fstab :

cvmfs2#fastrepo.sagrid.ac.za /var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly fuse allow_other,config=/etc/cvmfs/repositories.d/fastrepo.sagrid.ac.za/client.conf:/var/spool/cvmfs/fastrepo.sagrid.ac.za/client.local,cvmfs_suid,noauto 0 0 # added by CernVM-FS for fastrepo.sagrid.ac.za 
aufs_fastrepo.sagrid.ac.za /cvmfs/fastrepo.sagrid.ac.za aufs br=/var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch=rw:/var/spool/cvmfs/fastrepo.sagrid.ac.za/rdonly=rr,udba=none,ro,noauto 0 0 # added by CernVM-FS for fastrepo.sagrid.ac.za 

which still shows /var/spool/cvmfs/fastrepo.sagrid.ac.za/scratch. When the chown finishes in a while, I'll re-do the migration.
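
Whether the fstab entry points at the old or the new writable branch can also be checked mechanically. The sketch below parses the `br=` option of the aufs line in the format shown above; `fstab_rw_branch` is a hypothetical helper, and it assumes the writable branch is the first one listed and is marked `=rw`.

```sh
# Hypothetical helper: print the read-write aufs branch configured for a
# mount point in an fstab-format file (second argument, default /etc/fstab).
fstab_rw_branch() {
    awk -v mp="$1" '
        $2 == mp && $3 == "aufs" {
            n = split($4, opt, ",")
            for (i = 1; i <= n; i++)
                if (opt[i] ~ /^br=/) {
                    # br=<dir>=rw:<dir>=rr ; keep the first (writable) branch
                    split(substr(opt[i], 4), br, ":")
                    sub(/=rw$/, "", br[1])
                    print br[1]
                }
        }
    ' "${2:-/etc/fstab}"
}

# usage:
#   case "$(fstab_rw_branch /cvmfs/fastrepo.sagrid.ac.za)" in
#       */scratch/current) echo "2.3 layout" ;;
#       *)                 echo "old layout, migration not applied" ;;
#   esac
```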

brucellino commented 7 years ago

So, @jblomer some notes.

The "old" repo (fastrepo.sagrid.ac.za) seems to have been migrated, but is still showing this schizophrenic behaviour between the rdonly and the writable branch.

I created a new repo (code-rade.africa-grid.org) which works fine. It also respects the naming convention that @CatalinCondurache and I discussed a while back. So... I would like to understand what's going on with fastrepo, but since the contents are built by our CI service, I don't mind breaking it and publishing a new repo with a correct FQRN. I'm going to close this for now; we can decide whether to re-open later.