Closed efajardo closed 3 years ago
@bbockelm has an ongoing thread with XRootd Devs https://github.com/xrootd/xrootd/issues/1294.
@bbockelm @efajardo @ddavila0 any news on this?
@juztas - looked at it this weekend. The conclusion is that the changes in the ticket @efajardo mentioned (https://github.com/xrootd/xrootd/issues/1294) were insufficient. I was able to do a few more pieces of the puzzle but ultimately need Andy's help to get the rest of the information passed to the checksum job.
Will ping this thread when that ticket is more resolved. That said, with the solution proposed by XRootD, I don't think any code changes will be necessary to xrootd-multiuser. Will leave this ticket open regardless.
Updating here for reference (from an email exchange):
5.1.1 has some issues with multiuser. Test results are below.
Reinstalled to 4.12.6:
210319 09:05:53 486194 XrdSched: running jbalcas.415133:31@login-1 inq=0
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdProtocol: 0100 req=locate dlen=9
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdProtocol: 0100 locate nu */storage
210319 09:05:53 486194 multiuser_UserSentry: Switching FS uid for user cmsuser
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdProtocol: 0100 rc=-1024 locate */storage
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdResponse: 0100 sending 34 data bytes
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdProtocol: 0100 req=dirlist dlen=8
210319 09:05:53 486194 multiuser_UserSentry: Switching FS uid for user cmsuser
210319 09:05:53 486194 jbalcas.415133:31@login-1 oss_Opendir: lcl path /storage (/storage)
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdResponse: 0100 sending 11 data bytes
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdProtocol: 0100 dirlist entries=2 path=/storage
210319 09:05:55 486206 cms_Finder: Waiting for cms path /var/spool/xrootd/clustered/.olb/olbd.admin
210319 09:05:55 486255 XrdSched: running stats reporter inq=0
210319 09:05:55 486195 XrdSched: running monitor window clock inq=0
All works with multiuser (except checksum)
Reinstalled to 5.1.1 (the only config change is sec.protocol, see https://github.com/opensciencegrid/xrootd-lcmaps/issues/53). Directory listing and file copies fail. Unsure if checksum is fixed.
210319 09:10:06 490005 cryptossl_X509::CertType: certificate has 11 extensions
INFO in AuthzKey: Returning '/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=jbalcas/CN=751133/CN=Justas Balcas::cms:/cms,/cms/becms,/cms/dcms,/cms/escms,/cms/itcms,/cms/uscms,::' of length 145 as key.
INFO in AuthzFun: Got uid 2009
INFO in AuthzFun: entity.name='cmsuser'.
INFO in AuthzFun: entity.host='login-1.ultralight.org'.
INFO in AuthzFun: entity.vorg='cms'.
INFO in AuthzFun: entity.role='null'.
INFO in AuthzFun: entity.grps='/cms /cms/becms /cms/dcms /cms/escms /cms/itcms /cms/uscms'.
INFO in AuthzFun: entity.endorsements='/cms/Role=NULL/Capability=NULL,/cms/becms/Role=NULL/Capability=NULL,/cms/dcms/Role=NULL/Capability=NULL,/cms/escms/Role=NULL/Capability=NULL,/cms/itcms/Role=NULL/Capability=NULL,/cms/uscms/Role=NULL/Capability=NULL'.
INFO in AuthzFun: entity.moninfo='/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=jbalcas/CN=751133/CN=Justas Balcas::cms:/cms,/cms/becms,/cms/dcms,/cms/escms,/cms/itcms,/cms/uscms,::'.
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdResponse: 0000 sending OK
jbalcas.419393:31@login-1 Protocol 'gsi'
jbalcas.419393:31@login-1 Name 'cmsuser'
jbalcas.419393:31@login-1 Host 'login-1.ultralight.org'
jbalcas.419393:31@login-1 Vorg 'cms'
jbalcas.419393:31@login-1 Role ''
jbalcas.419393:31@login-1 Grps '/cms /cms/becms /cms/dcms /cms/escms /cms/itcms /cms/uscms'
jbalcas.419393:31@login-1 Caps ''
jbalcas.419393:31@login-1 Pidn 'jbalcas.419393:31@login-1'
jbalcas.419393:31@login-1 Crlen 9270
jbalcas.419393:31@login-1 ueid 1
jbalcas.419393:31@login-1 uid 0
jbalcas.419393:31@login-1 gid 0
210319 09:10:07 490005 XrootdMonitor: 360 bytes sent to 169.228.130.91:9930 rc=0
210319 09:10:07 490005 XrootdMonitor: 360 bytes sent to xrootd-mon.unl.edu:9930 rc=0
210319 09:10:07 490005 XrootdXeq: jbalcas.419393:31@login-1 pub IPv4 login as cmsuser
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdProtocol: 0100 req=locate dlen=10
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdProtocol: 0100 locate nuD */storage/
210319 09:10:07 490005 multiuser_UserSentry: Switching FS uid for user cmsuser
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdProtocol: 0100 rc=-1024 locate */storage/
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdResponse: 0100 sending 34 data bytes
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdProtocol: 0100 req=dirlist dlen=9
210319 09:10:07 490005 multiuser_UserSentry: Switching FS uid for user cmsuser
210319 09:10:07 490005 ofs_opendir: jbalcas.419393:31@login-1 Unable to open directory /storage/; permission denied
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdResponse: 0100 sending err 3010: Unable to open directory /storage/; permission denied
@abh3 you made some changes to xrootd-multiuser recently to support XRootD 5.1. Any thoughts?
@juztas - could you perhaps do a strace and see if the fsuid is actually getting changed as expected?
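One way to check this in a saved strace log is to pull out only the `setfsuid`/`setfsgid` calls. The following is a minimal sketch, assuming the usual `strace -f` line layout; the function name and the sample lines are illustrative and are not taken from the trace linked in this thread (only uid 2009 appears in the logs above, the gid value is made up).

```python
import re

# Matches strace lines such as "[pid 490005] setfsuid(2009) = 0".
FSID_CALL = re.compile(r"\b(setfsuid|setfsgid)\((\d+)\)\s*=\s*(-?\d+)")


def fsid_calls(strace_lines):
    """Return (syscall, requested_id, return_value) for each fsuid/fsgid switch."""
    calls = []
    for line in strace_lines:
        match = FSID_CALL.search(line)
        if match:
            calls.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return calls


# Illustrative input only; not copied from the real strace output.
sample = [
    "[pid 490005] setfsuid(2009) = 0",
    "[pid 490005] setfsgid(2009) = 0",
    '[pid 490005] write(2, "...", 3) = 3',
    "[pid 490005] setfsuid(0) = 2009",
]

for call in fsid_calls(sample):
    print(call)
```

Note that `setfsuid` and `setfsgid` return the previous ID rather than a success code, so a return value equal to the old ID is what a successful switch looks like.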
Looks like it is, see output here: https://login-1.hep.caltech.edu/~jbalcas/strace-xrootd
So, can we close the ticket?
@juztas - I think it might be a config setting in 5.1; in the strace output, I don't see any attempt to access the directory at all.
Is it possible to setup with wider xrootd authorizations (set to read all) to try out that theory?
By xrootd authorizations, do you mean changing the auth_file? It already grants 'a' under / for the cmsuser account.
I want to emphasize that this will only work when you pair the latest xroot release with the latest multi-user plugin. Any other combination will give you the results you noted (i.e. it won't work). Please make absolutely sure that combination is being used.
That is the case (afaik) and all rpms are latest 5.1.1 from osg 3.5 upcoming-testing.
xrootd-5.1.1-1.1.osg35up.el7.x86_64
xrootd-client-5.1.1-1.1.osg35up.el7.x86_64
xrootd-client-libs-5.1.1-1.1.osg35up.el7.x86_64
xrootd-lcmaps-1.7.8-3.osgup.el7.x86_64
xrootd-libs-5.1.1-1.1.osg35up.el7.x86_64
xrootd-multiuser-0.5.0-1.osg35up.el7.x86_64
xrootd-scitokens-5.1.1-1.1.osg35up.el7.x86_64
xrootd-selinux-5.1.1-1.1.osg35up.el7.noarch
xrootd-server-5.1.1-1.1.osg35up.el7.x86_64
xrootd-server-libs-5.1.1-1.1.osg35up.el7.x86_64
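To double-check which versions are paired, a package list like the one above can be grouped by version. This is a rough sketch, assuming the standard `name-version-release.arch` RPM naming; the helper names are hypothetical, and the plugins are versioned separately from XRootD itself, so differing version numbers are expected rather than an error.

```python
from collections import defaultdict


def split_nvra(pkg):
    """Split 'name-version-release.arch' into its four parts."""
    rest, arch = pkg.rsplit(".", 1)
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch


def versions_by_package(pkgs, prefix="xrootd"):
    """Group package names by version for packages whose name starts with prefix."""
    groups = defaultdict(list)
    for pkg in pkgs:
        name, version, _release, _arch = split_nvra(pkg)
        if name.startswith(prefix):
            groups[version].append(name)
    return dict(groups)


# Package list as posted in this thread.
installed = [
    "xrootd-5.1.1-1.1.osg35up.el7.x86_64",
    "xrootd-client-5.1.1-1.1.osg35up.el7.x86_64",
    "xrootd-client-libs-5.1.1-1.1.osg35up.el7.x86_64",
    "xrootd-lcmaps-1.7.8-3.osgup.el7.x86_64",
    "xrootd-libs-5.1.1-1.1.osg35up.el7.x86_64",
    "xrootd-multiuser-0.5.0-1.osg35up.el7.x86_64",
    "xrootd-scitokens-5.1.1-1.1.osg35up.el7.x86_64",
    "xrootd-selinux-5.1.1-1.1.osg35up.el7.noarch",
    "xrootd-server-5.1.1-1.1.osg35up.el7.x86_64",
    "xrootd-server-libs-5.1.1-1.1.osg35up.el7.x86_64",
]

for version, names in sorted(versions_by_package(installed).items()):
    print(version, names)
```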
@matyasselmeci looked at our config and osg-profile. (nothing came up as an issue on my end)
@juztas - thanks for double-checking the versions. Could you re-run the strace test and post again? Even if the symptom is the same I want to rule out a different potential cause.
Here it is: https://login-1.hep.caltech.edu/~jbalcas/strace-xrootd-new
Well, according to the strace the setfsuid/gid calls are happening but it's failing because "Unable to locate /storage/cms/". How come?
Andy
@juztas - I'm puzzled here as I'm not seeing the actual filesystem calls. Ignoring filesystem permissions - does 5.1.1 work without the multiuser plugin?
I started by disabling one config parameter at a time (sec, multiuser, ofs.authorize). Once I disable ofs.authorize 1, read access works.
Write fails with this: Mar 23 08:17:30 transfer-10 systemd: @.***: main process exited, code=killed, status=64/RTMIN+30. strace here: https://login-1.hep.caltech.edu/~jbalcas/strace-xrootd-new5
For ofs.authorize, I am unsure what the issue is (the same config worked on the 4.x series and everything is open to read for cmsuser). The config has all.export / and the auth_file contains:
t readcmsdata /storage/ lr
u cmsuser / a
u cmsuser /test/ a
u cmsuser /storage/ a
g /cms /storage a readcmsdata
What was the original ofs.authorize directive that was removed? By removing that directive, it will work because nothing is authorizing the request and anything goes (i.e. everything is valid).
I just commented out ofs.authorize 1.
Well, then authorization is turned off and you are simply relying on the fsuid/gid setting. So, I'm not surprised it works. I am surprised that it stopped working.
The ofs.authorize issue is solved. It was due to the new configuration syntax.
Before:
ofs.authlib libXrdAccSciTokens.so libXrdMacaroons.so
After (good):
ofs.authlib ++ libXrdAccSciTokens.so
ofs.authlib ++ libXrdMacaroons.so
It works if I am not using the multiuser plugin (read, dir listing, etc.) but does not work with multiuser. So this seems to be an issue in the multiuser plugin:
Plugin loaded Multiuser v5.1.1 from fslib libXrdMultiuser-5.so
------ Initializing the multi-user plugin.
=====> multiuser.umask 0022
=====> ofs.authlib ++ libXrdAccSciTokens.so
=====> ofs.authlib ++ libXrdMacaroons.so
210323 14:54:23 3815349 multiuser_Config: Failed to load ++-5 ++-5: cannot open shared object file: No such file or directory
210323 14:54:23 3815349 multiuser_Initialize: Encountered a runtime failure: Failed to configure multi-user plugin.
210323 14:54:23 3815349 XrootdConfig: Unable to load file system via libXrdMultiuser.so
210323 14:54:23 3815349 XrootdConfig: Unable to load file system wrapper from libXrdMultiuser.so
------ xroot protocol initialization failed.
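The `++-5` in the log suggests that the `++` token itself is being taken as the library name. The sketch below illustrates that reading with a hypothetical parser for an `ofs.authlib` line; it is an assumption about how the failure could arise, not the actual xrootd-multiuser code, and both function names are made up for illustration.

```python
def naive_library(line):
    """Take the first argument as the library name (the suspected failure mode)."""
    return line.split()[1]


def parse_authlib(line):
    """Split an 'ofs.authlib' directive into (stacked, library, params).

    A leading '++' argument is treated as a stacking marker, not as a library.
    """
    tokens = line.split()
    if not tokens or tokens[0] != "ofs.authlib":
        raise ValueError("not an ofs.authlib directive")
    args = tokens[1:]
    stacked = bool(args) and args[0] == "++"
    if stacked:
        args = args[1:]
    if not args:
        raise ValueError("ofs.authlib needs a library name")
    return stacked, args[0], args[1:]


directive = "ofs.authlib ++ libXrdAccSciTokens.so"
print(naive_library(directive))   # the marker is mistaken for the library
print(parse_authlib(directive))   # the marker is skipped before reading the library
```

With the old single-line syntax, the same function reports the first library and passes the rest through as parameters, which is why a parser that is unaware of `++` only breaks after the config is updated.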
Hi Justas,
Ah, I was not aware that multi-user reparses the configuration and is sensitive to those directives. Sigh, another thing that should not have been done. Now, I will need to find out why it was done that way in the first place.
Andy
This is fixed in #22
On Freshdesk ticket #65647, Justas found an incompatibility between multiuser and checksums.
@ddavila0 talked with the XRootD devs about this, and I am opening this ticket as a placeholder for the discussion on how to fix it.