opensciencegrid / xrootd-multiuser

A filesystem plugin to allow XRootD to write as a different Unix user
Apache License 2.0

Multiuser incompatibility with checksums #14

Closed efajardo closed 3 years ago

efajardo commented 3 years ago

In Freshdesk ticket #65647, Justas found an incompatibility between multiuser and checksums.

@ddavila0 talked with the XRootD devs about this, and I am opening this ticket as a placeholder for the discussion on how to fix it.

efajardo commented 3 years ago

@bbockelm has an ongoing thread with the XRootD devs: https://github.com/xrootd/xrootd/issues/1294

juztas commented 3 years ago

@bbockelm @efajardo @ddavila0 any news on this?

bbockelm commented 3 years ago

@juztas - looked at it this weekend. The conclusion is that the changes in the ticket @efajardo mentioned (https://github.com/xrootd/xrootd/issues/1294) were insufficient. I was able to do a few more pieces of the puzzle but ultimately need Andy's help to get the rest of the information passed to the checksum job.

Will ping this thread when that ticket is more resolved. That said, with the solution proposed by Xrootd, I don't think there will be any code changes necessary to xrootd-multiuser. Will leave this ticket open regardless.

juztas commented 3 years ago

Updating here for reference (from an email exchange):

5.1.1 has some issues with multiuser. Here is the test:

Reinstalled to 4.12.6:

210319 09:05:53 486194 XrdSched: running jbalcas.415133:31@login-1 inq=0
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdProtocol: 0100 req=locate dlen=9
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdProtocol: 0100 locate nu */storage
210319 09:05:53 486194 multiuser_UserSentry: Switching FS uid for user cmsuser
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdProtocol: 0100 rc=-1024 locate */storage
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdResponse: 0100 sending 34 data bytes
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdProtocol: 0100 req=dirlist dlen=8
210319 09:05:53 486194 multiuser_UserSentry: Switching FS uid for user cmsuser
210319 09:05:53 486194 jbalcas.415133:31@login-1 oss_Opendir: lcl path /storage (/storage)
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdResponse: 0100 sending 11 data bytes
210319 09:05:53 486194 jbalcas.415133:31@login-1 XrootdProtocol: 0100 dirlist entries=2 path=/storage
210319 09:05:55 486206 cms_Finder: Waiting for cms path /var/spool/xrootd/clustered/.olb/olbd.admin
210319 09:05:55 486255 XrdSched: running stats reporter inq=0
210319 09:05:55 486195 XrdSched: running monitor window clock inq=0

Everything works with multiuser (except checksums).

Reinstalled to 5.1.1 (the only config change is sec.protocol, see https://github.com/opensciencegrid/xrootd-lcmaps/issues/53). Directory listing and file copies fail. Unsure whether the checksum issue is fixed.

210319 09:10:06 490005 cryptossl_X509::CertType: certificate has 11 extensions
INFO in AuthzKey: Returning '/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=jbalcas/CN=751133/CN=Justas Balcas::cms:/cms,/cms/becms,/cms/dcms,/cms/escms,/cms/itcms,/cms/uscms,::' of length 145 as key.
INFO in AuthzFun: Got uid 2009
INFO in AuthzFun: entity.name='cmsuser'.
INFO in AuthzFun: entity.host='login-1.ultralight.org'.
INFO in AuthzFun: entity.vorg='cms'.
INFO in AuthzFun: entity.role='null'.
INFO in AuthzFun: entity.grps='/cms /cms/becms /cms/dcms /cms/escms /cms/itcms /cms/uscms'.
INFO in AuthzFun: entity.endorsements='/cms/Role=NULL/Capability=NULL,/cms/becms/Role=NULL/Capability=NULL,/cms/dcms/Role=NULL/Capability=NULL,/cms/escms/Role=NULL/Capability=NULL,/cms/itcms/Role=NULL/Capability=NULL,/cms/uscms/Role=NULL/Capability=NULL'.
INFO in AuthzFun: entity.moninfo='/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=jbalcas/CN=751133/CN=Justas Balcas::cms:/cms,/cms/becms,/cms/dcms,/cms/escms,/cms/itcms,/cms/uscms,::'.
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdResponse: 0000 sending OK
jbalcas.419393:31@login-1 Protocol 'gsi'
jbalcas.419393:31@login-1 Name 'cmsuser'
jbalcas.419393:31@login-1 Host 'login-1.ultralight.org'
jbalcas.419393:31@login-1 Vorg 'cms'
jbalcas.419393:31@login-1 Role ''
jbalcas.419393:31@login-1 Grps '/cms /cms/becms /cms/dcms /cms/escms /cms/itcms /cms/uscms'
jbalcas.419393:31@login-1 Caps ''
jbalcas.419393:31@login-1 Pidn 'jbalcas.419393:31@login-1'
jbalcas.419393:31@login-1 Crlen 9270
jbalcas.419393:31@login-1 ueid  1
jbalcas.419393:31@login-1 uid   0
jbalcas.419393:31@login-1 gid   0
210319 09:10:07 490005 XrootdMonitor: 360 bytes sent to 169.228.130.91:9930 rc=0
210319 09:10:07 490005 XrootdMonitor: 360 bytes sent to xrootd-mon.unl.edu:9930 rc=0
210319 09:10:07 490005 XrootdXeq: jbalcas.419393:31@login-1 pub IPv4 login as cmsuser
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdProtocol: 0100 req=locate dlen=10
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdProtocol: 0100 locate nuD */storage/
210319 09:10:07 490005 multiuser_UserSentry: Switching FS uid for user cmsuser
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdProtocol: 0100 rc=-1024 locate */storage/
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdResponse: 0100 sending 34 data bytes
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdProtocol: 0100 req=dirlist dlen=9
210319 09:10:07 490005 multiuser_UserSentry: Switching FS uid for user cmsuser
210319 09:10:07 490005 ofs_opendir: jbalcas.419393:31@login-1 Unable to open directory /storage/; permission denied
210319 09:10:07 490005 jbalcas.419393:31@login-1 XrootdResponse: 0100 sending err 3010: Unable to open directory /storage/; permission denied
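
The pattern in the log above (an FS uid switch followed by "permission denied") can be picked out of a longer log mechanically. A minimal Python sketch, assuming the third column of each timestamped line identifies the thread; the function name and regular expressions are illustrative and not part of XRootD:

```python
import re

# Timestamped lines look like: "210319 09:10:07 490005 <message>"
LINE = re.compile(r"^(\d{6}) (\d{2}:\d{2}:\d{2}) (\d+) (.*)$")
SWITCH = re.compile(r"multiuser_UserSentry: Switching FS uid for user (\S+)")


def denied_after_switch(log_text: str) -> list[tuple[str, str]]:
    """Return (user, message) pairs where a thread logged an FS uid switch
    and later logged a 'permission denied' message."""
    last_user = {}  # thread id -> user named in the most recent switch
    hits = []
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue  # lines without a timestamp (e.g. AuthzFun output)
        tid, rest = m.group(3), m.group(4)
        sw = SWITCH.search(rest)
        if sw:
            last_user[tid] = sw.group(1)
        elif "permission denied" in rest and tid in last_user:
            hits.append((last_user[tid], rest))
    return hits
```

On the 5.1.1 log above this reports the ofs_opendir error and the matching XrootdResponse line, both attributed to cmsuser. It only shows that the switch was logged before the failure, not why access was denied.
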

brianhlin commented 3 years ago

@abh3 you made some changes to xrootd-multiuser recently to support XRootD 5.1. Any thoughts?

bbockelm commented 3 years ago

@juztas - could you perhaps do a strace and see if the fsuid is actually getting changed as expected?
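
As a complement to strace: on Linux, the fourth number on the Uid: line of /proc/<pid>/task/<tid>/status is that thread's filesystem UID. A minimal Python sketch, assuming a Linux /proc layout; the helper names are illustrative and not part of XRootD or xrootd-multiuser:

```python
from pathlib import Path


def parse_fsuid(status_text: str) -> int:
    """Return the filesystem UID from the text of a /proc status file.

    The Uid: line holds four numbers: real, effective, saved, filesystem.
    """
    for line in status_text.splitlines():
        if line.startswith("Uid:"):
            fields = line.split()
            return int(fields[4])
    raise ValueError("no Uid: line found")


def thread_fsuids(pid: int) -> dict[int, int]:
    """Map each thread ID of a process to its current filesystem UID."""
    result = {}
    for status in Path(f"/proc/{pid}/task").glob("*/status"):
        try:
            result[int(status.parent.name)] = parse_fsuid(status.read_text())
        except (OSError, ValueError):
            continue  # thread exited between listing and reading
    return result
```

Because the plugin switches the fsuid per request, a snapshot like this can easily miss the change; strace remains the more reliable way to see each setfsuid call.
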

juztas commented 3 years ago

Looks like it is, see output here: https://login-1.hep.caltech.edu/~jbalcas/strace-xrootd

abh3 commented 3 years ago

So, can we close the ticket?


juztas commented 3 years ago

No, see https://github.com/opensciencegrid/xrootd-multiuser/issues/14#issuecomment-802972629

bbockelm commented 3 years ago

@juztas - I think it might be a config setting in 5.1; in the strace output, I don't see any attempt to access the directory at all.

Is it possible to set up with wider xrootd authorizations (set to read all) to try out that theory?

juztas commented 3 years ago

By xrootd authorizations, do you mean changing the auth_file? It already has 'a' under / for the cmsuser account.


abh3 commented 3 years ago

I want to emphasize that this will only work when you pair the latest xroot release with the latest multi-user plugin. Any other combination will give you the results you noted (i.e. it won't work). Please make absolutely sure that combination is being used.

juztas commented 3 years ago

That is the case (afaik) and all rpms are latest 5.1.1 from osg 3.5 upcoming-testing.

xrootd-5.1.1-1.1.osg35up.el7.x86_64
xrootd-client-5.1.1-1.1.osg35up.el7.x86_64
xrootd-client-libs-5.1.1-1.1.osg35up.el7.x86_64
xrootd-lcmaps-1.7.8-3.osgup.el7.x86_64
xrootd-libs-5.1.1-1.1.osg35up.el7.x86_64
xrootd-multiuser-0.5.0-1.osg35up.el7.x86_64
xrootd-scitokens-5.1.1-1.1.osg35up.el7.x86_64
xrootd-selinux-5.1.1-1.1.osg35up.el7.noarch
xrootd-server-5.1.1-1.1.osg35up.el7.x86_64
xrootd-server-libs-5.1.1-1.1.osg35up.el7.x86_64

@matyasselmeci looked at our config and osg-profile. (nothing came up as an issue on my end)

bbockelm commented 3 years ago

@juztas - thanks for double-checking the versions. Could you re-run the strace test and post again? Even if the symptom is the same I want to rule out a different potential cause.

juztas commented 3 years ago

Here it is: https://login-1.hep.caltech.edu/~jbalcas/strace-xrootd-new


abh3 commented 3 years ago

Well, according to the strace the setfsuid/gid calls are happening but it's failing because "Unable to locate /storage/cms/". How come?

Andy


bbockelm commented 3 years ago

@juztas - I'm puzzled here as I'm not seeing the actual filesystem calls. Ignoring filesystem permissions - does 5.1.1 work without the multiuser plugin?

juztas commented 3 years ago

I started disabling config parameters one at a time (sec, multiuser, ofs.authorize). Once I disable ofs.authorize 1, read access works.

Write: received this: Mar 23 08:17:30 transfer-10 systemd: @.***: main process exited, code=killed, status=64/RTMIN+30 - strace here: https://login-1.hep.caltech.edu/~jbalcas/strace-xrootd-new5

For ofs.authorize, I am unsure what the issue is (the same config worked on the 4.x series and everything is open to read for cmsuser): all.export / and, in auth_file:

t readcmsdata  /storage/                   lr

u cmsuser / a
u cmsuser /test/ a
u cmsuser /storage/ a
g /cms /storage a readcmsdata


abh3 commented 3 years ago

What was the original ofs.authorize directive that was removed? By removing that directive, it will work because nothing is authorizing the request and anything goes (i.e. everything is valid).


juztas commented 3 years ago

I just commented out ofs.authorize 1.

abh3 commented 3 years ago

Well, then authorization is turned off and you are simply relying on the fsuid/gid setting. So, I'm not surprised it works. I am surprised that it stopped working.


juztas commented 3 years ago

The ofs.authorize issue is solved. It was caused by the new configuration syntax. Before:

ofs.authlib libXrdAccSciTokens.so libXrdMacaroons.so

After (good):

ofs.authlib ++ libXrdAccSciTokens.so
ofs.authlib ++ libXrdMacaroons.so

It works if I am not using the multiuser plugin (read, dir listing, etc.), but does not work with multiuser. So this seems to be an issue in the multiuser plugin:

Plugin loaded Multiuser v5.1.1 from fslib libXrdMultiuser-5.so
------ Initializing the multi-user plugin.
=====> multiuser.umask 0022
=====> ofs.authlib ++ libXrdAccSciTokens.so
=====> ofs.authlib ++ libXrdMacaroons.so
210323 14:54:23 3815349 multiuser_Config: Failed to load ++-5 ++-5: cannot open shared object file: No such file or directory
210323 14:54:23 3815349 multiuser_Initialize: Encountered a runtime failure: Failed to configure multi-user plugin.
210323 14:54:23 3815349 XrootdConfig: Unable to load file system via libXrdMultiuser.so
210323 14:54:23 3815349 XrootdConfig: Unable to load file system wrapper from libXrdMultiuser.so
------ xroot protocol initialization failed.

abh3 commented 3 years ago

Hi Justas,

Ah, I was not aware that multi-user reparses the configuration and is sensitive to those directives. Sigh, another thing that should not have been done. Now, I will need to find out why it was done that way in the first place.

Andy
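
One way to read the "Failed to load ++-5" error: a parser that takes the first token after ofs.authlib as the library name would pick up "++" under the newer syntax, and the "-5" suffix presumably comes from the same versioned naming that turns libXrdMultiuser.so into libXrdMultiuser-5.so in the log. A minimal Python sketch of that failure mode and a more tolerant alternative; it is a hypothetical illustration, not the plugin's actual C++ code, and both function names are made up:

```python
from typing import Optional


def naive_authlib(line: str) -> Optional[str]:
    """Take the first token after 'ofs.authlib' as the library name."""
    tokens = line.split()
    if len(tokens) >= 2 and tokens[0] == "ofs.authlib":
        return tokens[1]
    return None


def tolerant_authlib(line: str) -> Optional[str]:
    """Same, but skip the '++' token used by the newer syntax."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "ofs.authlib":
        return None
    args = [t for t in tokens[1:] if t != "++"]
    return args[0] if args else None


line = "ofs.authlib ++ libXrdAccSciTokens.so"
print(naive_authlib(line))     # ++
print(tolerant_authlib(line))  # libXrdAccSciTokens.so
```

Whether the plugin should parse this directive at all is the separate question raised above.
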


djw8605 commented 3 years ago

This is fixed in #22