NVIDIA / proxyfs

Apache License 2.0

Connect ProxyFS to existing OpenStack Swift/Keystone installation #322

Open jnamdar opened 5 years ago

jnamdar commented 5 years ago

Hi,

Firstly, thanks for this application and for giving us the opportunity to use it.

To try it out, I deployed ProxyFS on a CentOS 7.4 VM using the Vagrantfile in the saio subfolder. By the way, the vagrant box referenced in this file seems to be down; I used this box instead (config.vm.box = "CentosBox/Centos-7-v7.4-Minimal-CLI") with the virtualbox provider to continue.
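
For reference, the workaround is just a one-line change in the saio Vagrantfile (this is simply what worked for me):

# saio/Vagrantfile -- replacement box; the provider stays virtualbox
config.vm.box = "CentosBox/Centos-7-v7.4-Minimal-CLI"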

After vagrant_provision.sh finished running, I compiled the ProxyFS project using make: everything went well. I then used the start_and_mount_pfs script to mount the NFS and SMB shares. I can create folders/files in both shares without issues, and view everything with the swift CLI:

[vagrant@localhost ~]$ ll /mnt/smb_proxyfs_mount/
total 0
drwxr-xr-x. 2 vagrant vagrant 0 Jun 17 16:25 test
drwxr-xr-x. 2 vagrant vagrant 0 Jun 14 15:48 test_container
drwxr-xr-x. 2 vagrant vagrant 0 Jun 14 15:56 test_container2
[vagrant@localhost ~]$ ll /mnt/smb_proxyfs_mount/test_container
total 0
-rwxr-xr-x. 1 vagrant vagrant 8 Jun 14 15:48 test_file.txt
[vagrant@localhost ~]$ cat /mnt/smb_proxyfs_mount/test_container/test_file.txt
abcdefg
[vagrant@localhost ~]$ swift -A http://127.0.0.1:8080/auth/v1.0 -U test:tester -K testing list
test
test_container
test_container2
[vagrant@localhost ~]$ swift -A http://127.0.0.1:8080/auth/v1.0 -U test:tester -K testing list test_container
test_file.txt
[vagrant@localhost ~]$ curl -i http://127.0.0.1:8080/v1/AUTH_test/test_container/test_file.txt -X GET -H "X-Auth-Token: AUTH_tka85032f655f249cca7d43b5c71184858"
HTTP/1.1 200 OK
Content-Length: 8
Accept-Ranges: bytes
Last-Modified: Fri, 14 Jun 2019 13:48:25 GMT
Etag: "pfsv2/AUTH_test/00000311/00000001-32"
X-Timestamp: 1560520104.65309
Content-Type: text/plain
X-Trans-Id: txb4100d18d9de43d094d35-005d08d72a
X-Openstack-Request-Id: txb4100d18d9de43d094d35-005d08d72a
Date: Tue, 18 Jun 2019 12:20:58 GMT

abcdefg

I've been looking for a way to use ProxyFS in my existing OpenStack Swift/Keystone installation:

So far, I have been able to deploy a CentOS 7.4 VM using the Vagrantfile in the saio subfolder. I removed everything related to the installation of Swift (including the creation of the swift user), since I already have one installed.

I then adjusted the ProxyFS configuration on this VM to point to my existing Swift Proxy server. I installed the pfs and meta middlewares on the machine hosting my Swift Proxy server and added them to the pipeline. I also launched another instance of the Proxy server listening on port 8090 with the /etc/swift/proxy-server/proxy-noauth.cond.d/20_settings.conf file: /usr/bin/python2 /usr/bin/swift-proxy-server /etc/swift/proxy-server/proxy-noauth.cond.d
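
Roughly, the pipeline changes amount to the following sketch (pipelines abbreviated; filter names follow the ProxyFS saio samples, and the egg names and host value should be checked against the installed middleware packages and your own topology):

# "normal" proxy (port 8080): pfs_middleware added after the auth filter
pipeline = catch_errors ... tempauth ... pfs ... proxy-server

[filter:pfs]
use = egg:pfs_middleware#pfs
proxyfsd_host = <IP of the VM running proxyfsd>
proxyfsd_port = 12345

# "NoAuth" proxy (port 8090): meta_middleware, no auth and no pfs
pipeline = catch_errors proxy-logging cache meta proxy-server

[filter:meta]
use = egg:meta_middleware#meta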

Finally I used the script start_and_mount_pfs, after removing the lines about starting Swift, to launch ProxyFS and mount the NFS and SMB network shares.

The NFS share seems to work well (I can create folders and write files), but I'm getting an error trying to mount the SMB one. Relevant detail: since I haven't created a swift user, I replaced it in smb.conf with the vagrant user that already exists in the VM, and ran smbpasswd -a vagrant. The command-line error:

[vagrant@localhost ~]$ sudo mount -t cifs -o user=vagrant,uid=1000,gid=1000,vers=3.0,iocharset=utf8,actimeo=0 //127.0.0.1/proxyfs /mnt/smb_proxyfs_mount/
Password for vagrant@//127.0.0.1/proxyfs:  *******
mount error(5): Input/output error
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)

Here is what I find in /var/log/samba/log.smbd after adding log level = 3 passdb:5 auth:5 to smb.conf:

[2019/06/18 13:41:48.820712,  3] ../lib/util/access.c:361(allow_access)
  Allowed connection from 127.0.0.1 (127.0.0.1)
[2019/06/18 13:41:48.821084,  3] ../source3/smbd/oplock.c:1322(init_oplocks)
  init_oplocks: initializing messages.
[2019/06/18 13:41:48.821353,  3] ../source3/smbd/process.c:1958(process_smb)
  Transaction 0 of length 106 (0 toread)
[2019/06/18 13:41:48.821806,  3] ../source3/smbd/smb2_negprot.c:290(smbd_smb2_request_process_negprot)
  Selected protocol SMB3_00
[2019/06/18 13:41:48.821849,  5] ../source3/auth/auth.c:491(make_auth_context_subsystem)
  Making default auth method list for server role = 'standalone server', encrypt passwords = yes
[2019/06/18 13:41:48.821873,  5] ../source3/auth/auth.c:48(smb_register_auth)
  Attempting to register auth backend trustdomain
[2019/06/18 13:41:48.821926,  5] ../source3/auth/auth.c:60(smb_register_auth)
  Successfully added auth method 'trustdomain'
[2019/06/18 13:41:48.821945,  5] ../source3/auth/auth.c:48(smb_register_auth)
  Attempting to register auth backend ntdomain
[2019/06/18 13:41:48.821956,  5] ../source3/auth/auth.c:60(smb_register_auth)
  Successfully added auth method 'ntdomain'
[2019/06/18 13:41:48.821970,  5] ../source3/auth/auth.c:48(smb_register_auth)
  Attempting to register auth backend guest
[2019/06/18 13:41:48.821983,  5] ../source3/auth/auth.c:60(smb_register_auth)
  Successfully added auth method 'guest'
[2019/06/18 13:41:48.821994,  5] ../source3/auth/auth.c:48(smb_register_auth)
  Attempting to register auth backend sam
[2019/06/18 13:41:48.822004,  5] ../source3/auth/auth.c:60(smb_register_auth)
  Successfully added auth method 'sam'
[2019/06/18 13:41:48.822015,  5] ../source3/auth/auth.c:48(smb_register_auth)
  Attempting to register auth backend sam_ignoredomain
[2019/06/18 13:41:48.822026,  5] ../source3/auth/auth.c:60(smb_register_auth)
  Successfully added auth method 'sam_ignoredomain'
[2019/06/18 13:41:48.822060,  5] ../source3/auth/auth.c:48(smb_register_auth)
  Attempting to register auth backend winbind
[2019/06/18 13:41:48.822076,  5] ../source3/auth/auth.c:60(smb_register_auth)
  Successfully added auth method 'winbind'
[2019/06/18 13:41:48.822086,  5] ../source3/auth/auth.c:378(load_auth_module)
  load_auth_module: Attempting to find an auth method to match guest
[2019/06/18 13:41:48.822099,  5] ../source3/auth/auth.c:403(load_auth_module)
  load_auth_module: auth method guest has a valid init
[2019/06/18 13:41:48.822110,  5] ../source3/auth/auth.c:378(load_auth_module)
  load_auth_module: Attempting to find an auth method to match sam
[2019/06/18 13:41:48.822122,  5] ../source3/auth/auth.c:403(load_auth_module)
  load_auth_module: auth method sam has a valid init
[2019/06/18 13:41:48.823791,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'gssapi_spnego' registered
[2019/06/18 13:41:48.823830,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'gssapi_krb5' registered
[2019/06/18 13:41:48.823904,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'gssapi_krb5_sasl' registered
[2019/06/18 13:41:48.823935,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'spnego' registered
[2019/06/18 13:41:48.823949,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'schannel' registered
[2019/06/18 13:41:48.823964,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'naclrpc_as_system' registered
[2019/06/18 13:41:48.823976,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'sasl-EXTERNAL' registered
[2019/06/18 13:41:48.823988,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'ntlmssp' registered
[2019/06/18 13:41:48.824000,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'ntlmssp_resume_ccache' registered
[2019/06/18 13:41:48.824014,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'http_basic' registered
[2019/06/18 13:41:48.824030,  3] ../auth/gensec/gensec_start.c:918(gensec_register)
  GENSEC backend 'http_ntlm' registered
[2019/06/18 13:41:48.824789,  5] ../source3/auth/auth.c:491(make_auth_context_subsystem)
  Making default auth method list for server role = 'standalone server', encrypt passwords = yes
[2019/06/18 13:41:48.824822,  5] ../source3/auth/auth.c:378(load_auth_module)
  load_auth_module: Attempting to find an auth method to match guest
[2019/06/18 13:41:48.824836,  5] ../source3/auth/auth.c:403(load_auth_module)
  load_auth_module: auth method guest has a valid init
[2019/06/18 13:41:48.824847,  5] ../source3/auth/auth.c:378(load_auth_module)
  load_auth_module: Attempting to find an auth method to match sam
[2019/06/18 13:41:48.824859,  5] ../source3/auth/auth.c:403(load_auth_module)
  load_auth_module: auth method sam has a valid init
[2019/06/18 13:41:48.825052,  3] ../auth/ntlmssp/ntlmssp_util.c:69(debug_ntlmssp_flags)
  Got NTLMSSP neg_flags=0xa0080205
[2019/06/18 13:41:48.825484,  3] ../auth/ntlmssp/ntlmssp_server.c:452(ntlmssp_server_preauth)
  Got user=[vagrant] domain=[LOCALHOST] workstation=[] len1=0 len2=132
[2019/06/18 13:41:48.825565,  3] ../source3/param/loadparm.c:3823(lp_load_ex)
  lp_load_ex: refreshing parameters
[2019/06/18 13:41:48.825665,  3] ../source3/param/loadparm.c:542(init_globals)
  Initialising global parameters
[2019/06/18 13:41:48.825810,  3] ../source3/param/loadparm.c:2752(lp_do_section)
  Processing section "[global]"
[2019/06/18 13:41:48.825983,  2] ../source3/param/loadparm.c:2769(lp_do_section)
  Processing section "[proxyfs]"
[2019/06/18 13:41:48.826162,  3] ../source3/param/loadparm.c:1592(lp_add_ipc)
  adding IPC service
[2019/06/18 13:41:48.826198,  5] ../source3/auth/auth_util.c:123(make_user_info_map)
  Mapping user [LOCALHOST]\[vagrant] from workstation []
[2019/06/18 13:41:48.826220,  5] ../source3/auth/user_info.c:62(make_user_info)
  attempting to make a user_info for vagrant (vagrant)
[2019/06/18 13:41:48.826236,  5] ../source3/auth/user_info.c:70(make_user_info)
  making strings for vagrant's user_info struct
[2019/06/18 13:41:48.826244,  5] ../source3/auth/user_info.c:108(make_user_info)
  making blobs for vagrant's user_info struct
[2019/06/18 13:41:48.826251,  3] ../source3/auth/auth.c:178(auth_check_ntlm_password)
  check_ntlm_password:  Checking password for unmapped user [LOCALHOST]\[vagrant]@[] with the new password interface
[2019/06/18 13:41:48.826259,  3] ../source3/auth/auth.c:181(auth_check_ntlm_password)
  check_ntlm_password:  mapped user is: [LOCALHOST]\[vagrant]@[]
[2019/06/18 13:41:48.826554,  3] ../source3/passdb/lookup_sid.c:1680(get_primary_group_sid)
  Forcing Primary Group to 'Domain Users' for vagrant
[2019/06/18 13:41:48.826646,  4] ../source3/auth/check_samsec.c:183(sam_account_ok)
  sam_account_ok: Checking SMB password for user vagrant
[2019/06/18 13:41:48.826661,  5] ../source3/auth/check_samsec.c:165(logon_hours_ok)
  logon_hours_ok: user vagrant allowed to logon at this time (Tue Jun 18 11:41:48 2019
  )
[2019/06/18 13:41:48.827099,  5] ../source3/auth/server_info_sam.c:122(make_server_info_sam)
  make_server_info_sam: made server info for user vagrant -> vagrant
[2019/06/18 13:41:48.827130,  3] ../source3/auth/auth.c:249(auth_check_ntlm_password)
  check_ntlm_password: sam authentication for user [vagrant] succeeded
[2019/06/18 13:41:48.827153,  5] ../source3/auth/auth.c:292(auth_check_ntlm_password)
  check_ntlm_password:  PAM Account for user [vagrant] succeeded
[2019/06/18 13:41:48.827160,  2] ../source3/auth/auth.c:305(auth_check_ntlm_password)
  check_ntlm_password:  authentication for user [vagrant] -> [vagrant] -> [vagrant] succeeded
[2019/06/18 13:41:48.827343,  3] ../source3/auth/token_util.c:548(finalize_local_nt_token)
  Failed to fetch domain sid for WORKGROUP
[2019/06/18 13:41:48.827371,  3] ../source3/auth/token_util.c:580(finalize_local_nt_token)
  Failed to fetch domain sid for WORKGROUP
[2019/06/18 13:41:48.827624,  5] ../source3/passdb/pdb_interface.c:1749(lookup_global_sam_rid)
  lookup_global_sam_rid: looking up RID 513.
[2019/06/18 13:41:48.827655,  5] ../source3/passdb/pdb_tdb.c:658(tdbsam_getsampwrid)
  pdb_getsampwrid (TDB): error looking up RID 513 by key RID_00000201.
[2019/06/18 13:41:48.827672,  5] ../source3/passdb/pdb_interface.c:1825(lookup_global_sam_rid)
  Can't find a unix id for an unmapped group
[2019/06/18 13:41:48.827679,  5] ../source3/passdb/pdb_interface.c:1535(pdb_default_sid_to_id)
  SID S-1-5-21-2240567756-3470875878-3910347872-513 belongs to our domain, but there is no corresponding object in the database.
[2019/06/18 13:41:48.827699,  5] ../source3/passdb/pdb_interface.c:1749(lookup_global_sam_rid)
  lookup_global_sam_rid: looking up RID 513.
[2019/06/18 13:41:48.827711,  5] ../source3/passdb/pdb_tdb.c:658(tdbsam_getsampwrid)
  pdb_getsampwrid (TDB): error looking up RID 513 by key RID_00000201.
[2019/06/18 13:41:48.827723,  5] ../source3/passdb/pdb_interface.c:1825(lookup_global_sam_rid)
  Can't find a unix id for an unmapped group
[2019/06/18 13:41:48.827729,  5] ../source3/passdb/pdb_interface.c:1535(pdb_default_sid_to_id)
  SID S-1-5-21-2240567756-3470875878-3910347872-513 belongs to our domain, but there is no corresponding object in the database.
[2019/06/18 13:41:48.827829,  3] ../source3/smbd/password.c:144(register_homes_share)
  Adding homes service for user 'vagrant' using home directory: '/home/vagrant'
[2019/06/18 13:41:48.828148,  3] ../lib/util/access.c:361(allow_access)
  Allowed connection from 127.0.0.1 (127.0.0.1)
[2019/06/18 13:41:48.828191,  3] ../libcli/security/dom_sid.c:210(dom_sid_parse_endp)
  string_to_sid: SID vagrant is not in a valid format
[2019/06/18 13:41:48.828274,  3] ../source3/passdb/lookup_sid.c:1680(get_primary_group_sid)
  Forcing Primary Group to 'Domain Users' for vagrant
[2019/06/18 13:41:48.828374,  3] ../source3/smbd/service.c:576(make_connection_snum)
  Connect path is '/mnt/CommonVolume' for service [proxyfs]
[2019/06/18 13:41:48.828407,  3] ../libcli/security/dom_sid.c:210(dom_sid_parse_endp)
  string_to_sid: SID vagrant is not in a valid format
[2019/06/18 13:41:48.828483,  3] ../source3/passdb/lookup_sid.c:1680(get_primary_group_sid)
  Forcing Primary Group to 'Domain Users' for vagrant
[2019/06/18 13:41:48.828562,  3] ../source3/smbd/vfs.c:113(vfs_init_default)
  Initialising default vfs hooks
[2019/06/18 13:41:48.828589,  3] ../source3/smbd/vfs.c:139(vfs_init_custom)
  Initialising custom vfs hooks from [/[Default VFS]/]
[2019/06/18 13:41:48.828598,  3] ../source3/smbd/vfs.c:139(vfs_init_custom)
  Initialising custom vfs hooks from [proxyfs]
[2019/06/18 13:41:48.831109,  2] ../lib/util/modules.c:196(do_smb_load_module)
  Module 'proxyfs' loaded
[2019/06/18 13:41:48.834266,  1] vfs_proxyfs.c:230(vfs_proxyfs_connect)
  proxyfs_mount_failed: Volume : CommonVolume Connection_path /mnt/CommonVolume Service proxyfs user vagrant errno 19
[2019/06/18 13:41:48.834293,  1] ../source3/smbd/service.c:636(make_connection_snum)
  make_connection_snum: SMB_VFS_CONNECT for service 'proxyfs' at '/mnt/CommonVolume' failed: No such device
[2019/06/18 13:41:48.834344,  3] ../source3/smbd/smb2_server.c:3097(smbd_smb2_request_error_ex)
  smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_UNSUCCESSFUL] || at ../source3/smbd/smb2_tcon.c:135
[2019/06/18 13:41:48.960403,  3] ../source3/smbd/server_exit.c:246(exit_server_common)
  Server exit (NT_STATUS_END_OF_FILE)
[2019/06/18 13:41:48.966933,  3] ../source3/lib/util_procid.c:54(pid_to_procid)
  pid_to_procid: messaging_dgm_get_unique failed: No such file or directory

It looks like the Samba authentication went well; the lines that seem relevant to me are the following:

[2019/06/18 13:41:48.831109,  2] ../lib/util/modules.c:196(do_smb_load_module)
  Module 'proxyfs' loaded
[2019/06/18 13:41:48.834266,  1] vfs_proxyfs.c:230(vfs_proxyfs_connect)
  proxyfs_mount_failed: Volume : CommonVolume Connection_path /mnt/CommonVolume Service proxyfs user vagrant errno 19
[2019/06/18 13:41:48.834293,  1] ../source3/smbd/service.c:636(make_connection_snum)
  make_connection_snum: SMB_VFS_CONNECT for service 'proxyfs' at '/mnt/CommonVolume' failed: No such device

I tried troubleshooting this, but no luck so far. Would anyone be able to help on this? Here's my df -H output if needed:

Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/cl-root           19G  3.1G   16G  17% /
devtmpfs                     3.1G     0  3.1G   0% /dev
tmpfs                        3.1G     0  3.1G   0% /dev/shm
tmpfs                        3.1G  9.0M  3.1G   1% /run
tmpfs                        3.1G     0  3.1G   0% /sys/fs/cgroup
/dev/sda1                    1.1G  240M  824M  23% /boot
tmpfs                        609M     0  609M   0% /run/user/1000
CommonMountPoint             110T     0  110T   0% /CommonMountPoint
127.0.0.1:/CommonMountPoint  110T     0  110T   0% /mnt/nfs_proxyfs_mount

I also tried to retrieve the containers and objects I had created via the NFS share using the Object Storage API, but I got the following error on my Swift Proxy server:

[root@controller adminuser]# swift -A http://controller:8080/auth/v1.0 -U test:tester -K testing stat --debug
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): controller:8080
DEBUG:urllib3.connectionpool:http://controller:8080 "GET /auth/v1.0 HTTP/1.1" 200 0
DEBUG:swiftclient:REQ: curl -i http://controller:8080/auth/v1.0 -X GET
DEBUG:swiftclient:RESP STATUS: 200 OK
DEBUG:swiftclient:RESP HEADERS: {u'Content-Length': u'0', u'X-Trans-Id': u'tx6493625ff99f4486a7f5b-005d08d170', u'X-Auth-Token-Expires': u'76663', u'X-Auth-Token': u'AUTH_tk24c8619d99964285a356cbf294531184', u'X-Storage-Token': u'AUTH_tk24c8619d99964285a356cbf294531184', u'Date': u'Tue, 18 Jun 2019 11:56:32 GMT', u'X-Storage-Url': u'http://controller:8080/v1/AUTH_test', u'Content-Type': u'text/html; charset=UTF-8', u'X-Openstack-Request-Id': u'tx6493625ff99f4486a7f5b-005d08d170'}
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): controller:8080
DEBUG:urllib3.connectionpool:http://controller:8080 "HEAD /v1/AUTH_test HTTP/1.1" 500 0
INFO:swiftclient:REQ: curl -i http://controller:8080/v1/AUTH_test -I -H "X-Auth-Token: AUTH_tk24c8619d99964285a356cbf294531184"
INFO:swiftclient:RESP STATUS: 500 Internal Error
INFO:swiftclient:RESP HEADERS: {u'Date': u'Tue, 18 Jun 2019 11:56:32 GMT', u'Content-Length': u'17', u'Content-Type': u'text/plain', u'X-Openstack-Request-Id': u'tx3bdf2145377d4050a7044-005d08d170', u'X-Trans-Id': u'tx3bdf2145377d4050a7044-005d08d170'}

Relevant lines in /var/log/messages regarding the error:

Jun 18 13:57:57 controller proxy-server: STDERR: (23786) accepted ('192.168.71.37', 52024)
Jun 18 13:57:57 controller proxy-server: - - 18/Jun/2019/11/57/57 HEAD /auth/v1.0 HTTP/1.0 400 - Swift - - - - tx1c9994434391428a82261-005d08d1c5 - 0.0002 RL - 1560859077.899780035 1560859077.899970055 -
Jun 18 13:57:57 controller proxy-server: 192.168.71.37 192.168.71.37 18/Jun/2019/11/57/57 GET /auth/v1.0 HTTP/1.0 200 - python-swiftclient-3.6.0 - - - - tx1c9994434391428a82261-005d08d1c5 - 0.0021 - - 1560859077.899091005 1560859077.901160955 -
Jun 18 13:57:57 controller proxy-server: STDERR: 192.168.71.37 - - [18/Jun/2019 11:57:57] "GET /auth/v1.0 HTTP/1.1" 200 417 0.002583 (txn: tx1c9994434391428a82261-005d08d1c5)
Jun 18 13:57:57 controller proxy-server: STDERR: (23786) accepted ('192.168.71.37', 52026)
Jun 18 13:57:57 controller proxy-server: 192.168.71.37 192.168.71.37 18/Jun/2019/11/57/57 HEAD /v1/AUTH_test%3Fformat%3Djson HTTP/1.0 500 - python-swiftclient-3.6.0 AUTH_tk24c8619d9... - - - txc715be53ba9e476483a71-005d08d1c5 - 0.0013 - - 1560859077.906188011 1560859077.907531023 -
Jun 18 13:57:57 controller proxy-server: Erreur : une erreur s'est produite: Hôte inaccessible (txn: txc715be53ba9e476483a71-005d08d1c5)
Jun 18 13:57:57 controller proxy-server: STDERR: 192.168.71.37 - - [18/Jun/2019 11:57:57] "HEAD /v1/AUTH_test HTTP/1.1" 500 222 0.001975 (txn: txc715be53ba9e476483a71-005d08d1c5)

On another subject, does ProxyFS support Keystone authentication, instead of the tempauth used in the main pipeline?

More broadly, has anyone tried to connect ProxyFS to an existing OpenStack Swift/Keystone installation?

Regards

edmc-ss commented 5 years ago

Welcome to ProxyFS, jnamdar!

Apologies about the missing CentOS box... indeed, I believe the path to it needs tweaking. That particular box has the vbox tools installed, allowing "synced folders" to work. I trust you found a workable substitute. In any event, we should update the CentOS box version we are referencing.

Nice to see you were able to get your Swift Proxy "NoAuth" pipeline up and working. One caution about that pipeline. As you can imagine, it doesn't want to have the pfs_middleware installed (but does need meta_middleware installed)... the reverse of the "normal" Swift Proxy. Another caution... if you enable encryption in your cluster, you need to be sure the encryption middleware is in both pipelines (and after pfs_middleware in the "normal" pipeline).
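
Schematically, the two pipelines end up looking something like this (just a sketch of the ordering; middleware names are the usual Swift ones and may differ in your config):

# "normal" Swift Proxy: auth first, then pfs, then encryption (if enabled)
pipeline = ... <auth> pfs keymaster encryption ... proxy-server

# "NoAuth" Swift Proxy: meta, no auth, no pfs (encryption here too if enabled cluster-wide)
pipeline = ... meta keymaster encryption ... proxy-server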

Keystone Auth for the Swift API pipeline should "just work"... but I must admit to never having tried it. The pfs_middleware that enables ProxyFS sits after the Auth step in the pipeline, so we should be just fine. But if you are asking about Keystone Auth being integrated on the SMB side, I don't have any experience with that alas. Is that the need for you?

Your attempt to get SMB mounting working against your modified setup is very interesting. I first thought that one of the authorization steps had been missed, so I tried a couple of things.

1) Samba, due to its history of leveraging Linux "authorization", maps SMB users to Linux users, which means the SMB user needs to also exist in the Linux user database. I noted, however, that your smbpasswd -a step would have failed if it didn't.

2) I then note that if you haven't added your SMB user to the valid users = line in /etc/samba/smb.conf - and restarted or otherwise "triggered" a refresh - it won't (yet) know about your added SMB user. Alas, what you'd get in this case is a mount error(13): Permission denied, not the mount error(5): Input/output error you received.
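
For example, something along these lines (a sketch using the share name from your mount command and the user you substituted):

# /etc/samba/smb.conf -- the [proxyfs] share
[proxyfs]
    valid users = vagrant

# then have smbd re-read its config, e.g.:
sudo smbcontrol smbd reload-config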

I just re-verified what usually turns out to have happened when this error appears: I sent a SIGKILL to the proxyfsd process. Without something restarting it, the mount fails in precisely the way you see... and I strongly suspect that is what has happened. What's going on there is that Samba has been configured (for this volume anyway) to invoke the vfs plug-in included in a couple of submodules of ProxyFS (I believe you must have done a git submodule update --recursive at some point, so you should be fine). Anyway, the jrpcclient submodule is actually the one connecting to the proxyfsd process over a couple of TCP ports (the supplied .conf's have those as ports 12345 & 32345 I believe) on the PrivateIPAddr interface. Port 32345 is used for the read and write data path (only coming from the adjacent on-node Samba instance in your case)... while port 12345 is for all other RPCs... coming from both the adjacent Samba instance as well as all of the pfs_middleware instances in your collection of Swift Proxy nodes.
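
A quick way to confirm whether proxyfsd is still running and listening on those ports (port numbers per the supplied .conf's):

pgrep -a proxyfsd
ss -tlnp | grep -E ':(12345|32345)'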

Anyway, I'm thinking the proxyfsd process has failed. It would be very interesting to see the log from that process (/var/log/proxyfsd/proxyfsd.log).

Next, it looks like you got a "500 Internal Error" when later attempting a Swift API method targeting this Swift Account. This makes total sense. What's going on here is there is a Header in the Swift Account that indicates this is a so-called "BiModal" Account... meaning that ProxyFS manages its content. In your proxy-server.conf's [filter:pfs] section, there are instructions for which proxyfsd instance (IPAddr & Port#) to contact (note the above comments about TCP Port 12345). The pfs_middleware attempts to contact this proxyfsd_host (this can actually be a list btw) asking ProxyFS for the IPAddr of the proxyfsd instance actually serving the Volume. In this case, it should be told "hey, yes, it's me - you've come to the right place". Anyway, it tries really hard to contact the proxyfsd process indicated...and if it fails, you'll see this "500 Internal Error".

So with that, I believe all signs point to a failure of your proxyfsd process. If you can post the log file I mentioned, perhaps we can get to the bottom of this quickly.

Now just one more thing to check out. As you noted, you needed to configure what we call the "NoAuth" Swift Proxy instance adjacent to your proxyfsd process. It should be serving localhost:8090 I believe. If you log onto the "node" running proxyfsd, you should be able to talk to the underlying Swift Account. Just make sure this is possible. Do a HEAD on the "BiModal" Swift Account... you should see the BiModal Header I mentioned earlier indicating it is, in fact, BiModal. If you do a GET on it, you should see at least one Container, likely named .__checkpoint__. This is the Container that receives all of the file system's metadata for that Volume in the form of a "write log" consisting of LogSegment objects numbered in ascending order (starting with 0000000000000002 or so). There will be holes as the same 64-bit "Nonce" number sequence is used for all sorts of things that must be named uniquely (e.g. Inode#'s, Container names, etc...).
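
Concretely, from the node running the NoAuth proxy, something like this (account name taken from your earlier curl; look for a BiModal/ProxyFS header in the HEAD output rather than any specific header name I might misremember):

curl -I http://127.0.0.1:8090/v1/AUTH_test     # HEAD: expect a header marking the Account as BiModal
curl http://127.0.0.1:8090/v1/AUTH_test        # GET: expect at least the .__checkpoint__ Container listed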

If you don't see the BiModal Header on the Account... or the .checkpoint Container in the Account, then what it sounds like to me is the Volume needs to be formatted. If you look in the start_and_mount_pfs script, you should see a function named format_volume_if_necessary that invokes a tool called mkproxyfs. It takes just a few options... this script uses "-I" to say "format it if it's not already formatted". That's probably what you want... There are other options to say either "only format if the Account is empty" or "hey, empty it first before formatting" (obviously dangerous).
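
For reference, in the saio environment that function boils down to an invocation along these lines (paths are saio-specific):

/vagrant/bin/mkproxyfs -I CommonVolume /vagrant/src/github.com/swiftstack/ProxyFS/saio/proxyfs.conf SwiftClient.RetryLimit=1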

One note about formatting. ProxyFS requires a bit more guarantee than an Eventually Consistent storage system such as Swift provides. It cannot handle getting back "stale" information that an EC system may supply if, say, you did a 2nd PUT of an Object (a subsequent GET could return nothing, the first version, or the 2nd version). To avoid this, ProxyFS never PUTs the same Object (or Container) twice. That's why I mentioned the "Nonce" above. As such, the only way it could "know" that it had never used a given Nonce Value before is if the Account starts out completely empty. So just a heads up that you should either never reuse an Account... or empty it first (e.g. with mkproxyfs -F).

Hope all of this discussion gives you things to look at to get your system running again. I don't know which of these are going to help identify the issue... but hopefully at least one of them will.

And, again, welcome to ProxyFS!

jnamdar commented 5 years ago

Hello, thank you @edmc-ss for the thorough answer.

Here is the output of my proxyfsd.log:

time="2019-06-21T13:42:21.915145+02:00" level=warning msg="config variable 'TrackedLock.LockHoldTimeLImit' defaulting to '0s': [TrackedLock] missing" function=parseConfMap goroutine=8 package=trackedlock pid=4773
time="2019-06-21T13:42:21.915240+02:00" level=warning msg="config variable 'TrackedLock.LockCheckPeriod' defaulting to '0s': [TrackedLock] missing" function=parseConfMap goroutine=8 package=trackedlock pid=4773
time="2019-06-21T13:42:21.915302+02:00" level=info msg="trackedlock pkg: LockHoldTimeLimit 0 sec  LockCheckPeriod 0 sec" function=parseConfMap goroutine=8 package=trackedlock pid=4773
time="2019-06-21T13:42:21.915698+02:00" level=info msg="evtlog.Up(): event logging is false" function=Up goroutine=8 package=evtlog pid=4773
time="2019-06-21T13:42:21.917613+02:00" level=info msg="SwiftClient.RetryLimit 11, SwiftClient.RetryDelay 1.000 sec, SwiftClient.RetryExpBackoff 1.5" function=reloadConfig goroutine=8 package=swiftclient pid=4773
time="2019-06-21T13:42:21.917694+02:00" level=info msg="SwiftClient.RetryLimitObject 8, SwiftClient.RetryDelayObject 1.000 sec, SwiftClient.RetryExpBackoffObject 1.9" function=reloadConfig goroutine=8 package=swiftclient pid=4773
time="2019-06-21T13:42:21.917780+02:00" level=info msg="SwiftClient.ChecksumChunkedPutChunks disabled\n" function=reloadConfig goroutine=8 package=swiftclient pid=4773
time="2019-06-21T13:42:21.918737+02:00" level=info msg="Transitions Package Registration List: [logger trackedlock dlm evtlog stats swiftclient headhunter halter inode fs fuse jrpcfs statslogger liveness httpserver]" function=up gor
outine=8 package=transitions pid=4773
time="2019-06-21T13:42:21.920466+02:00" level=info msg="ChunkedFreeConnections: min=512 mean=512 max=512  NonChunkedFreeConnections: min=127 mean=127 max=127" function=logStats goroutine=18 package=statslogger pid=4773
time="2019-06-21T13:42:21.920542+02:00" level=info msg="Memory in Kibyte (total): Sys=68288 StackSys=352 MSpanSys=32 MCacheSys=16 BuckHashSys=3 GCSys=2182 OtherSys=518" function=logStats goroutine=18 package=statslogger pid=4773
time="2019-06-21T13:42:21.920605+02:00" level=info msg="Memory in Kibyte (total): HeapInuse=2552 HeapIdle=62632 HeapReleased=0 Cumulative TotalAlloc=1951" function=logStats goroutine=18 package=statslogger pid=4773
time="2019-06-21T13:42:21.920659+02:00" level=info msg="GC Stats (total): NumGC=0  NumForcedGC=0  NextGC=4369 KiB  PauseTotalMsec=0  GC_CPU=0.00%" function=logStats goroutine=18 package=statslogger pid=4773
time="2019-06-21T13:42:21.920711+02:00" level=info msg="Swift Client Ops (total): Account QueryOps=0 ModifyOps=0 Container QueryOps=0 ModifyOps=0 Object QueryOps=0 ModifyOps=0" function=logStats goroutine=18 package=statslogger pid=4773
time="2019-06-21T13:42:21.920784+02:00" level=info msg="Swift Client ChunkedPut Ops (total): FetchOps=0 ReadOps=0 SendOps=0 CloseOps=0" function=logStats goroutine=18 package=statslogger pid=4773
time="2019-06-21T13:42:21.963298+02:00" level=info msg="Inode cache discard ticker for 'volume: CommonVolume' is: 1s MaxBytesInodeCache: 10485760" function=startInodeCacheDiscard goroutine=8 package=inode pid=4773
time="2019-06-21T13:42:21.963768+02:00" level=info msg="Adopting ReadCache Parameters..." function=adoptVolumeGroupReadCacheParameters goroutine=8 package=inode pid=4773
time="2019-06-21T13:42:21.963852+02:00" level=info msg="...ReadCacheQuotaFraction(0.2) of memSize(0x000000016AF16000) totals 0x000000000742447A" function=adoptVolumeGroupReadCacheParameters goroutine=8 package=inode pid=4773
time="2019-06-21T13:42:21.963924+02:00" level=info msg="...0x00000074 cache lines (each of size 0x00100000) totalling 0x0000000007400000 for Volume Group CommonVolumeGroup" function=adoptVolumeGroupReadCacheParameters goroutine=8 package=inode pid=4773
time="2019-06-21T13:42:21.963990+02:00" level=info msg="Checkpoint per Flush for volume CommonVolume is true" function=ServeVolume goroutine=8 package=fs pid=4773
time="2019-06-21T13:42:21.965235+02:00" level=warning msg="Couldn't mount CommonVolume.FUSEMountPoint == CommonMountPoint" error="fusermount: exit status 1" function=performMount goroutine=8 package=fuse
time="2019-06-21T13:42:21.970704+02:00" level=warning msg="config variable 'TrackedLock.LockHoldTimeLImit' defaulting to '0s': [TrackedLock] missing" function=parseConfMap goroutine=8 package=trackedlock pid=4773
time="2019-06-21T13:42:21.970803+02:00" level=warning msg="config variable 'TrackedLock.LockCheckPeriod' defaulting to '0s': [TrackedLock] missing" function=parseConfMap goroutine=8 package=trackedlock pid=4773
time="2019-06-21T13:42:21.971845+02:00" level=info msg="trackedlock pkg: LockHoldTimeLimit 0 sec  LockCheckPeriod 0 sec" function=parseConfMap goroutine=8 package=trackedlock pid=4773
time="2019-06-21T13:42:21.971963+02:00" level=info msg="evtlog.Signaled(): event logging is now false (was false)" function=SignaledFinish goroutine=8 package=evtlog pid=4773
time="2019-06-21T13:42:21.972334+02:00" level=info msg="transitions.Up() returning successfully" function=func1 goroutine=8 package=transitions pid=4773
time="2019-06-21T13:42:21.972410+02:00" level=info msg="proxyfsd is starting up (version 1.10.1.0.2-5-g16af550) (PID 4773); invoked as '/vagrant/bin/proxyfsd' '/vagrant/src/github.com/swiftstack/ProxyFS/saio/proxyfs.conf'" function=Daemon goroutine=8 package=proxyfsd pid=4773
time="2019-06-21T13:42:22.177267+02:00" level=warning msg="config variable 'TrackedLock.LockHoldTimeLImit' defaulting to '0s': [TrackedLock] missing" function=parseConfMap goroutine=1 package=trackedlock pid=4868
time="2019-06-21T13:42:22.177397+02:00" level=warning msg="config variable 'TrackedLock.LockCheckPeriod' defaulting to '0s': [TrackedLock] missing" function=parseConfMap goroutine=1 package=trackedlock pid=4868
time="2019-06-21T13:42:22.177453+02:00" level=info msg="trackedlock pkg: LockHoldTimeLimit 0 sec  LockCheckPeriod 0 sec" function=parseConfMap goroutine=1 package=trackedlock pid=4868
time="2019-06-21T13:42:22.177890+02:00" level=info msg="evtlog.Up(): event logging is false" function=Up goroutine=1 package=evtlog pid=4868
time="2019-06-21T13:42:22.179261+02:00" level=info msg="SwiftClient.RetryLimit 1, SwiftClient.RetryDelay 1.000 sec, SwiftClient.RetryExpBackoff 1.5" function=reloadConfig goroutine=1 package=swiftclient pid=4868
time="2019-06-21T13:42:22.179325+02:00" level=info msg="SwiftClient.RetryLimitObject 8, SwiftClient.RetryDelayObject 1.000 sec, SwiftClient.RetryExpBackoffObject 1.9" function=reloadConfig goroutine=1 package=swiftclient pid=4868
time="2019-06-21T13:42:22.179377+02:00" level=info msg="SwiftClient.ChecksumChunkedPutChunks disabled\n" function=reloadConfig goroutine=1 package=swiftclient pid=4868
time="2019-06-21T13:42:22.179498+02:00" level=info msg="Transitions Package Registration List: [logger trackedlock evtlog stats swiftclient headhunter]" function=up goroutine=1 package=transitions pid=4868
time="2019-06-21T13:42:22.180059+02:00" level=warning msg="config variable 'TrackedLock.LockHoldTimeLImit' defaulting to '0s': [TrackedLock] missing" function=parseConfMap goroutine=1 package=trackedlock pid=4868
time="2019-06-21T13:42:22.180117+02:00" level=warning msg="config variable 'TrackedLock.LockCheckPeriod' defaulting to '0s': [TrackedLock] missing" function=parseConfMap goroutine=1 package=trackedlock pid=4868
time="2019-06-21T13:42:22.180163+02:00" level=info msg="trackedlock pkg: LockHoldTimeLimit 0 sec  LockCheckPeriod 0 sec" function=parseConfMap goroutine=1 package=trackedlock pid=4868
time="2019-06-21T13:42:22.180212+02:00" level=info msg="evtlog.Signaled(): event logging is now false (was false)" function=SignaledFinish goroutine=1 package=evtlog pid=4868
time="2019-06-21T13:42:22.180258+02:00" level=info msg="transitions.Up() returning successfully" function=func1 goroutine=1 package=transitions pid=4868
time="2019-06-21T13:42:22.180296+02:00" level=info msg="mkproxyfs is starting up (version 1.10.1.0.2-5-g16af550) (PID 4868); invoked as '/vagrant/bin/mkproxyfs' '-I' 'CommonVolume' '/vagrant/src/github.com/swiftstack/ProxyFS/saio/proxyfs.conf' 'SwiftClient.RetryLimit=1'" function=Format goroutine=1 package=mkproxyfs pid=4868
time="2019-06-21T13:42:22.198627+02:00" level=info msg="transitions.Down() called" function=down goroutine=1 package=transitions pid=4868
time="2019-06-21T13:42:22.198869+02:00" level=info msg="SwiftClient.RetryLimit 1, SwiftClient.RetryDelay 1.000 sec, SwiftClient.RetryExpBackoff 1.5" function=reloadConfig goroutine=1 package=swiftclient pid=4868
time="2019-06-21T13:42:22.199055+02:00" level=info msg="SwiftClient.RetryLimitObject 8, SwiftClient.RetryDelayObject 1.000 sec, SwiftClient.RetryExpBackoffObject 1.9" function=reloadConfig goroutine=1 package=swiftclient pid=4868
time="2019-06-21T13:42:22.199246+02:00" level=info msg="SwiftClient.ChecksumChunkedPutChunks disabled\n" function=reloadConfig goroutine=1 package=swiftclient pid=4868
time="2019-06-21T13:42:22.200547+02:00" level=info msg="tracklock.Down() called" function=Down goroutine=1 package=trackedlock pid=4868

This line seems relevant:

time="2019-06-21T13:42:21.965235+02:00" level=warning msg="Couldn't mount CommonVolume.FUSEMountPoint == CommonMountPoint" error="fusermount: exit status 1" function=performMount goroutine=8 package=fuse

Edit: I take that back, this error doesn't appear anymore (I must've messed with something in the configuration).

I formatted the volume by calling mkproxyfs -F as you suggested, and it did indeed wipe it (the mounted volume became empty; it previously contained a folder with a file). I then recreated this folder and copied a file into it.

I do have a .__checkpoint__ container in the account; it looks like this via curl (controller is the machine hosting my proxy):

[root@controller swift]# curl http://controller:8090/v1/AUTH_test/.__checkpoint__/
0000000000000002
0000000000000067
000000000000007A

Regards

Edit2: From the error

[2019/06/18 13:41:48.831109,  2] ../lib/util/modules.c:196(do_smb_load_module)
  Module 'proxyfs' loaded
[2019/06/18 13:41:48.834266,  1] vfs_proxyfs.c:230(vfs_proxyfs_connect)
  proxyfs_mount_failed: Volume : CommonVolume Connection_path /mnt/CommonVolume Service proxyfs user vagrant errno 19
[2019/06/18 13:41:48.834293,  1] ../source3/smbd/service.c:636(make_connection_snum)
  make_connection_snum: SMB_VFS_CONNECT for service 'proxyfs' at '/mnt/CommonVolume' failed: No such device

I gather that the proxyfs_mount function in jrpcclient failed when trying to mount the Samba share. How would I go about troubleshooting it?

I also dug further into the error I get when executing [root@controller adminuser]# swift -A http://controller:8080/auth/v1.0 -U test:tester -K testing stat --debug. Port 12345 was blocked by the firewall on the VM running the proxyfsd service, which would explain the "Hôte inaccessible" (unreachable host) error.
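
For the record, opening the port on CentOS 7 with firewalld looks like this (assuming firewalld is what was blocking it; 32345 may be needed too if Samba runs on a different node than proxyfsd):

sudo firewall-cmd --permanent --add-port=12345/tcp
sudo firewall-cmd --permanent --add-port=32345/tcp
sudo firewall-cmd --reload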

I fixed that, but now, when trying to get a list of the containers in the test account, it fails with the following errors:

Jun 21 19:08:31 controller proxy-server: - - 21/Jun/2019/17/08/31 HEAD /auth/v1.0 HTTP/1.0 400 - Swift - - - - tx16277828dba2402fadaae-005d0d0f0f - 0.0004 RL - 1561136911.990171909 1561136911.990561962 -
Jun 21 19:08:31 controller proxy-server: 192.168.71.37 192.168.71.37 21/Jun/2019/17/08/31 GET /auth/v1.0 HTTP/1.0 200 - python-swiftclient-3.6.0 - - - - tx16277828dba2402fadaae-005d0d0f0f - 0.0038 - - 1561136911.988802910 1561136911.992594004 -
Jun 21 19:08:31 controller proxy-server: STDERR: 192.168.71.37 - - [21/Jun/2019 17:08:31] "GET /auth/v1.0 HTTP/1.1" 200 417 0.004867 (txn: tx16277828dba2402fadaae-005d0d0f0f)
Jun 21 19:08:31 controller proxy-server: STDERR: (19870) accepted ('192.168.71.37', 43720)
Jun 21 19:08:32 controller proxy-server: 192.168.71.37 192.168.71.37 21/Jun/2019/17/08/31 GET /v1/AUTH_test%3Fformat%3Djson HTTP/1.0 500 - python-swiftclient-3.6.0 AUTH_tka5e664b10... - - - tx9f625f0b9b6a4c65962c8-005d0d0f0f - 0.0028 - - 1561136911.997173071 1561136911.999938011 -
Jun 21 19:08:32 controller proxy-server: Erreur : une erreur s'est produite: Connexion refusée (txn: tx9f625f0b9b6a4c65962c8-005d0d0f0f)

The "Connexion refusée" basically means "connection refused". I'm not sure where the connection is being attempted, though...

jnamdar commented 5 years ago

I'm trying to understand how ProxyFS stores objects. Reading this page, I get that when I create an object using Filesystem access (via the Samba share for instance), ProxyFS uses exclusively the NoAuth pipeline. This pipeline includes the meta middleware in order to update the account's metadata (in the .__checkpoint__ container I'm guessing?).

The other pipeline (the usual one) seems to be only used when trying to access objects the usual way, and the pfs middleware allows us to request objects in a ProxyFS-managed account.

If I'm right about these pipelines' roles, where exactly would authentication play a part when writing/reading objects via Filesystem Access? I don't see how adding Keystone authentication to the usual pipeline would help, since it's not used by Filesystem Access.

Do you think steps such as asking for a Keystone token, and adding it to every request's header would have to be directly implemented in the SMB VFS/jrpcclient/jrpcfs/swiftclient layers? Ideally the user would provide credentials such as projectname:username/password when mounting the samba share, and those credentials would be sent through every layer to ask for the token.

Looking forward to understanding ProxyFS better :smile:

edmc-ss commented 5 years ago

Hello again jnamdar... and I must applaud all the excellent questions/topics you are raising.

Before going further, I want to make sure you are aware of the Slack group we've got that might make interacting with other ProxyFS folks more responsive. The Slack group is here and you can invite yourself here.

First off I want to address the only "interesting" line I saw in the proxyfsd.log file you included: time="2019-06-21T13:42:21.965235+02:00" level=warning msg="Couldn't mount CommonVolume.FUSEMountPoint == CommonMountPoint" error="fusermount: exit status 1" function=performMount goroutine=8 package=fuse

What's going on here is that ProxyFS attempts to provide a FUSE mount point for the file system at the path your .conf file specified. As is the case with all mounts, the mount point must be a directory. Indeed, on many systems, the directory must be empty. If either the directory does not exist... or is not empty (on systems that require it to be so), you'll get this error message... It's not fatal (hence, the "level=warning"), but may not be what you desire.
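
In practice, making sure the mount point exists (and is empty) before proxyfsd starts is usually enough; with your .conf that would be something like:

sudo mkdir -p /CommonMountPoint      # FUSEMountPoint from proxyfs.conf; must exist and ideally be empty
# then restart proxyfsd (e.g. re-run start_and_mount_pfs) so the FUSE mount is retried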

A little background on how ProxyFS presents the file system via SMB vs NFS would be helpful here. Both smbd and nfsd are able to present a local file system to the network via those two protocols. For SMB, however, you probably noticed the "vfs" library (indeed, there are two: vfs, which is Samba-specific, and jrpcclient, which is generic and used by vfs). Using vfs (& jrpcclient), Samba is able to communicate directly (via a couple of TCP sockets per client) with the proxyfsd process. So, in the case of SMB/Samba, we are actually not presenting a file system that is locally available.

NFS is a different beast. At one time, the ProxyFS plan was to leverage nfs-ganesha, a tool that does for NFS what Samba does for SMB. As it turns out, nfs-ganesha has an "FSAL" mechanism that enables one to "plug in" a file system just like Samba's "VFS" mechanism. Hence, there was an intention to code up an "fsal" library to plug into nfs-ganesha that would leverage that same jrpcclient library to communicate with proxyfsd. Alas, other priorities arose and the team never implemented an "fsal" library.

As a result, in order to present a file system via NFS, ProxyFS needed to expose the file system "locally". It does so via FUSE. As you can imagine, this whole SMB and NFS protocol thing is quite outside the purview of ProxyFS... so ProxyFS didn't want to "insist" on a FUSE exposure of the file system. As such, the FUSE mount step is allowed to fail since it isn't required (particularly in the SMB exposure case). It's very convenient, though, so even SwiftStack's Controller always ensures this directory exists (and is empty) so that the FUSE mount actually works. In any event, /etc/exports is populated with a line (or more) to expose the FUSE mount point via NFS.
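
For example, the saio setup ends up with an /etc/exports entry along these lines (the options shown are illustrative, not necessarily exactly what the provisioning script writes):

# /etc/exports
/CommonMountPoint 127.0.0.1(rw,sync,fsid=1000,no_subtree_check,no_root_squash)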

I'm speculating that your original attempt had skipped the step of creating the directory to which the FUSE mount point was attempted.

Now, here's something I do whenever I successfully mount... I "stat" the root of the mount. What I should see is that the Inode Number of the "root directory" of my mount is "1". If it's not "1", then you've probably not successfully mounted... or NFS is just exporting the directory and ProxyFS isn't presenting it. In any event, it's a handy quick check to make sure things are active.
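
For example (mount path taken from your df output):

stat -c 'inode=%i  name=%n' /mnt/nfs_proxyfs_mount    # expect inode=1 for the ProxyFS root directory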

I'm gonna close this response as you've posted a follow-up... so I'll respond to that one in a separate post.

edmc-ss commented 5 years ago

Hello again jnamdar,

Your speculation about the use of the "NoAuth" Swift Proxy versus the "normal" Swift Proxy is totally correct! Indeed, your question about "who provides authorization" is very key. As you can imagine, the world of SMB and NFS (distinctly different between them as well) is entirely different than for Swift & S3 API (also distinctly different between them as luck would have it). To put it vaguely, it is sometimes the protocol server and sometimes the file system and sometimes the Swift or S3 API pipelines that provide authorization. Let's talk about each:

SMB:

NFS:

Swift API:

S3 API:

As you can well imagine, the impedance mismatch between access control among these four protocols is tremendous. What we generally tell our customers is that they should apply per-protocol access control and not rely upon any reasonable "mapping" between them. In other words, don't expect a "chmod" to alter the access control for users coming in via the Swift or S3 API.

Hope the above makes some sense. I'll stop this post here to respond to your remaining topics in a subsequent post.

edmc-ss commented 5 years ago

Keystone Auth would be an awesome addition... though I don't understand how it might apply to NFS. As mentioned previously, the authorization strategy of NFSv3 is to entirely trust the "client". So let me just "punt" on that one.

As for SMB, the protocol is very rich as it turns out. With SMB, it's the Server (i.e. Samba in this case) that provides the necessary Auth. SMB supports something called SPNEGO... an acronym for "Simple and Protected GSSAPI Negotiation Mechanism". SPNEGO supports all kinds of "plug in" Auth mechanisms spanning from the very old Lan Manager ("LM") all the way up to Kerberos (as implemented by Active Directory btw). While this is way out of my skill set, it should be possible to provide an Auth plug-in for Keystone... and I'd be surprised if somebody hasn't done that already. To my understanding, Keystone is well within the capabilities of the SPNEGO Auth processing model. I would solicit help from the Samba folks on that one...

edmc-ss commented 5 years ago

Re the error you are getting from the "normal" Swift Proxy, I'm just curious what your "pfs" middleware filter section looks like. It should be something like this:

[filter:pfs]
use = egg:pfs_middleware#pfs
proxyfsd_host = 127.0.0.1
proxyfsd_port = 12345
bypass_mode = read-write

Notice that the port# is that same "12345" that is used by Samba's vfs/jrpcclient library. It uses the very same JSON-RPC mechanism to communicate with the ProxyFS instance (proxyfsd). The "host" should be any one of the PrivateIPAddr's that any of your ProxyFS instances are listening on. Indeed, it can be a list :-). What happens is that the pfs_middleware determines (from the Account's Header check) that the Account is being managed by ProxyFS and asks proxyfsd_host:proxyfsd_port which ProxyFS instance is actually managing the Account/Volume/FileSystem at this point in time. Kind of like a redirect.

What I suspect you may be seeing is that your "normal" Swift Proxy cannot communicate with the desired ProxyFS instance... not sure though.
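
A quick check from the node running the "normal" Swift Proxy (substitute the proxyfsd_host value from your [filter:pfs] section):

nc -vz <proxyfsd_host> 12345    # should report the port as open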

edmc-ss commented 5 years ago

It's a bit unfortunate that your starting point is ProxyFS/saio (or the "runway" equivalent), as that setup is trivially straightforward. I'm suspecting that your cluster's topology is quite a bit more complicated than just "localhost" :-). Perhaps you could focus on the PrivateIPAddr discussion above and see about making connections.

jnamdar commented 5 years ago

I agree; the saio environment seems especially useful for test/development purposes, but it's the only way I found to install ProxyFS. Is there another installation process I could look at that does not put everything on a single machine?

Regards

edmc-ss commented 5 years ago

I've actually made a start on a branch called "ArmbianExample" that will ultimately network 3 little ODroid HC1's into a small Swift cluster... with ProxyFS running on it in an HA arrangement.

Beyond that, I guess I'd be remiss if I didn't plug SwiftStack's actual product. You can try that out and see what it does as well... It certainly "wizards" a lot of the setup we've been talking about. Happy to help you come up to speed on that product :-).