Azure / azure-storage-fuse

A virtual file system adapter for Azure Blob storage

Blobfuse2 mount sftp for blob storage #1043

Closed jmjclaessen closed 1 year ago

jmjclaessen commented 1 year ago

Which version of blobfuse was used?

Blobfuse2 2.0.1

Which OS distribution and version are you using?

Red Hat Enterprise Linux 8.4, Linux 4.18.0-305.72.1.el8_4.x86_64

If relevant, please share your mount command.

blobfuse2 mount /blob/sftp/carlo --config-file=/blob/config/config.yaml

What was the issue encountered?

Customers use SFTP for Blob Storage to upload files to a storage account. We want to mount that storage with blobfuse2 so we can process the files on it. The config.yaml file is attached. SFTP for Blob Storage uses an Azure Data Lake Storage Gen2 account. If I mount with type: block, the mount succeeds but I cannot see the folder structure. If I change only the type to adls (because the account has a hierarchical namespace), I get an authentication error: LOG_ERR [mount.go (368)]: mount : failed to initialize new pipeline [failed to authenticate credentials for azstorage]. The only change made is from type: block to type: adls.

Have you found a mitigation/solution?

No

Please share logs if available.

config yaml.txt blobfuse2 mount error.txt blobfuse command.txt

vibhansa-msft commented 1 year ago

Your config looks fine to me. You are using an account key as the credential to authenticate with Azure Storage, which should be fine. If it is an HNS account, add "type: adls" to the "azstorage" section of your config file. Also, you can skip defining "endpoint" if the storage account is not behind a private endpoint. According to the error logs, the service is saying "you are not authorized to perform this operation". Please check your credentials and, if possible, validate the connection without blobfuse, for example with the Azure CLI or some other means, just to confirm that authentication passes.

jmjclaessen commented 1 year ago

Removed the endpoint. Attached are the two config files. If I use type: adls, I get: Error: failed to initialize new pipeline [failed to authenticate credentials for azstorage]. If I change only the type to type: block, the mount succeeds. Both files have the same account-name and account-key. Because they work with type: block, I don't think there is an issue with the account name or account key. The SFTP server for Blob Storage uses a private endpoint for internal connections and allowed IP addresses in the built-in storage account firewall for internet (external) connections. blobfuse block working.txt blobfuse2 adls not working.txt

vibhansa-msft commented 1 year ago

If you are using private endpoints to connect to the storage account, ensure that both the dfs and blob endpoints are exposed over private endpoints. Even for HNS accounts, uploads and downloads from blobfuse always go over the blob endpoint to ensure cross-protocol compatibility. Most likely your dfs endpoint is not available over a private endpoint.

vibhansa-msft commented 1 year ago

Also, if you are using private endpoints, you need to specify the endpoint explicitly in your config.

jmjclaessen commented 1 year ago

New config file: dfs.txt. I created an entry called sftpclsp00013520001.privatelink.dfs.core.windows.net in private DNS and added it as the endpoint in the config file. sftpclsp00013520001.privatelink.dfs.core.windows.net resolves correctly to the internal IP address, and a wget for https://sftpclsp00013520001.privatelink.dfs.core.windows.net works: I get a connection on port 443. When I mount, it takes a long time, but in the end I get the same error: Error: failed to initialize new pipeline [failed to authenticate credentials for azstorage]. According to https://learn.microsoft.com/en-us/azure/data-explorer/kusto/api/connection-strings/storage-connection-strings the URL should be https://StorageAccountName.dfs.core.windows.net/Filesystem[/PathToDirectoryOrFile][CallerCredentials], but I have containers, and according to Azure they can be accessed using https://sftpclsp00013520001.blob.core.windows.net/carlo

vibhansa-msft commented 1 year ago

Can you try wget on the blob endpoint as well? blobfuse2 will try both the dfs and the blob endpoint, and if either of them fails, the mount will fail. If the blob endpoint works with wget, share the debug logs again with the corrected config (endpoint set to the dfs private endpoint).
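The check suggested above can also be scripted. The following is a minimal sketch, not part of blobfuse2: it derives the blob and dfs hostnames from an account name and attempts a verified TLS handshake against each one. The function names `endpoint_hosts`, `check_tls` and `report` are hypothetical. Unlike a bare port check, a verified handshake also fails when the server presents a certificate that does not cover the hostname.

```python
import socket
import ssl


def endpoint_hosts(account: str) -> dict:
    """Return the standard blob and dfs hostnames for a storage account."""
    return {svc: f"{account}.{svc}.core.windows.net" for svc in ("blob", "dfs")}


def check_tls(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Open a TCP connection, run a verified TLS handshake, and describe the result."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "ok"
    except ssl.SSLCertVerificationError as exc:
        # Raised, for example, when the certificate does not cover the hostname.
        return f"certificate error: {exc.verify_message}"
    except OSError as exc:
        # Covers DNS failures, refused connections, timeouts and other TLS errors.
        return f"connection error: {exc}"


def report(account: str) -> None:
    """Print one line per endpoint; both must be 'ok' for an adls mount."""
    for service, host in endpoint_hosts(account).items():
        print(f"{service}: {host} -> {check_tls(host)}")
```

Calling `report("sftpclsp00013520001")` from the machine that runs blobfuse2 prints the outcome for both endpoints, so a failure on only one of them is easy to spot.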

jmjclaessen commented 1 year ago

wget to https://sftpclsp00013520001.privatelink.dfs.core.windows.net and wget to https://sftpclsp00013520001.privatelink.blob.core.windows.net/ both work. blobfuse2.log blobfuse2-rest.log

vibhansa-msft commented 1 year ago

Jan 30 04:01:46 vm-0001352-0007 blobfuse2[20943]: 2023/01/30 04:01:46 ==> REQUEST/RESPONSE (Try=5/122.48753ms, OpTime=1m47.831912613s) -- REQUEST ERROR
 GET https://sftpclsp00013520001.privatelink.dfs.core.windows.net/carlo?maxResults=2&recursive=false&resource=filesystem&timeout=901
 Authorization: REDACTED
 User-Agent: [Azure-Storage-Fuse/2.0.1 (Red Hat Enterprise Linux 8.4 (Ootpa)) Azure-Storage/0.1 (go1.16.2; linux)]
 X-Ms-Client-Request-Id: [9fdaee59-30f4-4ded-700d-7b5e3a560ed4]
 X-Ms-Version: [2018-11-09]
 x-ms-date: [Mon, 30 Jan 2023 11:01:46 GMT]
 --------------------------------------------------------------------------------
 ERROR:
-> github.com/Azure/azure-pipeline-go/pipeline.NewError, /home/cloudtest/go/pkg/mod/github.com/!azure/azure-pipeline-go@v0.2.3/pipeline/error.go:157
HTTP request failed

Get "https://sftpclsp00013520001.privatelink.dfs.core.windows.net/carlo?maxResults=2&recursive=false&resource=filesystem&timeout=901": x509: certificate is valid for *.blob.core.windows.net, *.mwh20prdstr01a.store.core.windows.net, *.blob.storage.azure.net, *.z1.blob.storage.azure.net, *.z2.blob.storage.azure.net, *.z3.blob.storage.azure.net, *.z4.blob.storage.azure.net, *.z5.blob.storage.azure.net, *.z6.blob.storage.azure.net, *.z7.blob.storage.azure.net, *.z8.blob.storage.azure.net, *.z9.blob.storage.azure.net, *.z10.blob.storage.azure.net, *.z11.blob.storage.azure.net, *.z12.blob.storage.azure.net, *.z13.blob.storage.azure.net, *.z14.blob.storage.azure.net, *.z15.blob.storage.azure.net, *.z16.blob.storage.azure.net, *.z17.blob.storage.azure.net, *.z18.blob.storage.azure.net, *.z19.blob.storage.azure.net, *.z20.blob.storage.azure.net, *.z21.blob.storage.azure.net, *.z22.blob.storage.azure.net, *.z23.blob.storage.azure.net, *.z24.blob.storage.azure.net, *.z25.blob.storage.azure.net, *.z26.blob.storage.azure.net, *.z27.blob.storage.azure.net, *.z28.blob.storage.azure.net, *.z29.blob.storage.azure.net, *.z30.blob.storage.azure.net, *.z31.blob.storage.azure.net, *.z32.blob.storage.azure.net, *.z33.blob.storage.azure.net, *.z34.blob.storage.azure.net, *.z35.blob.storage.azure.net, *.z36.blob.storage.azure.net, *.z37.blob.storage.azure.net, *.z38.blob.storage.azure.net, *.z39.blob.storage.azure.net, *.z40.blob.storage.azure.net, *.z41.blob.storage.azure.net, *.z42.blob.storage.azure.net, *.z43.blob.storage.azure.net, *.z44.blob.storage.azure.net, *.z45.blob.storage.azure.net, *.z46.blob.storage.azure.net, *.z47.blob.storage.azure.net, *.z48.blob.storage.azure.net, *.z49.blob.storage.azure.net, *.z50.blob.storage.azure.net, not sftpclsp00013520001.privatelink.dfs.core.windows.net

vibhansa-msft commented 1 year ago

This appears to be some issue with the SSL handshake as the certificate is not valid for the DFS endpoint.
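To illustrate why the handshake rejects this hostname, here is a small sketch of certificate name matching. It assumes the names listed in the log are wildcard entries such as `*.blob.core.windows.net` (the asterisks appear to have been lost when the log was pasted), and it implements only the common rule that a wildcard covers exactly one leftmost label. The helper names `san_matches` and `covered` are hypothetical and this is not the code path Go's TLS library actually runs.

```python
def san_matches(hostname: str, pattern: str) -> bool:
    """Match a hostname against one certificate name.

    A leading '*' stands for exactly one label, so it never spans a dot.
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    pattern_labels = pattern.lower().rstrip(".").split(".")
    if len(host_labels) != len(pattern_labels):
        return False  # a wildcard cannot absorb extra labels
    if pattern_labels[0] == "*":
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


def covered(hostname: str, patterns: list) -> bool:
    """Return True when any certificate name covers the hostname."""
    return any(san_matches(hostname, pattern) for pattern in patterns)


# A subset of the names from the log, with the wildcard restored (assumption).
CERT_NAMES = ["*.blob.core.windows.net", "*.blob.storage.azure.net"]
```

Under these assumptions, `covered("sftpclsp00013520001.blob.core.windows.net", CERT_NAMES)` is true, while the dfs hostname and any hostname with an extra `privatelink` label are not covered, which matches the x509 error in the log.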

vibhansa-msft commented 1 year ago

As this account is behind a private endpoint, is your authentication happening through some private AAD as well? Please check the configuration there, as this most likely is an auth issue.

jmjclaessen commented 1 year ago

Did some testing. If I change the config.yaml file to type: block and endpoint: https://sftpclsp00013520001.blob.core.windows.net/ it works. If I enter type: block and endpoint: https://sftpclsp00013520001.privatelink.blob.core.windows.net/ I get: Failed to validate storage account [failed to authenticate credentials for azstorage]. So for adls it would probably be type: adls and endpoint: https://sftpclsp00013520001.dfs.core.windows.net/? But then I also get: Failed to validate storage account [failed to authenticate credentials for azstorage]. So I did some nslookups on the server; see the results below. It looks like the dfs entry resolves to the public address 20.60.153.226 and the blob entry to the internal address 10.142.148.10. Public access is disabled. I can create an alias in privatelink.dfs.core.windows.net but not in dfs.core.windows.net. Is this something I need to raise with Azure support?

[root@vm-0001352-0007 config]# nslookup sftpclsp00013520001.dfs.core.windows.net
Server: 10.142.4.10
Address: 10.142.4.10#53

Non-authoritative answer:
sftpclsp00013520001.dfs.core.windows.net canonical name = dfs.mwh20prdstr01a.store.core.windows.net.
Name: dfs.mwh20prdstr01a.store.core.windows.net
Address: 20.60.153.226

[root@vm-0001352-0007 config]# nslookup sftpclsp00013520001.blob.core.windows.net
Server: 10.142.4.10
Address: 10.142.4.10#53

Non-authoritative answer:
sftpclsp00013520001.blob.core.windows.net canonical name = sftpclsp00013520001.privatelink.blob.core.windows.net.
Name: sftpclsp00013520001.privatelink.blob.core.windows.net
Address: 10.142.148.10
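The nslookup results can be read mechanically: with public access disabled, every endpoint the mount uses should resolve to a private address. The sketch below uses the two addresses reported in this comment as literal input; the names `is_private`, `public_endpoints` and `LOOKUPS` are hypothetical.

```python
import ipaddress


def is_private(address: str) -> bool:
    """Return True for addresses in private ranges such as 10.0.0.0/8."""
    return ipaddress.ip_address(address).is_private


def public_endpoints(lookups: dict) -> list:
    """Return the hostnames whose resolved address is not private."""
    return [host for host, address in lookups.items() if not is_private(address)]


# Hostname -> address, copied from the nslookup output in this thread.
LOOKUPS = {
    "sftpclsp00013520001.dfs.core.windows.net": "20.60.153.226",
    "sftpclsp00013520001.blob.core.windows.net": "10.142.148.10",
}

print(public_endpoints(LOOKUPS))
```

Only the dfs hostname is reported here, which points at a missing private endpoint or private DNS record for the dfs sub-resource. For a live check, the literal addresses could be replaced with the result of `socket.gethostbyname(host)`.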

vibhansa-msft commented 1 year ago

@jmjclaessen: you need correct DNS resolution in place for the mount to work properly. You can reach out to your network team to understand which part is missing. If your account is HNS, "type: adls" and "endpoint: .dfs." is the configuration to use. However, based on my earlier look at the error, I still suspect there is an authentication-related issue as well. The error clearly said the certificate is not valid for the endpoint you are specifying. If the account is backed by a private endpoint, then ideally .blob.core.windows.net should not work unless you have some DNS resolution that leads to the actual private endpoint.
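As a rough illustration of the pairing described above (adls with the dfs endpoint, block with the blob endpoint), the sketch below builds an "azstorage" section and prints it as YAML-style text. It uses only the keys mentioned in this thread (type, account-name, account-key, endpoint); a real config.yaml contains more settings, and the helper names and the placeholder key are hypothetical.

```python
def azstorage_section(account: str, account_key: str, hns: bool) -> dict:
    """Build the azstorage keys discussed in this thread for one account."""
    service = "dfs" if hns else "blob"  # HNS accounts use the dfs endpoint
    return {
        "type": "adls" if hns else "block",
        "account-name": account,
        "account-key": account_key,
        "endpoint": f"https://{account}.{service}.core.windows.net",
    }


def to_yaml(section: dict) -> str:
    """Render the section as simple YAML-style text (flat string values only)."""
    lines = ["azstorage:"]
    lines += [f"  {key}: {value}" for key, value in section.items()]
    return "\n".join(lines)


# "<account-key>" is a placeholder, not a real credential.
print(to_yaml(azstorage_section("sftpclsp00013520001", "<account-key>", hns=True)))
```

The point of generating both values from the same flag is that the type and the endpoint cannot drift apart, which is the mismatch the earlier attempts in this thread ran into.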

vibhansa-msft commented 1 year ago

Can you go through this document describing private endpoints and validate that your setup is correct? https://learn.microsoft.com/en-us/azure/storage/common/storage-private-endpoints

jmjclaessen commented 1 year ago

Creating an additional private endpoint for sftpclsp00013520001.dfs.core.windows.net, as suggested by Azure support, solved the issue. The drive is now mounted. The issue can be closed.

vibhansa-msft commented 1 year ago

Great, thanks @jmjclaessen. Can you share here how you created the dfs private endpoint? It might help other users with a similar issue.

jmjclaessen commented 1 year ago

Creating a second private endpoint for dfs in the Azure portal:

1. Go to your storage account -> Networking -> Private endpoint connections.
2. Click + Private endpoint and fill in Subscription, Resource group, Name, Network Interface Name and Region.
3. Click Next and, under Target sub-resource, select dfs.
4. Click Virtual network and select the virtual network and subnet.
5. Click DNS and select Yes for Integrate with private DNS.
6. Select the subscription and resource group for your private link DNS.
7. Select Next, Next, and then Create.

jmjclaessen commented 1 year ago

Vikas,

Added instructions to create dfs private endpoint.


vibhansa-msft commented 1 year ago

@jmjclaessen: Thanks for sharing this. I will add this info to the TSG in our wiki; I appreciate your help. Was the "x509: certificate is valid for .blob.core.windows.net" error also resolved by this, or was there some other reason for that error?

jmjclaessen commented 1 year ago

Vikas,

The x509 error was also caused by the dfs endpoint not being reachable.
