Closed martintoreilly closed 3 years ago
@DavidBeavan @jamespjh Can the Safe Haven environment and data for Living with Machines be hosted in the US?
@martintoreilly Decide what azure region to use 487 on LwM GitHub says 'UK South should be our default region of choice'
Given the personal info we're dealing with (there will be some living people in the data if >100 yrs old), is there a benefit from being within the EU/UK?
There’s mild risk here if Privacy Shield falls, but otherwise it’s fine, I think.
Can we lobby Kenji to get the relevant features deployed to the UK instance, especially now MSR staff are going to be contributing to the project?
I'm afraid I don't know about the EU/UK vs US situation with any certainty. I did some brief googling on GDPR requirements.
@DavidBeavan Considering the other potential options for getting Azure Storage working inside a safe haven now, what value does the project place on having this capability, considering both the value of the pure storage and the fact it is a critical enabler to support HDInsight clusters like Spark?
The base cost of a safe haven secure environment is currently around $500 / month + any non-trivial compute (though we hope to lower this to around $250 in future). The least complex Azure Firewall option will add an additional $900 / month to this bill. The more complex SQUID proxy option will add around $120 / month to this bill. How acceptable are either of these costs and, if they could be supported in the short term while we wait for availability of Virtual Network Service Endpoint Policies in Europe / UK, how long could these additional costs be supported?
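To make the trade-off concrete, the figures above can be tallied per option. This is a minimal sketch using only the approximate monthly figures quoted above; the six-month horizon and the helper name `total_cost` are illustrative assumptions rather than project decisions, and compute costs are excluded.

```shell
#!/usr/bin/env bash
# Approximate monthly figures (USD) quoted in the comment above.
BASE=500        # current safe haven base cost, excluding non-trivial compute
FIREWALL=900    # additional cost of the Azure Firewall option
SQUID=120       # additional cost of the SQUID proxy option

# total_cost BASE EXTRA MONTHS -> prints the total cost over MONTHS months.
# (Hypothetical helper name, for illustration only.)
total_cost() {
    local base=$1 extra=$2 months=$3
    echo $(( (base + extra) * months ))
}

MONTHS=6   # assumed horizon; the thread does not fix one
echo "Azure Firewall option: \$$(total_cost "$BASE" "$FIREWALL" "$MONTHS") over $MONTHS months"
echo "SQUID proxy option:    \$$(total_cost "$BASE" "$SQUID" "$MONTHS") over $MONTHS months"
```

On these numbers the Azure Firewall option comes to roughly $1,400 per month and the SQUID proxy option to roughly $620 per month, before compute.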
@ktakeda1 @trallard Do you have any insight into when we can expect Virtual Network Service Endpoint Policies to (i) be available in Europe / UK regions and (ii) be generally available rather than in preview? Can either of you apply any pressure within Microsoft to bring the timeline for Europe / UK availability forward?
I'm scoping this out with @claireaustin01 of LwM, she knows the content of various data agreements with providers and if they comment on geographic location.
More green light, @claireaustin01 of LwM says:
So far to the best of my knowledge there has been no mention of geography in agreements but that’s not to say there won’t be.
I can check directly with the product team about the ETA. Not sure if the date can be moved forward tbh, as it all depends on the general roadmap the team has planned for the upcoming releases.
Look at mounting Azure file storage over SMB with AAD authentication. Also look at Blob FUSE.
If you could check on the ETA for preview + general availability in European and UK regions, that would be great, thanks.
From @martintoreilly in #407
For full read-write, I think we should probably be looking at mounting Azure file storage with SMB + Kerberos + Azure Active Directory Domain Services.
Currently Azure File storage is much slower than blob storage for some common access patterns, but (i) premium file storage is available (I assume at a correspondingly higher cost) and (ii) it looks like the baseline performance is significantly improving (see File storage scale targets)
From here it looks as though you can now use private endpoints to get a private IP address for the Azure storage account.
Private Endpoint for within-VNET Azure SQL server exists within the default subnet for VNET, along with the within-VNET Azure VM.
Outside-VNET SQL server (`paas-in-vnet-test-westeurope-not-in-vnet-dbserver`): access permitted from (i) laptop at Martin's home (ii) within-VNET Azure VM (iii) outside-VNET Azure VM. No private endpoint, so no access permitted from VNET except via internet.
Within-VNET SQL server (`paas-in-vnet-test-westeurope-dbserver`): no allowed IP addresses, so no inbound connections from internet permitted. Private endpoint exists to allow access from VNET.
atiadmin@paas-in-vnet-uksouth-vm:~$ echo "NSG rule ALLOWS Internet Out"
NSG rule ALLOWS Internet Out
atiadmin@paas-in-vnet-uksouth-vm:~$ ping -c2 google.com
PING google.com (216.58.211.174) 56(84) bytes of data.
64 bytes from dub08s01-in-f174.1e100.net (216.58.211.174): icmp_seq=1 ttl=53 time=2.52 ms
64 bytes from dub08s01-in-f174.1e100.net (216.58.211.174): icmp_seq=2 ttl=53 time=2.50 ms
--- google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 2.507/2.516/2.525/0.009 ms
atiadmin@paas-in-vnet-uksouth-vm:~$ mssql-cli -S paas-in-vnet-test-westeurope-not-in-vnet-dbserver.database.windows.net -d paas-in-vnet-test-westeurope-not-in-vnet-db -U mtadmin -P <password-redacted>
paas-in-vnet-test-westeurope-not-in-vnet-db> SELECT TOP 5 * FROM SYSOBJECTS WHERE xtype = 'U'
Time: 0.704s
+--------------------------------+------------+---------+-------+--------+----------+---------------
| name | id | xtype | uid | info | status | base_schema_ve
|--------------------------------+------------+---------+-------+--------+----------+---------------
| Customer | 1253579504 | U | 5 | 0 | 0 | 0
| ProductModel | 1301579675 | U | 5 | 0 | 0 | 0
| ProductDescription | 1349579846 | U | 5 | 0 | 0 | 0
| Product | 1381579960 | U | 5 | 0 | 0 | 0
| ProductModelProductDescription | 1413580074 | U | 5 | 0 | 0 | 0
+--------------------------------+------------+---------+-------+--------+----------+---------------
(5 rows affected)
paas-in-vnet-test-westeurope-not-in-vnet-db> exit
atiadmin@paas-in-vnet-uksouth-vm:~$ mssql-cli -S paas-in-vnet-test-westeurope-dbserver.database.windows.net -d paas-in-vnet-test-westeurope-db -U mtadmin -P <password-redacted>
paas-in-vnet-test-westeurope-db> SELECT TOP 5 * FROM SYSOBJECTS WHERE xtype = 'U'
Time: 0.854s
+--------------------------------+------------+---------+-------+--------+----------+---------------
| name | id | xtype | uid | info | status | base_schema_ve
|--------------------------------+------------+---------+-------+--------+----------+---------------
| Customer | 1253579504 | U | 5 | 0 | 0 | 0
| ProductModel | 1301579675 | U | 5 | 0 | 0 | 0
| ProductDescription | 1349579846 | U | 5 | 0 | 0 | 0
| Product | 1381579960 | U | 5 | 0 | 0 | 0
| ProductModelProductDescription | 1413580074 | U | 5 | 0 | 0 | 0
+--------------------------------+------------+---------+-------+--------+----------+---------------
(5 rows affected)
paas-in-vnet-test-westeurope-db> exit
atiadmin@paas-in-vnet-uksouth-vm:~$
atiadmin@paas-in-vnet-uksouth-vm:~$ echo "NSG rule DENIES Internet Out"
NSG rule DENIES Internet Out
atiadmin@paas-in-vnet-uksouth-vm:~$ ping -c2 google.com
PING google.com (216.58.210.238) 56(84) bytes of data.
--- google.com ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1014ms
atiadmin@paas-in-vnet-uksouth-vm:~$ mssql-cli -S paas-in-vnet-test-westeurope-not-in-vnet-dbserver.database.windows.net -d paas-in-vnet-test-westeurope-not-in-vnet-db -U mtadmin -P <password-redacted>
Error message: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 40 - Could not open a connection to SQL Server)
atiadmin@paas-in-vnet-uksouth-vm:~$ mssql-cli -S paas-in-vnet-test-westeurope-dbserver.database.windows.net -d paas-in-vnet-test-westeurope-db -U mtadmin -P <password-redacted>
paas-in-vnet-test-westeurope-db> SELECT TOP 5 * FROM SYSOBJECTS WHERE xtype = 'U'
Time: 1.355s (a second)
+--------------------------------+------------+---------+-------+--------+----------+-----------------
| name | id | xtype | uid | info | status | base_schema_ver
|--------------------------------+------------+---------+-------+--------+----------+-----------------
| Customer | 1253579504 | U | 5 | 0 | 0 | 0
| ProductModel | 1301579675 | U | 5 | 0 | 0 | 0
| ProductDescription | 1349579846 | U | 5 | 0 | 0 | 0
| Product | 1381579960 | U | 5 | 0 | 0 | 0
| ProductModelProductDescription | 1413580074 | U | 5 | 0 | 0 | 0
+--------------------------------+------------+---------+-------+--------+----------+-----------------
(5 rows affected)
paas-in-vnet-test-westeurope-db> exit
atiadmin@paas-in-vnet-uksouth-vm:~$
MAC-ATI0132:~ moreilly$ mssql-cli -S paas-in-vnet-test-westeurope-not-in-vnet-dbserver.database.windows.net -d paas-in-vnet-test-westeurope-not-in-vnet-db -U mtadmin -P <password-redacted>
paas-in-vnet-test-westeurope-not-in-vnet-db> SELECT TOP 5 * FROM SYSOBJECTS WHERE xtype = 'U'
Time: 0.744s
+--------------------------------+------------+---------+-------+--------+----------+-----------------
| name | id | xtype | uid | info | status | base_schema_ver
|--------------------------------+------------+---------+-------+--------+----------+-----------------
| Customer | 1253579504 | U | 5 | 0 | 0 | 0
| ProductModel | 1301579675 | U | 5 | 0 | 0 | 0
| ProductDescription | 1349579846 | U | 5 | 0 | 0 | 0
| Product | 1381579960 | U | 5 | 0 | 0 | 0
| ProductModelProductDescription | 1413580074 | U | 5 | 0 | 0 | 0
+--------------------------------+------------+---------+-------+--------+----------+-----------------
(5 rows affected)
paas-in-vnet-test-westeurope-not-in-vnet-db> exit
MAC-ATI0132:~ moreilly$ mssql-cli -S paas-in-vnet-test-westeurope-dbserver.database.windows.net -d paas-in-vnet-test-westeurope-db -U mtadmin -P <password-redacted>
Error message: Cannot open server 'paas-in-vnet-test-westeurope-dbserver' requested by the login. Client with IP address '90.240.138.110' is not allowed to access the server. To enable access, use the Windows Azure Management Portal or run sp_set_firewall_rule on the master database to create a firewall rule for this IP address or address range. It may take up to five minutes for this change to take effect.
MAC-ATI0132:~ moreilly$
atiadmin@paas-in-vnet-test-uk-south-open-vm:~$ mssql-cli -S paas-in-vnet-test-westeurope-not-in-vnet-dbserver.database.windows.net -d paas-in-vnet-test-westeurope-not-in-vnet-db -U mtadmin -P <password-redacted>
paas-in-vnet-test-westeurope-not-in-vnet-db> SELECT TOP 5 * FROM SYSOBJECTS WHERE xtype = 'U'
Time: 2.177s (2 seconds)
+--------------------------------+------------+---------+-------+--------+----------+-----------------
| name | id | xtype | uid | info | status | base_schema_ver
|--------------------------------+------------+---------+-------+--------+----------+-----------------
| Customer | 1253579504 | U | 5 | 0 | 0 | 0
| ProductModel | 1301579675 | U | 5 | 0 | 0 | 0
| ProductDescription | 1349579846 | U | 5 | 0 | 0 | 0
| Product | 1381579960 | U | 5 | 0 | 0 | 0
| ProductModelProductDescription | 1413580074 | U | 5 | 0 | 0 | 0
+--------------------------------+------------+---------+-------+--------+----------+-----------------
(5 rows affected)
paas-in-vnet-test-westeurope-not-in-vnet-db> exit
atiadmin@paas-in-vnet-test-uk-south-open-vm:~$ mssql-cli -S paas-in-vnet-test-westeurope-dbserver.database.windows.net -d paas-in-vnet-test-westeurope-db -U mtadmin -P <password-redacted>
Error message: Cannot open server 'paas-in-vnet-test-westeurope-dbserver' requested by the login. Client with IP address '13.73.148.56' is not allowed to access the server. To enable access, use the Windows Azure Management Portal or run sp_set_firewall_rule on the master database to create a firewall rule for this IP address or address range. It may take up to five minutes for this change to take effect.
atiadmin@paas-in-vnet-test-uk-south-open-vm:~$
👆 @jemrobinson @ens-brett-todd I have tested the above configuration for restricting access to an Azure SQL PaaS to be only from VMs within a particular VNET Subnet via a Private Endpoint 👆
@thobson88 is checking that this approach also works for Azure Storage (I expect it will, as private endpoints are also a configuration option for Azure Storage).
i.e. following the same steps as above, but this time for storage accounts.
Private Endpoint for within-VNET Azure storage account exists within the default subnet for VNET, along with the within-VNET Azure VM.
Outside-VNET storage account (`paasinvnettestwesaoutvn`): access permitted from (i) within-VNET Azure VM (ii) outside-VNET Azure VM (iii) laptop at Tim's home. No private endpoint, so no access permitted from VNET except via internet.
Note: Unlike for the SQL server case above, these rules can't be assigned names. Also, the IP addresses of the VNET VMs have changed as they're assigned dynamically when restarted.
Within-VNET storage account (`paasinvnettestwesainvn`): no allowed IP addresses, so no inbound connections from internet permitted. Private endpoint exists to allow access from VNET.
Note: Private endpoints to storage accounts are sub-resource specific (blob, file, table or queue). This one is for blob storage.
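A quick way to confirm from a VM that a storage hostname is being served through a private endpoint is to check whether it resolves to a private (RFC 1918) address rather than a public one. In the tests in this thread, the account with a private endpoint resolves to 10.2.0.6 from inside the VNET, while the account without one resolves to 40.68.176.16. The sketch below is an illustration only: `is_private_ipv4` is a hypothetical helper, and it checks just the three RFC 1918 ranges.

```shell
#!/usr/bin/env bash
# is_private_ipv4 ADDR -> exit status 0 if ADDR is in an RFC 1918 range
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), 1 otherwise.
is_private_ipv4() {
    local ip=$1 a b c d octet
    IFS=. read -r a b c d <<< "$ip"
    # Reject anything that is not four numeric octets in the range 0-255.
    for octet in "$a" "$b" "$c" "$d"; do
        [[ $octet =~ ^[0-9]+$ ]] && (( octet <= 255 )) || return 1
    done
    (( a == 10 )) && return 0
    (( a == 172 && b >= 16 && b <= 31 )) && return 0
    (( a == 192 && b == 168 )) && return 0
    return 1
}

# Example with the two addresses seen in the nslookup output in this thread.
for addr in 10.2.0.6 40.68.176.16; do
    if is_private_ipv4 "$addr"; then
        echo "$addr: private (consistent with a private endpoint)"
    else
        echo "$addr: public"
    fi
done
```

This only checks name resolution; it does not show that the storage firewall actually rejects traffic arriving over the public endpoint, which the connectivity tests cover.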
- Test connection from *inside-vnet with outbound internet access ALLOWED* VM `paas-in-vnet-uksouth-vm` to *outside-vnet* storage account `paasinvnettestwesaoutvn`:
thobson@paas-in-vnet-uksouth-vm:~$ echo "NSG rule ALLOWS Internet Out"
NSG rule ALLOWS Internet Out
thobson@paas-in-vnet-uksouth-vm:~$ ping -c2 google.com
PING google.com (216.58.213.14) 56(84) bytes of data.
64 bytes from lhr25s25-in-f14.1e100.net (216.58.213.14): icmp_seq=1 ttl=53 time=1.71 ms
64 bytes from lhr25s25-in-f14.1e100.net (216.58.213.14): icmp_seq=2 ttl=53 time=1.92 ms
--- google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 1.713/1.820/1.927/0.107 ms
thobson@paas-in-vnet-uksouth-vm:~$ nslookup paasinvnettestwesaoutvn.blob.core.windows.net
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
paasinvnettestwesaoutvn.blob.core.windows.net canonical name = blob.am5prdstr05a.store.core.windows.net.
Name: blob.am5prdstr05a.store.core.windows.net
Address: 40.68.176.16
Download & install azcopy:
thobson@paas-in-vnet-uksouth-vm:~$ wget https://aka.ms/downloadazcopy-v10-linux
thobson@paas-in-vnet-uksouth-vm:~$ tar xvzf downloadazcopy-v10-linux
thobson@paas-in-vnet-uksouth-vm:~$ alias azcopy='azcopy_linux_amd64_10.3.4/azcopy'
Login and download a file (previously uploaded via the portal):
thobson@paas-in-vnet-uksouth-vm:~$ azcopy login
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code
Job 807b1477-3514-bc41-71b6-a6506406fa35 has started
Log file is located at: /home/thobson/.azcopy/807b1477-3514-bc41-71b6-a6506406fa35.log
0.0 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total,
Job 807b1477-3514-bc41-71b6-a6506406fa35 summary
Elapsed Time (Minutes): 0.0334
Total Number Of Transfers: 1
Number of Transfers Completed: 1
Number of Transfers Failed: 0
Number of Transfers Skipped: 0
TotalBytesTransferred: 128964
Final Job Status: Completed
thobson@paas-in-vnet-uksouth-vm:~$ head -n 2 01_metadata.csv
COUNTY,SHEET_MAP,SHEET_NO,SHEET,IMAGETHUMB,IMAGEURL,DATES,IMAGE
London,XV.94,"015.94",London XV.94,https://deriv.nls.uk/dcn4/1012/0287/101202873.4.jpg,https://maps.nls.uk/view/101202873,Revised: ca. 1894 - 1895
Published: 1896,"101202873"
- Test connection from *inside-vnet with outbound internet access ALLOWED* VM `paas-in-vnet-uksouth-vm` to *inside-vnet* blob storage `paasinvnettestwesainvn`:
thobson@paas-in-vnet-uksouth-vm:~$ nslookup paasinvnettestwesainvn.blob.core.windows.net
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
paasinvnettestwesainvn.blob.core.windows.net canonical name = paasinvnettestwesainvn.privatelink.blob.core.windows.net.
Name: paasinvnettestwesainvn.privatelink.blob.core.windows.net
Address: 10.2.0.6
- Upload a file (portal upload not possible given NSG rule):
thobson@paas-in-vnet-uksouth-vm:~$ azcopy cp '/home/thobson/01_metadata.csv' 'https://paasinvnettestwesainvn.blob.core.windows.net/sa-blob-public-access-test/01_metadata.csv'
INFO: Scanning...
INFO: Using OAuth token for authentication.
Job b9eb3e9a-3027-9242-589b-312bbfb84bab has started
Log file is located at: /home/thobson/.azcopy/b9eb3e9a-3027-9242-589b-312bbfb84bab.log
INFO: Authentication failed, it is either not correct, or expired, or does not have the correct permission -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, /home/vsts/go/pkg/mod/github.com/!azure/azure-storage-blob-go@v0.7.0/azblob/zc_storage_error.go:42
===== RESPONSE ERROR (ServiceCode=AuthorizationPermissionMismatch) =====
Description=This request is not authorized to perform this operation using this permission.
RequestId:32df7432-501e-005d-3957-0cc8a7000000
...
Job b9eb3e9a-3027-9242-589b-312bbfb84bab summary
Elapsed Time (Minutes): 0.0333
Total Number Of Transfers: 1
Number of Transfers Completed: 0
Number of Transfers Failed: 1
Number of Transfers Skipped: 0
TotalBytesTransferred: 0
Final Job Status: Cancelled
Re-try with a SAS token:
thobson@paas-in-vnet-uksouth-vm:~$ azcopy cp '/home/thobson/01_metadata.csv' 'https://paasinvnettestwesainvn.blob.core.windows.net/sa-blob-public-access-test
Job 86bf834a-55ea-b74c-79db-5733a1c4eb4d has started
Log file is located at: /home/thobson/.azcopy/86bf834a-55ea-b74c-79db-5733a1c4eb4d.log
0.0 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total,
Job 86bf834a-55ea-b74c-79db-5733a1c4eb4d summary
Elapsed Time (Minutes): 0.0334
Total Number Of Transfers: 1
Number of Transfers Completed: 1
Number of Transfers Failed: 0
Number of Transfers Skipped: 0
TotalBytesTransferred: 128964
Final Job Status: Completed
- List files:
thobson@paas-in-vnet-uksouth-vm:~$ azcopy list 'https://paasinvnettestwesainvn.blob.core.windows.net/sa-blob-public-access-test'
INFO: List is using OAuth token for authentication.
failed to traverse container: cannot list blobs. Failed with error -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, /home/vsts/go/pkg/mod/github.com/!azure/azure-storage-blob-go@v0.7.0/azblob/zc_storage_error.go:42
===== RESPONSE ERROR (ServiceCode=AuthorizationPermissionMismatch) =====
Description=This request is not authorized to perform this operation using this permission.
thobson@paas-in-vnet-uksouth-vm:~$ azcopy list 'https://paasinvnettestwesainvn.blob.core.windows.net/sa-blob-public-access-test
## Test from VNET VM with outbound internet access DENIED
### VM network rules
<img width="1403" alt="Screenshot 2020-04-06 at 22 48 49" src="https://user-images.githubusercontent.com/26117394/78608437-d721e780-7858-11ea-9ea8-25b63bb073ad.png">
<img width="1407" alt="Screenshot 2020-04-06 at 22 50 26" src="https://user-images.githubusercontent.com/26117394/78608542-089ab300-7859-11ea-818a-25b0934778b6.png">
### VM storage account connectivity tests
thobson@paas-in-vnet-uksouth-vm:~$ echo "NSG rule DENIES Internet Out"
NSG rule DENIES Internet Out
thobson@paas-in-vnet-uksouth-vm:~$ ping -c2 google.com
PING google.com (216.58.204.14) 56(84) bytes of data.
--- google.com ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1025ms
thobson@paas-in-vnet-uksouth-vm:~$ nslookup paasinvnettestwesaoutvn.blob.core.windows.net
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
paasinvnettestwesaoutvn.blob.core.windows.net canonical name = blob.am5prdstr05a.store.core.windows.net.
Name: blob.am5prdstr05a.store.core.windows.net
Address: 40.68.176.16
thobson@paas-in-vnet-uksouth-vm:~$ rm 01_metadata.csv
thobson@paas-in-vnet-uksouth-vm:~$ azcopy login
Failed to perform login command: failed to login with tenantID "common", Azure directory endpoint "https://login.microsoftonline.com", autorest/adal/devicetoken: -REDACTED- occurred while sending request for Device Authorization Code: Post https://login.microsoftonline.com/common/oauth2/devicecode?api-version=1.0: dial tcp 40.126.1.142:443: i/o timeout
NOTE: If your credential was created in the last 5 minutes, please wait a few minutes and try again.
thobson@paas-in-vnet-uksouth-vm:~$ azcopy cp 'https://paasinvnettestwesaoutvn.blob.core.windows.net/sa-blob-public-access-test/01_metadata.csv' '/home/thobson'
INFO: Scanning...
^C
thobson@paas-in-vnet-uksouth-vm:~$ ls
azcopy_linux_amd64_10.3.4 downloadazcopy-v10-linux
- Test connection from *inside-vnet with outbound internet access DENIED* VM `paas-in-vnet-uksouth-vm` to *inside-vnet* blob storage `paasinvnettestwesainvn`:
Download the test file (using the SAS token):
thobson@paas-in-vnet-uksouth-vm:~$ azcopy cp 'https://paasinvnettestwesainvn.blob.core.windows.net/sa-blob-public-access-test/01_metadata.csv
Job 681c5d69-bacf-1c44-7ea2-9853f73eea67 has started
Log file is located at: /home/thobson/.azcopy/681c5d69-bacf-1c44-7ea2-9853f73eea67.log
0.0 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total,
Job 681c5d69-bacf-1c44-7ea2-9853f73eea67 summary
Elapsed Time (Minutes): 0.0333
Total Number Of Transfers: 1
Number of Transfers Completed: 1
Number of Transfers Failed: 0
Number of Transfers Skipped: 0
TotalBytesTransferred: 128964
Final Job Status: Completed
thobson@paas-in-vnet-uksouth-vm:~$ ls
01_metadata.csv azcopy_linux_amd64_10.3.4 downloadazcopy-v10-linux
thobson@paas-in-vnet-uksouth-vm:~$ head -n 2 01_metadata.csv
COUNTY,SHEET_MAP,SHEET_NO,SHEET,IMAGETHUMB,IMAGEURL,DATES,IMAGE
London,XV.94,"015.94",London XV.94,https://deriv.nls.uk/dcn4/1012/0287/101202873.4.jpg,https://maps.nls.uk/view/101202873,Revised: ca. 1894 - 1895
Published: 1896,"101202873"
## Test from non-Azure computer via public internet
thobson@MAC-ATI0480 temp % ls
thobson@MAC-ATI0480 temp % azcopy cp 'https://paasinvnettestwesaoutvn.blob.core.windows.net/sa-blob-public-access-test/01_metadata[0/24] '.'
INFO: Scanning...
Job 04b1d0fa-7dd5-3943-4e68-f5a15602084b has started
Log file is located at: /Users/thobson/.azcopy/04b1d0fa-7dd5-3943-4e68-f5a15602084b.log
0.0 %, 0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total,
Job 04b1d0fa-7dd5-3943-4e68-f5a15602084b summary
Elapsed Time (Minutes): 0.0334
Total Number Of Transfers: 1
Number of Transfers Completed: 1
Number of Transfers Failed: 0
Number of Transfers Skipped: 0
TotalBytesTransferred: 128964
Final Job Status: Completed
thobson@MAC-ATI0480 temp % ls
01_metadata.csv
thobson@MAC-ATI0480 temp % head -n 2 01_metadata.csv
COUNTY,SHEET_MAP,SHEET_NO,SHEET,IMAGETHUMB,IMAGEURL,DATES,IMAGE
London,XV.94,"015.94",London XV.94,https://deriv.nls.uk/dcn4/1012/0287/101202873.4.jpg,https://maps.nls.uk/view/101202873,Revised: ca. 1894 - 1895
Published: 1896,"101202873"
thobson@MAC-ATI0480 temp % azcopy cp 'https://paasinvnettestwesainvn.blob.core.windows.net/sa-blob-public-access-test/*
RESPONSE Status: 403 This request is not authorized to perform this operation.
Content-Length: [246]
Content-Type: [application/xml]
Date: [Mon, 06 Apr 2020 22:53:39 GMT]
Server: [Microsoft-HTTPAPI/2.0]
X-Ms-Error-Code: [AuthorizationFailure]
X-Ms-Request-Id: [554290ee-401e-006e-5466-0c970c000000]
thobson@MAC-ATI0480 temp % azcopy list 'https://paasinvnettestwesainvn.blob.core.windows.net/sa-blob-public-access-test/*
failed to traverse container: cannot list blobs. Failed with error -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, /Users/runner/go/pkg/mod/github.com/!azure/azure-storage-blob-go@v0.7.0/azblob/zc_storage_error.go:42 ===== RESPONSE ERROR (ServiceCode=AuthorizationFailure) ===== Description=This request is not authorized to perform this operation.
## Test from Azure VM not in VNET
thobson@paas-in-vnet-test-uk-south-open-vm:~$ alias azcopy='azcopy_linux_amd64_10.3.4/azcopy'
thobson@paas-in-vnet-test-uk-south-open-vm:~$ azcopy cp 'https://paasinvnettestwesaoutvn.blob.core.windows.net/sa-blob-public-access-test/01_metadata.csv' '/home/thobson'
INFO: Scanning...
failed to perform copy command due to error: no SAS token or OAuth token is present and the resource is not public
thobson@paas-in-vnet-test-uk-south-open-vm:~$ azcopy login
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code CUCNGWVZM to authenticate.
INFO: Logging in under the "Common" tenant. This will log the account in under its home tenant.
INFO: If you plan to use AzCopy with a B2B account (where the account's home tenant is separate from the tenant of the target storage account), please sign in under the target tenant with --tenant-id
INFO: Login succeeded.
thobson@paas-in-vnet-test-uk-south-open-vm:~$ azcopy cp 'https://paasinvnettestwesaoutvn.blob.core.windows.net/sa-blob-public-access-test/*' '/home/thobson'
INFO: Scanning...
INFO: Using OAuth token for authentication.
RESPONSE Status: 403 This request is not authorized to perform this operation.
Content-Length: [246]
Content-Type: [application/xml]
Date: [Mon, 06 Apr 2020 23:19:33 GMT]
Server: [Microsoft-HTTPAPI/2.0]
X-Ms-Error-Code: [AuthorizationFailure]
X-Ms-Request-Id: [e1ffb643-201e-013d-1169-0cdd2f000000]
**NOTE: This is unexpected** The corresponding test from the *inside-VNET* VM (with outbound access allowed) succeeded in downloading the file from the *outside-VNET* storage account.
Re-try with a SAS token:
thobson@paas-in-vnet-test-uk-south-open-vm:~$ azcopy cp 'https://paasinvnettestwesaoutvn.blob.core.windows.net/sa-blob-public-access-test/*
RESPONSE Status: 403 This request is not authorized to perform this operation.
Content-Length: [246]
Content-Type: [application/xml]
Date: [Mon, 06 Apr 2020 23:31:03 GMT]
Server: [Microsoft-HTTPAPI/2.0]
X-Ms-Error-Code: [AuthorizationFailure]
X-Ms-Request-Id: [25770c7f-b01e-0073-236b-0c5e9f000000]
**NOTE: Also unexpected**
Finally, test access to the *inside-VNET* storage account from the *outside-VNET* VM:
thobson@paas-in-vnet-test-uk-south-open-vm:~$ azcopy cp 'https://paasinvnettestwesainvn.blob.core.windows.net/sa-blob-public-access-test/01_metadata.csv
I have tested the above configuration for restricting access to an Azure storage account to be only from VMs within a particular VNET Subnet via a Private Endpoint:
@thobson88 @jemrobinson @ens-brett-todd @warwick26 @getcarter21 With secure links to Azure Storage, I think we can do basic consolidated logging as we can configure Azure Monitor to use a Storage Account rather than a Log Analytics Workspace.
@ens-brett-todd It also looks like connections to Azure Event Hubs can be secured the same way, which looks like the recommended way to forward logging data to an external SIEM system.
`/data/`) but each user would have to mount `/home/username/data` themselves by explicitly running a `mount` command at the command line. This adds a lot of friction wrt. the current UX.
See PR #674 for work in progress on this. I think the right answer for mounting Azure storage is a combination of the following.
For blob storage, it looks like blobfuse supports mounting with an SAS token, so we should use this approach to mount containers containing original research data (e.g. the `ingress` volume) as read-only. I'd suggest generating a read-only token linked to a policy with an initial long lifetime (say 365 days). We can then easily extend the end date of the policy if required without needing to update the SAS token itself.
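As a rough illustration of the blob approach above, the sketch below writes a blobfuse connection file that authenticates with a SAS token, and computes a policy end date 365 days out. It is a sketch under stated assumptions, not the project's implementation: the helper names, paths, account, container and policy names are hypothetical; the config keys follow the blobfuse v1 connection-file format as I understand it; and the `az` and `blobfuse` invocations in the comments should be checked against the versions actually deployed.

```shell
#!/usr/bin/env bash
# write_blobfuse_config FILE ACCOUNT CONTAINER SAS_TOKEN
# Writes a blobfuse (v1-style) connection file that uses SAS authentication.
# (Hypothetical helper; key names assume the blobfuse v1 config format.)
write_blobfuse_config() {
    local file=$1 account=$2 container=$3 sas=$4
    # umask 077 so that a newly created file is readable by its owner only,
    # since it holds a credential.
    ( umask 077 && printf 'accountName %s\nauthType SAS\nsasToken %s\ncontainerName %s\n' \
        "$account" "$sas" "$container" > "$file" )
}

# policy_expiry DAYS -> UTC timestamp DAYS from now (requires GNU date).
policy_expiry() {
    date -u -d "+$1 days" '+%Y-%m-%dT%H:%MZ'
}

# Intended flow, shown as comments because it needs a real storage account.
# All names below are placeholders.
#
#   az storage container policy create --account-name <account> \
#       --container-name ingress --name <policy> \
#       --permissions rl --expiry "$(policy_expiry 365)"
#   az storage container generate-sas --account-name <account> \
#       --name ingress --policy-name <policy>
#   write_blobfuse_config /etc/blobfuse/ingress.cfg <account> ingress "<sas-token>"
#   blobfuse /data/ingress --tmp-path=/mnt/blobfusetmp \
#       --config-file=/etc/blobfuse/ingress.cfg -o ro
```

Because the token refers to the stored access policy rather than carrying its own expiry, extending access later should only require updating the policy's end date, leaving the token in the config file unchanged.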
For the other shares (`shared`, `egress`, potentially `home`), it looks like there's now support for authenticating Azure Files SMB mounts to a "local" Domain Controller, which should let us authenticate against the SHM DC. This would be cool.
That article doesn't cover mounting the share on Linux, but does say mounting with kerberos works. See this question and answer and this troubleshooting guide for mounting SMB shares via `cifs` using `autofs` (to mount automatically) and `multiuser` (to allow multiple users to mount the same share, but with each authenticating with their own credentials).
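To show how those pieces might fit together, here is a minimal sketch that builds a `cifs` option string and the corresponding autofs map entry. The helper names, the storage account hostname and the choice of `vers=3.0` are assumptions for illustration. A `multiuser` mount also needs a credential of its own for the initial mount, which this sketch does not cover.

```shell
#!/usr/bin/env bash
# cifs_mount_options -> prints a comma-separated option string for mount.cifs.
#   sec=krb5   authenticate with Kerberos tickets rather than a stored password
#   multiuser  each local user accesses the share with their own credentials
#   vers=3.0   SMB protocol version (assumption; use whatever the share requires)
cifs_mount_options() {
    echo "sec=krb5,multiuser,vers=3.0"
}

# autofs_map_entry KEY SERVER SHARE -> prints one line for an autofs map file,
# in the form "key -fstype=cifs,<options> ://server/share".
autofs_map_entry() {
    local key=$1 server=$2 share=$3
    printf '%s -fstype=cifs,%s ://%s/%s\n' "$key" "$(cifs_mount_options)" "$server" "$share"
}

# Hypothetical storage account and share names, for illustration only.
autofs_map_entry shared examplestorage.file.core.windows.net shared
```

With an entry like this in an autofs map, the share would be mounted on first access rather than by each user running `mount` by hand, which is the friction described earlier in the thread.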
☝️ Thoughts @fedenanni @thobson88 @kevinxufs @jemrobinson @JimMadge ? ☝️
Is the idea here to simplify the structure of SREs by not requiring a dedicated VM for hosting SMB/nfs shares? What is the relative cost and performance compared to a Linux VM for storage?
The lack of support for permissions on blobfuse could cause problems where certain files have expected permissions. Using it for anything which is "just data" like `ingress`, `egress` should be fine.
My intuition is that mounting the home directory with cifs/smb is a bad idea. When you mount a share you generally select the permissions applied locally to all files and directories in the share (https://jlk.fjfi.cvut.cz/arch/manpages/man/mount.cifs.8#FILE_AND_DIRECTORY_OWNERSHIP_AND_PERMISSIONS). It looks like if you are using samba on Linux you can use 'unix extensions'. However, in that case the uids and gids have to match between the client and server which sounds like a pain to manage. I suspect we have no way of controlling this on an Azure storage account.
@JimMadge : we've asked for access to the Azure NFS preview. This might be the way forward if/when we can use it.
That's interesting, could be a good solution.
I think a more general question to think about (and I expect this has already been considered) is what is the balance of the pros and cons of managed storage vs. running our own storage server.
Managed storage
Dedicated storage server
@JimMadge All good points re the trade-offs for running our own server vs Azure storage. I think the big ones for me are:
I think Azure Blob storage is cheaper than a VM, while Azure File storage is about the same for storage (excluding the server cost, which is quite low in our current no-redundancy option).
This can be done securely using private endpoints.
See PR #674 for work in progress on this.
Once securely available within an SRE via a private endpoint, I think the right answer for mounting Azure storage is a combination of the following.
Blob storage
For blob storage, it looks like blobfuse supports mounting with an SAS token, so we should use this approach to mount containers containing original research data (e.g. the `ingress` volume) as read-only. I'd suggest generating a read-only token linked to a policy with an initial long lifetime (say 365 days). We can then easily extend the end date of the policy if required without needing to update the SAS token itself.
File storage
For the other shares (`shared`, `egress`, potentially `home`), it looks like there's now support for authenticating Azure Files SMB mounts to a "local" Domain Controller, which should let us authenticate against the SHM DC. That article doesn't cover mounting the share on Linux, but does say mounting with kerberos works. See this question and answer and this troubleshooting guide for mounting SMB shares via `cifs` using `autofs` (to mount automatically) and `multiuser` (to allow multiple users to mount the same share, but with each authenticating with their own credentials).