alan-turing-institute / data-safe-haven

https://data-safe-haven.readthedocs.io
BSD 3-Clause "New" or "Revised" License

[WIP] Update Ubuntu VM images #1909

Open craddm opened 1 month ago

craddm commented 1 month ago

:white_check_mark: Checklist

:vertical_traffic_light: Depends on

:arrow_heading_up: Summary

Updates the Linux VM to a Gen2 VM.

WIP: updates to release xx.04 LTS of Ubuntu

:closed_umbrella: Related issues

Closes #1550

:microscope: Tests

Unable to test whether the deployed VMs are fully working, as I cannot currently log in as a user due to #1908.

github-actions[bot] commented 1 month ago

Coverage report

Coverage table (values not captured): columns File, Statements, Missing, Coverage, Coverage (new stmts), Lines missing; rows data_safe_haven/types/enums.py and Project Total.

This report was generated by python-coverage-comment-action

craddm commented 1 month ago

Updating to Gen 2 worked fine.

Updating to Jammy is slightly trickier:

2024-05-22 15:05:17 [   ERROR] Diagnostics:
2024-05-22 15:05:17 [   ERROR]   pulumi:pulumi:Stack (data-safe-haven-shm-lincolnshire-sre-morcilla):
2024-05-22 15:05:17 [   ERROR]     error: update failed
2024-05-22 15:05:17 [   ERROR]
2024-05-22 15:05:17 [   ERROR]   azure-native:compute:VirtualMachineExtension (sre_workspaces_vm_workspace_01_log_analytics_extension):
2024-05-22 15:05:17 [   ERROR]     error: 1 error occurred:
2024-05-22 15:05:17 [   ERROR]          * Code="VMExtensionHandlerNonTransientError" Message="The handler for VM extension type 'Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux' has reported terminal failure for VM extension 'OmsAgentForLinux' with error message: '[ExtensionOperationError] Non-zero exit code: 55, /var/lib/waagent/Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux-1.19.0/omsagent_shim.sh -install\n\n2024/05/22 15:0Info: Falling back to /etc/os-release distribution parsing\n-1.19.0] Install,failed,55,Install failed with exit code 55 because the package manager on the VM is currently locked: please wait and try again\n\n\n\n'.\r\n    \r\n'Install handler failed for the extension. More information on troubleshooting is available at https://aka.ms/VMExtensionOMSAgentLinuxTroubleshoot'"

This seems like more incentive to move off omsagent, if anything.

~as it forces some packages to be installed as snaps (e.g. Firefox), which isn't currently possible~

~Edit: well, it may be possible. I couldn't connect to the workspace VM through xrdp until I manually installed snapd via the serial console...~

jemrobinson commented 1 month ago

  1. Can we switch to noble (24.04)?
  2. @JimMadge: what are your thoughts about snaps?
  3. We could consider installing from a ppa repository if we want to use the .deb version, but it looks like some manual fiddling around with priorities is needed

craddm commented 1 month ago

> 1. Can we switch to noble (24.04)?
> 2. @JimMadge: what are your thoughts about snaps?
> 3. We could consider installing from a ppa repository if we want to use the .deb version, but it looks like some manual fiddling around with priorities is needed

There is a 24.04 image on the Marketplace. I'd imagine that it too wants to install snaps. Omsagent definitely doesn't support 24.04, but we don't really need it to. I'm not sure if Azure Update Manager or Monitor Agent can handle 24.04 yet either.

JimMadge commented 1 month ago

Hmm, yes I'd forgotten about that but probably should have brought it up before. It feels like Ubuntu is moving towards distributing more packages as snaps. Firefox is now a snap by default.

That will be difficult to support.

My feeling is the drive towards snaps won't change. Now might be the opportunity to move to another distro. Fedora maybe.

JimMadge commented 4 weeks ago

Discussion of snap endpoints in #1220

JimMadge commented 4 weeks ago

Do we still need domains and IP addresses for the endpoints we want to reach (@jemrobinson)?

Using the Snap Store Proxy like we proxy apt/pip/cran could be a good solution.

craddm commented 4 weeks ago

[  117.275201] cloud-init[1864]: Selecting previously unselected package libavahi-common-data:amd64.
(Reading database ... 62064 files and directories currently installed.)
[  117.350445] cloud-init[1864]: Preparing to unpack .../000-libavahi-common-data_0.8-5ubuntu5.2_amd64.deb ...
[  117.354326] cloud-init[1864]: Unpacking libavahi-common-data:amd64 (0.8-5ubuntu5.2) ...
[  117.386055] cloud-init[1864]: Selecting previously unselected package libavahi-common3:amd64.
[  117.410072] cloud-init[1864]: Preparing to unpack .../001-libavahi-common3_0.8-5ubuntu5.2_amd64.deb ...
[  117.413341] cloud-init[1864]: Unpacking libavahi-common3:amd64 (0.8-5ubuntu5.2) ...
[  117.461784] cloud-init[1864]: Selecting previously unselected package libavahi-core7:amd64.
[  117.477839] cloud-init[1864]: Preparing to unpack .../002-libavahi-core7_0.8-5ubuntu5.2_amd64.deb ...
[  117.481412] cloud-init[1864]: Unpacking libavahi-core7:amd64 (0.8-5ubuntu5.2) ...
[  117.523809] cloud-init[1864]: Selecting previously unselected package libdaemon0:amd64.
[  117.547648] cloud-init[1864]: Preparing to unpack .../003-libdaemon0_0.14-7.1ubuntu3_amd64.deb ...
[  117.558369] cloud-init[1864]: Unpacking libdaemon0:amd64 (0.14-7.1ubuntu3) ...
[  117.589616] cloud-init[1864]: Selecting previously unselected package avahi-daemon.
[  117.605491] cloud-init[1864]: Preparing to unpack .../004-avahi-daemon_0.8-5ubuntu5.2_amd64.deb ...
[  117.630232] cloud-init[1864]: Unpacking avahi-daemon (0.8-5ubuntu5.2) ...
[  117.675521] cloud-init[1864]: Selecting previously unselected package firefox.
[  117.689962] cloud-init[1864]: Preparing to unpack .../005-firefox_1%3a1snap1-0ubuntu2_amd64.deb ...
[  117.877015] cloud-init[1864]: => Installing the firefox snap
[  117.881777] cloud-init[1864]: ==> Checking connectivity with the snap store
[  117.927838] cloud-init[1864]: ===> Unable to contact the store, trying every minute for the next 30 minutes

craddm commented 4 weeks ago

I'm looking at loosening the network rules to allow snapcraft. However, DNS is still not resolving the snapcraft domain names:

…admin@shm-lincolnshire-sre-morcilla-vm-workspace-01:~$ nslookup api.snapcraft.io
Server:         127.0.0.53
Address:        127.0.0.53#53

** server can't find api.snapcraft.io: NXDOMAIN

…admin@shm-lincolnshire-sre-morcilla-vm-workspace-01:~$ nslookup cran.r-project.org
Server:         127.0.0.53
Address:        127.0.0.53#53

Non-authoritative answer:
cran.r-project.org      canonical name = cran.wu-wien.ac.at.
Name:   cran.wu-wien.ac.at
Address: 137.208.57.37
** server can't find cran.wu-wien.ac.at: NXDOMAIN

…admin@shm-lincolnshire-sre-morcilla-vm-workspace-01:~$ nslookup login.ubuntu.com
Server:         127.0.0.53
Address:        127.0.0.53#53

** server can't find login.ubuntu.com: NXDOMAIN

…admin@shm-lincolnshire-sre-morcilla-vm-workspace-01:~$ nslookup storage.snapcraftcontent.com
Server:         127.0.0.53
Address:        127.0.0.53#53

** server can't find storage.snapcraftcontent.com: NXDOMAIN
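
The manual `nslookup` checks above could be repeated as a small script after each network rule change. This is only a sketch: the function names are made up for illustration, and it uses the system resolver via `socket.getaddrinfo`, so it reports whether a name resolves from the VM, not why it was blocked.

```python
import socket
from collections.abc import Callable, Iterable


def resolves(hostname: str, lookup: Callable[..., object] = socket.getaddrinfo) -> bool:
    """Return True if the hostname resolves; `lookup` is injectable for testing."""
    try:
        lookup(hostname, 443)
    except socket.gaierror:
        # Raised for resolution failures such as NXDOMAIN.
        return False
    return True


def check_domains(
    domains: Iterable[str], lookup: Callable[..., object] = socket.getaddrinfo
) -> dict[str, bool]:
    """Map each domain to whether it currently resolves."""
    return {domain: resolves(domain, lookup) for domain in domains}
```

Calling `check_domains(["api.snapcraft.io", "login.ubuntu.com", "storage.snapcraftcontent.com"])` on the workspace VM would give one True/False entry per domain.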

JimMadge commented 4 weeks ago

AdGuard configuration https://github.com/alan-turing-institute/data-safe-haven/blob/5596a995f03be85d30f4c7dd3bdae5b818ddf11a/data_safe_haven/resources/dns_server/AdGuardHome.mustache.yaml

craddm commented 4 weeks ago

Snapcraft is blacklisted by AdGuard:

2024/05/29 09:24:31.318180 44#153000 [debug] filtering: found rule "*.*" for host "api.snapcraft.io.3hmqzriazs3edasxc10qfi4p5b.zx.internal.cloudapp.net", filter list id: 0
2024/05/29 09:24:31.318377 44#153000 [debug] dnsforward: host "api.snapcraft.io.3hmqzriazs3edasxc10qfi4p5b.zx.internal.cloudapp.net" is filtered, reason: "FilteredBlackList"; rule: "*.*"

JimMadge commented 4 weeks ago

Permitted domains

https://github.com/alan-turing-institute/data-safe-haven/blob/5596a995f03be85d30f4c7dd3bdae5b818ddf11a/data_safe_haven/types/enums.py#L81-L90
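
As a sketch of what allowing Snapcraft through DNS might look like, the example below groups the domains looked up earlier in this thread into an enum and renders them as AdGuard rules. `ExamplePermittedDomains` and `adguard_rules` are hypothetical names and do not reproduce the real definitions in `enums.py`. It also assumes the allowlist ends up as AdGuard exception rules of the form `@@||domain^` alongside the block-all `*.*` rule seen in the log, and the domain list may be incomplete.

```python
from collections.abc import Iterable
from enum import Enum


class ExamplePermittedDomains(Enum):
    """Hypothetical groups of domains that DNS should resolve."""

    # Domains taken from the nslookup checks in this thread.
    SNAPCRAFT = (
        "api.snapcraft.io",
        "login.ubuntu.com",
        "storage.snapcraftcontent.com",
    )
    CRAN = ("cran.r-project.org",)


def adguard_rules(groups: Iterable[ExamplePermittedDomains]) -> list[str]:
    """Block everything, then add one exception rule per permitted domain."""
    rules = ["*.*"]
    for group in groups:
        rules.extend(f"@@||{domain}^" for domain in group.value)
    return rules
```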

JimMadge commented 4 weeks ago

Having looked at Snap Store Proxy, it looks like it isn't possible to get this working without creating an Ubuntu SSO account (and there may be limits on how many clients can connect without having a Canonical support contract).

docs

craddm commented 4 weeks ago

Ok, allowing the VMs to contact snapcraft directly now works, so if we can find an acceptable way to permit that, we can use Jammy.

Currently, this PR creates a new application rule to allow Snapcraft through the firewall.

If Snap Store Proxy won't work, maybe we can use another proxy ourselves:

https://snapcraft.io/docs/system-options#heading--proxy
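
For reference, the linked system options are set with `snap set system proxy.http=...` and `proxy.https=...`. Here is a sketch of building that command from Python, for example to run during VM setup. The proxy URL in the usage note is a placeholder, not an address from this deployment, and `snap_proxy_command` is a made-up helper.

```python
import shlex


def snap_proxy_command(proxy_url: str) -> list[str]:
    """Build the argv that points snapd at a general-purpose HTTP/HTTPS proxy."""
    return [
        "snap",
        "set",
        "system",
        f"proxy.http={proxy_url}",
        f"proxy.https={proxy_url}",
    ]


def as_shell(argv: list[str]) -> str:
    """Render the argv as a single shell-safe line, e.g. for a cloud-init runcmd entry."""
    return shlex.join(argv)
```

For example, `as_shell(snap_proxy_command("http://proxy.example.lan:3128"))` gives one command line to run as root on the VM.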

JimMadge commented 4 weeks ago

As I understand it, we are talking about different methods of proxying here.

The Snap Store Proxy is much more like a snap store instance which has an upstream provider (quite like how we use Nexus).

The snapd proxy configuration is a general-purpose HTTP/HTTPS proxy setting, like the way we route all internet traffic through gateway.example.lan.

craddm commented 4 weeks ago

Ok, but we're just using a squid proxy for apt, so wouldn't something similar work for snapd?

jemrobinson commented 3 weeks ago

> My feeling is the drive towards snaps won't change. Now might be the opportunity to move to another distro. Fedora maybe.

Is Fedora supported on Azure? I have no problem with switching to another distro, but we want to avoid maintaining our own OS if possible. I remember you were interested in NixOS a few years ago @JimMadge - is that another possibility?

jemrobinson commented 3 weeks ago

> Ok, but we're just using a squid proxy for apt, so wouldn't something similar work for snapd?

We're using squid-deb-proxy which can act as a proxy to any .deb repository. There may be a similar thing for snap packages, or this may just work - I haven't looked into it.

JimMadge commented 3 weeks ago

> Is Fedora supported on Azure? I have no problem with switching to another distro, but we want to avoid maintaining our own OS if possible. I remember you were interested in NixOS a few years ago @JimMadge - is that another possibility?

I assumed there would be an official Fedora image, but it looks like there isn't. Debian is endorsed by Microsoft, so that should be an easier switch, as it is an apt/deb-based distro.

NixOS would be a much more complex change. Moving to an immutable distro would involve changing how we think about configuration. I think it would be a good long term goal. There is an argument that immutable provides better security and reproducibility. There is also no official image on the Marketplace so it would involve building that ourselves.

JimMadge commented 3 weeks ago

> We're using squid-deb-proxy which can act as a proxy to any .deb repository. There may be a similar thing for snap packages, or this may just work - I haven't looked into it.

Yes. I think I wrote something like this in Slack and didn't comment here 🤦.

The Snap Store Proxy is like a dedicated proxy for snapd. I don't think it would be a good solution for us though. We could proxy the http/https traffic, but I'm not sure if that gives us anything or improves security in a meaningful way.

JimMadge commented 2 days ago

If this is working, let's get this in now for rc3 and open an issue to track the security question.