ministryofjustice / analytical-platform

Analytical Platform • This repository is defined and managed in Terraform
https://docs.analytical-platform.service.justice.gov.uk
MIT License

✨ Create AWS Data Sync Instance #5175

Open darren1988 opened 3 months ago

darren1988 commented 3 months ago

Describe the feature request.

Describe the context.

We originally started on this earlier in the year, when a request came in for a DataSync instance that would let OPG move unstructured and semi-structured data (PDFs, documents, etc.) into the Analytical Platform. Data would be replicated automatically from the fileshare to our account, so analysts could access their files natively from the AP rather than downloading them from the fileshare and re-uploading them manually. The work was pending for a good while on ATOS creating a service account, but that account has now been created.

Work required:

We need to create an AWS DataSync instance and set it up to connect to and authenticate with the fileshare, using the service account provided by ATOS.

Definition of done

YvanMOJdigital commented 3 months ago

requires defined architecture before planning

darren1988 commented 3 months ago

Meeting scheduled for 29/08/24 to discuss scope and technical architecture for this work

bagg3rs commented 2 months ago

I have the service account credentials from Gwion and have put them into 1Password OPG - AWS DataSync Service Account AP Shared Account

darren1988 commented 2 months ago

To be discussed at refinement.

jacobwoffenden commented 2 months ago

Reached out to @ministryofjustice/modernisation-platform about adding their shared platform VPC into our ingestion account

jacobwoffenden commented 2 months ago

Have agreed with @ministryofjustice/modernisation-platform that this isn't a problem and we can add the shared VPC. Will inspect the environment code in modernisation-platform and modernisation-platform-environments.

jacobwoffenden commented 2 months ago

Shared VPC added to ingestion account; however, upon further reading, DataSync does not support shared VPCs

jacobwoffenden commented 2 months ago

Plan is to create VPCs using existing ranges from MP that are soon to be retired and were never connected to the MoJ TGW

jacobwoffenden commented 1 month ago

VPC build-out in progress, EC2 instance build-out also in progress.

However, the DataSync registration is not programmatic: the DataSync server needs to be accessible from whatever machine is running Terraform or registering manually in the console. This is problematic because:

1) GitHub Actions is our primary CI/CD system, but it has no internal connectivity to our VPC, and making that work isn't really in scope because it's MP's system
2) Our VPC currently has no public connectivity
3) From what I've read, the AWS-provided AMI does not include the SSM agent, so you need to connect via SSH instead (I tried the serial console, but the default of admin / password didn't work)
4) The above presents a chicken/egg problem 🤔

Do I open the endpoint to GitHub Actions? GlobalProtect?

Do I add userdata to install the SSM agent and write the activation key to Secrets Manager? I don't even know if the activation key is held on disk or if I'd need to run a command...
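
On the activation key question: as far as the AWS DataSync documentation describes it, the key is not read from the agent's disk. It is returned by an HTTP request to the agent on port 80, so whatever makes that request still needs network reachability to the agent. Below is a minimal Python sketch of building that request URL. The function name and the IP addresses are hypothetical, and the query parameters should be checked against the current AWS docs before relying on them.

```python
from typing import Optional
from urllib.parse import urlencode


def activation_url(agent_ip: str, region: str,
                   private_link_endpoint: Optional[str] = None) -> str:
    """Build the URL that asks a DataSync agent for its activation key.

    agent_ip and private_link_endpoint are placeholders for illustration.
    """
    params = {"gatewayType": "SYNC", "activationRegion": region}
    if private_link_endpoint:
        # Only needed when the agent activates through a VPC endpoint.
        params["privateLinkEndpoint"] = private_link_endpoint
        params["endpointType"] = "PRIVATE_LINK"
    # no_redirect makes the agent return the key in the response body
    # instead of redirecting to the AWS console.
    return f"http://{agent_ip}/?{urlencode(params)}&no_redirect"


# Hypothetical agent address; this only builds the URL, it sends nothing.
print(activation_url("10.0.0.5", "eu-west-2"))
```

Fetching that URL from a host inside the VPC (for example with curl from an instance that does have SSM) should return the key as the response body. It could then be written to Secrets Manager and passed to the `activation_key` argument of the Terraform `aws_datasync_agent` resource.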

jacobwoffenden commented 1 month ago

10/10/24 update:

jacobwoffenden commented 1 month ago

16/10/24 update:

TODO:

jacobwoffenden commented 1 month ago

Currently blocked by https://github.com/ministryofjustice/modernisation-platform/issues/8275

darren1988 commented 1 month ago

Requested support from mod platform to help unblock this ticket

jacobwoffenden commented 1 month ago

NVVS/LAN&Wifi team have given me access to https://github.com/ministryofjustice/deployment-tgw, so I'm not as blocked as last week 🙏

jacobwoffenden commented 1 month ago

Moving back to blocked pending information on connecting to DOM1 from AWS

jacobwoffenden commented 1 month ago

Blocked while https://github.com/ministryofjustice/modernisation-platform/pull/8322 is deployed

jacobwoffenden commented 3 weeks ago

31/10/24 update (spooky edition 🎃):

jacobwoffenden commented 3 weeks ago

⬆ pull request added the staff device production VPC to the mod platform TGW route table, but it's still not responding...

Traffic is arriving in the destination VPC, but the return path from Ark might not be working; need to reach out further

jacobwoffenden commented 3 weeks ago

This pull request (https://github.com/ministryofjustice/deployment-tgw/pull/258) has allowed us to look up dom1.infra.int via the MoJO DNS resolver

jacobwoffenden commented 3 weeks ago

Blocked again as we liaise with other parties about direct connection into Ark over TGW.

It can route back, we just can't route there, presumably because it's blocked by a Palo Alto

jacobwoffenden commented 2 weeks ago

🎉 I am able to connect to DOM1 from my debugging instance! 🎉

jacobwoffenden commented 2 weeks ago

Reached out to ATOS because I can't access one of the locations

jacobwoffenden commented 2 weeks ago

Have reached out to @gwionap for clarification on source data

jacobwoffenden commented 2 weeks ago

Updated locations received from @gwionap, will continue.

jacobwoffenden commented 1 week ago

Have created a task but it is failing...


I can't explore this location with smbclient from the debug instance either

smb: \> ls hq/PGO/Shared/Group
do_connect: Connection to eucw4171nas002.dom1.infra.int failed (Error NT_STATUS_IO_TIMEOUT)
Unable to follow dfs referral [\eucw4171nas002.dom1.infra.int\mojshared002$]
do_list: [\hq\PGO\Shared\Group] NT_STATUS_IO_TIMEOUT

have escalated to @gwionap
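
To separate "packets silently dropped somewhere on the path" from "host reached but port closed", a plain TCP connect test against the SMB ports can help alongside smbclient. Below is a minimal Python sketch, assuming it is run from the debug instance. The function name is hypothetical and the hostname is the one from the smbclient output above.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Try a TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No SYN-ACK and no RST came back before the deadline.
        return "timeout"
    except ConnectionRefusedError:
        # A RST came back, so something on the path answered.
        return "refused"
    except OSError as exc:
        # DNS failure, no route to host, and similar.
        return f"error: {exc}"


# Example usage from the debug instance (not executed here):
#   for port in (445, 139):
#       print(port, probe_tcp("eucw4171nas002.dom1.infra.int", port))
```

A "timeout" on both ports would point at traffic being dropped in transit (or replies not returning), whereas "refused" would mean the connection attempt reached something that actively rejected it.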

jacobwoffenden commented 1 week ago

A more verbose output from smbclient

smb: \> ls hq/PGO/Shared/Group/
dos_clean_name [\hq\PGO\Shared\Group\]
unix_clean_name [\hq\PGO\Shared\Group\]
signed SMB2 message (sign_algo_id=1)
signed SMB2 message (sign_algo_id=1)
sitename_fetch: No stored sitename for realm ''
namecache_fetch: no entry for eucw4171nas002.dom1.infra.int#20 found.
resolve_hosts: Attempting host lookup for name eucw4171nas002.dom1.infra.int<0x20>
namecache_store: storing 1 address for eucw4171nas002.dom1.infra.int#20: 10.172.69.24
Connecting to 10.172.69.24 at port 445
convert_string_handle: E2BIG: convert_string(UTF-8,CP850): srclen=30 destlen=16 error: No more room
Connecting to 10.172.69.24 at port 139
do_connect: Connection to eucw4171nas002.dom1.infra.int failed (Error NT_STATUS_IO_TIMEOUT)
Unable to follow dfs referral [\eucw4171nas002.dom1.infra.int\mojshared002$]
do_list: [\hq\PGO\Shared\Group\] NT_STATUS_IO_TIMEOUT
signed SMB2 message (sign_algo_id=1)
signed SMB2 message (sign_algo_id=1)
signed SMB2 message (sign_algo_id=1)
signed SMB2 message (sign_algo_id=1)
signed SMB2 message (sign_algo_id=1)

Seeing the following in VPC flow logs

2 730335344807 eni-05ad72cf6a35649b0 10.26.128.43 10.172.69.24 33266 139 6 3 180 1731627738 1731627767 ACCEPT OK

So maybe it's the routing back from ATOS?
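
The flow log record above appears to use the default (version 2) VPC flow log format. Below is a small Python sketch that splits it into named fields, assuming that default field order. The class and function names are hypothetical.

```python
from typing import NamedTuple


class FlowRecord(NamedTuple):
    version: int
    account_id: str
    interface_id: str
    srcaddr: str
    dstaddr: str
    srcport: int
    dstport: int
    protocol: int      # IANA protocol number, 6 = TCP
    packets: int
    byte_count: int
    start: int         # Unix seconds
    end: int           # Unix seconds
    action: str
    log_status: str


def parse_flow_record(line: str) -> FlowRecord:
    """Parse one space-separated record in the default v2 field order."""
    f = line.split()
    if len(f) != 14:
        raise ValueError(f"expected 14 fields, got {len(f)}")
    return FlowRecord(int(f[0]), f[1], f[2], f[3], f[4],
                      int(f[5]), int(f[6]), int(f[7]), int(f[8]),
                      int(f[9]), int(f[10]), int(f[11]), f[12], f[13])


record = parse_flow_record(
    "2 730335344807 eni-05ad72cf6a35649b0 10.26.128.43 10.172.69.24 "
    "33266 139 6 3 180 1731627738 1731627767 ACCEPT OK"
)
print(record.dstport, record.packets, record.byte_count, record.end - record.start)
```

For this record that gives 3 packets totalling 180 bytes (60 bytes each) to port 139 over a 29 second window. ACCEPT only reflects the security group and network ACL decision on this ENI; it does not show that anything answered. Three small, equal-sized packets would be consistent with unanswered TCP SYN retries, though that is an inference from the sizes rather than something the record states.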

jacobwoffenden commented 6 days ago

SMB traffic is being dropped at the Palo Altos 💀

jacobwoffenden commented 2 days ago

@bagg3rs has raised a demand with Tech Services