USF-IMARS / imars_dags

:leaves: USF IMaRS Airflow DAGs

test airflow worker from out-of-LAN #93

Closed 7yl4r closed 5 years ago

7yl4r commented 5 years ago

An Airflow worker running outside the LAN is likely to run into a few roadblocks. Here are some likely issues:

7yl4r commented 5 years ago

I think USF-IMARS/server-status#104 is the final holdback on this.

Files on NFS shares from thing2 show up as owned by nobody:nobody on imars-airflow-21, but cozumel NFS shares work and the configuration is identical. NFS shares from thing2 work elsewhere, though (tested on my local machine and on other af nodes).

7yl4r commented 5 years ago
# === airflow-21 mount trouble:

# no worky:
thing2.marine.usf.edu: /west_fl_pen /srv/imars-objects/west_fl_pen nfs4 ro,nosuid,nodev,noexec,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.4,     local_lock=none,addr=131.247.136.202 0 0
thing2.marine.usf.edu: /A01         /srv/imars-objects/A01         nfs4 ro,nosuid,nodev,noexec,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.4,     local_lock=none,addr=131.247.136.202 0 0

# yes worky:
## airflow21
cozumel.marine.usf.edu:/monroe      /srv/imars-objects/monroe      nfs4 ro,nosuid,nodev,noexec,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.4,     local_lock=none,addr=131.247.136.162 0 0
## tylardesk
thing2master:          /west_fl_pen /srv/imars-objects/west_fl_pen nfs4 rw,nosuid,nodev,noexec,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.249,local_lock=none,addr=192.168.1.202   0 0
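
To make the comparison less eyeball-driven, here is a minimal Python sketch that diffs the option strings of two mount entries. The helper names `parse_mount_options` and `diff_mount_options` are made up for illustration; the option strings are copied from the airflow-21 listings above.

```python
def parse_mount_options(options):
    """Turn 'ro,vers=4.0,...' into {'ro': True, 'vers': '4.0', ...}."""
    parsed = {}
    for item in options.split(","):
        item = item.strip()  # tolerate the padding spaces used for alignment
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True  # bare flags such as 'ro' map to True
    return parsed


def diff_mount_options(left, right):
    """Return {option: (left_value, right_value)} for every option that differs."""
    a = parse_mount_options(left)
    b = parse_mount_options(right)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Option strings copied from the listings above (thing2 = broken, cozumel = working).
thing2_opts = (
    "ro,nosuid,nodev,noexec,relatime,vers=4.0,rsize=1048576,wsize=1048576,"
    "namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,"
    "clientaddr=10.0.0.4,     local_lock=none,addr=131.247.136.202"
)
cozumel_opts = (
    "ro,nosuid,nodev,noexec,relatime,vers=4.1,rsize=1048576,wsize=1048576,"
    "namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,"
    "clientaddr=10.0.0.4,     local_lock=none,addr=131.247.136.162"
)

option_diff = diff_mount_options(thing2_opts, cozumel_opts)
print(option_diff)
```

On these two entries only `vers` (4.0 vs 4.1) and `addr` differ, so the client-side mount options themselves look nearly identical. Whether the NFS minor version matters here is not established by anything in this thread.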
7yl4r commented 5 years ago

This is getting ridiculous. I just somehow broke ssh logins again.

I am taking a more aggressive approach. Let's copy authconfig from cozumel.

[root@cozumel ~]# authconfig --savebackup=2019-03-13
[root@cozumel ~]# rsync -hazv /var/lib/authconfig/backup-2019-03-13 root@thing2master:/var/lib/authconfig/cozumel-2019-03-13

[root@thing2 ~]# authconfig --savebackup=2019-03-13
[root@thing2 ~]# cd /var/lib/authconfig/
[root@thing2 authconfig]# mkdir manual-2019-03-13
[root@thing2 authconfig]# cp backup-2019-03-13/group manual-2019-03-13/. && cp backup-2019-03-13/passwd manual-2019-03-13/passwd && cp backup-2019-03-13/shadow manual-2019-03-13/. && cp backup-2019-03-13/gshadow manual-2019-03-13/.
[root@thing2 authconfig]# authconfig --restorebackup=manual-2019-03-13 --update

... no change. ssh still broken and still getting nobody:nobody from af-21. :hankey: :angry: :hankey:

7yl4r commented 5 years ago

The worker is now working. I gave up on LDAP and just set up the user manually. Tasks running on imars-airflow-21 are not showing up in the Airflow metadata database (result_backend?), but I will follow up on that elsewhere.

7yl4r commented 5 years ago

I resized (and restarted) the VM in Azure. When it came back up, files on thing2 were showing as nobody:nobody again.

The Domain in /etc/idmapd.conf had been changed. :thinking: That's a good sign I shouldn't be doing this, but I changed it back:

#Domain=iaordjrjcdketajxumkj5z1keg.bx.internal.cloudapp.net
Domain=marine.usf.edu
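
Since this setting apparently gets rewritten when the VM is resized, a small check could catch it early. The following is a rough Python sketch, assuming the file follows the usual INI-like layout with `Domain` under a `[General]` section; the function name `read_idmap_domain` and the sample text are illustrative, not taken from the actual host. The background, as far as I understand it, is that NFSv4 passes owners as `name@domain` strings, so a domain mismatch between client and server tends to end in nobody:nobody.

```python
import configparser


def read_idmap_domain(conf_text):
    """Return the Domain value from idmapd.conf-style text, or None if unset."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    for section in parser.sections():
        if section.lower() == "general":
            # configparser lowercases option names, so 'Domain' is stored as 'domain'
            return parser[section].get("domain")
    return None


# Sample text assembled from the two lines above; the [General] header is an assumption.
sample_conf = """[General]
#Domain=iaordjrjcdketajxumkj5z1keg.bx.internal.cloudapp.net
Domain=marine.usf.edu
"""

expected_domain = "marine.usf.edu"
found_domain = read_idmap_domain(sample_conf)
print("domain ok" if found_domain == expected_domain else f"unexpected domain: {found_domain}")
```

On the real machine the text would come from reading `/etc/idmapd.conf` instead of the inline sample, for example from a boot-time check.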

Now the airflow group resolves, but the airflow user and the imars-common group are still showing as nobody?

-bash-4.2$ ls -lah /srv/imars-objects/big_bend
total 140K
drwxrwxr-x. 9 nobody nobody  4.0K Mar 26 01:25 .
drwxr-xr-x. 3 root   root       0 Mar 27 21:49 ..
drwxr-xr-x. 2 nobody airflow  48K Mar 27 21:35 ntf_wv2_m1bs
drwxr-xr-x. 2 nobody nobody   20K Sep  4  2018 tif_r_rs_wv2
drwxrwxr-x. 2 nobody nobody  4.0K Jul 30  2018 tif_r_rs_wv2_backup_2018-08-03
drwxrwxr-x. 4 nobody nobody    28 Jun 28  2018 wv2
drwxr-xr-x. 2 nobody airflow  36K Mar 27 21:36 xml_wv2_m1bs
drwxr-xr-x. 2 nobody airflow 4.0K Mar  5 21:26 zip_wv2_ftp_ingest
drwxr-xr-x. 2 nobody airflow 4.0K Mar  5 22:13 zip_wv3_ftp_ingest

cozumel shares still work fine:

-bash-4.2$ ls -lah /srv/imars-objects/gom
total 580K
drwxrwxr-x.   7 airflow imars-common 4.0K Jan  9 22:22 .
drwxr-xr-x.   4 root    root            0 Mar 27 21:56 ..
drwxrwxr-x.   2 airflow imars-common 196K Jan 10 05:53 chlor_a_l3_pass
drwxr-xr-x.   2 airflow airflow       16K Jan 10 13:29 myd01
drwxr-xr-x.   2 airflow airflow      216K Jan 10 15:09 myd0_otis_l2
drwxr-xr-x.   2 airflow airflow      4.0K Jan 10 15:16 s3a_ol_1_efr
drwxr-xr-x. 293 airflow imars-common 8.0K Jan  7 19:46 sst_avhrr_1km

What does that mean?
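
One possible reading (an assumption, not something confirmed in this thread) is that the client can resolve the local `airflow` group but not the `airflow` user or the `imars-common` group, so NFSv4 id mapping falls back to nobody for those. A quick Python sketch to check which names resolve on the client, using the standard `pwd` and `grp` modules; the helper `unresolved_names` is hypothetical:

```python
import grp
import pwd


def unresolved_names(names, lookup):
    """Return the names for which lookup(name) raises KeyError."""
    missing = []
    for name in names:
        try:
            lookup(name)
        except KeyError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # pwd.getpwnam / grp.getgrnam raise KeyError when the name is unknown locally.
    print("users missing: ", unresolved_names(["airflow"], pwd.getpwnam))
    print("groups missing:", unresolved_names(["airflow", "imars-common"], grp.getgrnam))
```

If a name is reported missing on the worker but resolves on the NFS server, that mismatch would be consistent with the listing above, where only entries with group `airflow` map correctly.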

7yl4r commented 5 years ago

Rebooted and... /etc/imapd.conf is gone?

[imars-admin@imars-airflow-21 etc]$ sudo ls -lh /etc/imapd.conf
ls: cannot access /etc/imapd.conf: No such file or directory

Wat?