CVNRneuroimaging / infrastructure

Issue tracking, system documentation and configs for operations side of the neuroimaging core @ Atlanta VA CVNR / Emory University

configuring pano as stand-alone workstation #138

Open stowler opened 9 years ago

stowler commented 9 years ago

Keith and Rob @kmcgregor123456 and @rrmm,

FYI Today I upgraded pano in situ from 12.04 to 14.04 as the initial step in configuring it as a stand-alone workstation.

It's available to be powered down, as Keith is going to replace its battery today/tonight and add storage.

I'll use this issue to update you if I make any significant config changes to it after Keith's post-battery reboot.

Thanks, Stephen

Monday the 17th:

no work with pano today...rama only

Monday the 24th:

No work from me, leaving Rob free to troubleshoot and reboot as necessary

Monday the 31st through Tuesday the 1st

stowler commented 9 years ago

Keith replaced UPS battery and installed 2TB drive mounted as /data/panolocal:

[08:25:08]-[stowler-local]-at-[pano]-in-[~]
$ df -h /dev/sdb1
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       1.8T  2.7G  1.7T   1% /data/panolocal

[08:26:09]-[stowler-local]-at-[pano]-in-[~]
$ uptime
 20:26:20 up 56 min,  1 user,  load average: 0.19, 0.14, 0.19
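
A minimal sketch, not from the thread, of how the mount check above could be scripted. The helper name `mount_use_pct` is hypothetical; it assumes POSIX-format `df -P` output on stdin and prints the Use% value for one mount point.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original notes): print the Use% figure,
# digits only, for the mount point given as $1. Expects `df -P` style output
# on stdin, where the last column is the mount point.
mount_use_pct() {
    awk -v mp="$1" '$NF == mp { sub(/%/, "", $(NF-1)); print $(NF-1) }'
}

# Example against a live system (assumes /data/panolocal is mounted):
#   df -P /data/panolocal | mount_use_pct /data/panolocal
```

An empty result means the mount point did not appear in the `df` output, which is a quick way to notice that the drive did not come back after a reboot.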
stowler commented 9 years ago

FYI: configuring and testing pano after Monday's upgrade from 12.04 to 14.04. Committing my notes here:

https://github.com/CVNRneuroimaging/infrastructure/blob/master/config/notes/20150818-stowlerBircSystemConfig.md

stowler commented 9 years ago

FYI: still configuring and testing pano...ending the day having installed FIX but not yet tested it.

Slowed by longer-than-normal R package downloads and compile times.

My notes were all committed to the link above.

stowler commented 9 years ago

FYI: continuing config and testing of pano (melodic, fix, fslnets). Committing today's notes here:

https://github.com/CVNRneuroimaging/infrastructure/blob/master/config/notes/20150819-stowlerBircSystemConfig.md

stowler commented 9 years ago

Completed MELODIC testing and inspection in preparation for FIX testing. Will launch initial FIX tests before the night is out.

stowler commented 9 years ago

FYI: still configuring and testing pano...ending the day having finished MELODIC single-session testing and launched (but didn't inspect) FIX testing. Will resume FIX tmw, followed by melview, group MELODIC, and fslnets.

My notes from today were all committed to the link above.

stowler commented 9 years ago

Will be committing today's (Thursday's) notes here:

https://github.com/CVNRneuroimaging/infrastructure/blob/master/config/notes/20150820-stowlerBircSystemConfig.md

...starting with 7am launch of loop of serial FIX operations that should finish before 1p. (Generating known-good test data for MELODIC group ICA).

stowler commented 9 years ago

FYI: still configuring and testing pano...ending the day having completed testing and inspection of FSL FIX, and launched (but didn't inspect) test of MELODIC group ICA. Will inspect results tmw, followed by more config and testing.

My notes from today were all committed to the link above. -ST, Thurs night

stowler commented 9 years ago

FYI: Will be committing today's (Friday's) notes here:

https://github.com/CVNRneuroimaging/infrastructure/blob/master/config/notes/20150821-stowlerBircSystemConfig.md

...starting with review of last night's gica results. More pano config and testing today, with another overnight gica tonight.

stowler commented 9 years ago

Hi Keith @kmcgregor123456 (cc: Rob @rrmm),

Today will be my last chance for reboots of pano and rama before multi-day processing runs, so I just want to follow up on the physical configs we talked about in last Thursday's infrastructure meeting:

Thanks, Stephen

kmcgregor123456 commented 9 years ago

We will run the emergency power at a later date.

The storage will have to be sufficient for the short term since I have yet to receive the new disks.

stowler commented 9 years ago

@kmcgregor123456 OK. Maybe you didn't see my original questions:

-ST

kmcgregor123456 commented 9 years ago

There is nothing novel about the power needs in the room. If you want to switch to the Emergency outlets, then that's fine with me.

I didn't realize that I had put it on the USB2 bus. It was reporting USB3 speeds and USB3 on dmesg at boot. I did a test transfer and it was

May

stowler commented 9 years ago

Re-cabled pano:

stowler commented 9 years ago

Keith @kmcgregor123456 (cc: Rob @rrmm ),

I need to back up < 100 GB from panolocal to hippoback this afternoon, rsyncing as stowler-local@hippoback.

What hippoback destination directory would you like me to use, and will that change as my backup grows to 500 - 900 GB this weekend?

Thx, ST

kmcgregor123456 commented 9 years ago

/data/backup/atlanta/stowlerxfer082115

Best,


Keith McGregor, PhD Atlanta VA CVNR Emory University www.varrd.emory.edu



This e-mail message (including any attachments) is for the sole use of the intended recipient(s) and may contain confidential and privileged information. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this message (including any attachments) is strictly prohibited.

If you have received this message in error, please contact the sender by reply e-mail message and destroy all copies of the original message (including attachments).

stowler commented 9 years ago

Thx. Also from my question: "...will that change as my backup grows to 500 - 900 GB this weekend?"

kmcgregor123456 commented 9 years ago

there is sufficient space to accommodate up to 5TB

Keith McGregor, PhD VA RR&D Atlanta CoE Emory University 352.359.8084 www.varrd.emory.edu



stowler commented 9 years ago

Thx. Initiated rsync from the hippostore side. No action needed, but just FYI it's topping out at 33 MB/s:

stowler-local@hippoback:/data/backup/Atlanta/stowlerxfer082115$ rsync -avR --progress stowler-local@pano.birc.emory.edu:/data/panolocal .

Password:
receiving incremental file list
data/
data/panolocal/
data/panolocal/tempStowler/
data/panolocal/tempStowler/melFromFeeds-afterFix.tar
  3500687360 100%   32.08MB/s    0:01:44 (xfer#1, to-check=1032/1036)
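
As a rough cross-check of the rate reported above, here is a small sketch (not from the thread) that redoes the arithmetic. The function names are hypothetical, and it assumes rsync's "MB/s" figure is 1024-based, which matches the numbers shown: 3500687360 bytes in 1:44 (104 s).

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the original notes.

# Bytes and elapsed seconds -> MiB/s (assumes rsync's "MB/s" is 1024-based).
rate_mib_s() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1048576 }'
}

# Gigabytes (10^9 bytes) and a MiB/s rate -> estimated hours of transfer time.
eta_hours() {
    awk -v gb="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000000000 / (rate * 1048576) / 3600 }'
}

rate_mib_s 3500687360 104   # prints 32.1
eta_hours 500 32            # prints 4.1
eta_hours 900 32            # prints 7.5
```

If the rate holds, the 500 - 900 GB weekend backup mentioned earlier would take very roughly 4 to 7.5 hours per full pass. That is an estimate only; many small files would likely be slower than one large tar file.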

# ...yes, USB3 now unless this is a reporting error:
[05:01:48]-[stowler-local]-at-[pano]-in-[/data]
$ lsusb -t
/:  Bus 10.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/:  Bus 09.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M
/:  Bus 08.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 07.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 06.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
    |__ Port 1: Dev 2, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
    |__ Port 1: Dev 2, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
    |__ Port 2: Dev 3, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=uhci_hcd/2p, 12M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/8p, 480M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/4p, 480M
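
A minimal sketch, not from the thread, for pulling just the storage device's negotiated speed out of the `lsusb -t` listing above. The helper name is hypothetical, and it assumes the comma-separated line layout shown here (other usbutils versions may format the tree differently).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original notes): read `lsusb -t`
# output on stdin and print the last comma-separated field (the negotiated
# speed) of every mass-storage interface. 5000M suggests a USB 3.0 link,
# 480M a USB 2.0 link.
usb_storage_speeds() {
    awk -F', ' '/Class=Mass Storage/ { print $NF }'
}

# Example against a live system:
#   lsusb -t | usb_storage_speeds
```

This only reports what the kernel negotiated for the link; it does not by itself explain the observed transfer rate.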
kmcgregor123456 commented 9 years ago

Sounds about right for that connection per my recent tests


stowler commented 9 years ago

FYI: will be committing today's (Saturday's) pano config and testing notes here:

https://github.com/CVNRneuroimaging/infrastructure/blob/master/config/notes/20150822-stowlerBircSystemConfig.md

stowler commented 9 years ago

Keith and Rob @kmcgregor123456 @rrmm : please check pano's logs and zabbix plots. I left WMB last night with perfectly functional lightdm showing on the console and now I'm back in WMB for GUI-intensive work, staring at a blank and unresponsive console including unresponsive capslock light.

stowler commented 9 years ago

Keith and Rob @kmcgregor123456 @rrmm : I spoke to Keith on the phone, and he advised that I reboot pano (instead of bouncing runlevel) as the log pollution is a fair trade-off for having a functional console quickly. Rob/Keith please check pano's logs and plots between 6pm yesterday (the 21st) and this reboot:

[01:57:35]-[stowler-local]-at-[pano]-in-[~]
$ df -h
Filesystem                             Size  Used Avail Use% Mounted on
/dev/sda1                              412G   84G  307G  22% /
none                                   4.0K     0  4.0K   0% /sys/fs/cgroup
udev                                    24G  4.0K   24G   1% /dev
tmpfs                                  4.8G  1.6M  4.8G   1% /run
none                                   5.0M     0  5.0M   0% /run/lock
none                                    24G   72K   24G   1% /run/shm
none                                   100M   12K  100M   1% /run/user
/dev/sdb1                              1.8T   34G  1.7T   2% /data/panolocal
hippoback.birc.emory.edu:/data/backup   32T   15T   15T  51% /data/backup
corpus.birc.emory.edu:/export/users    1.8T  304G  1.4T  18% /net

[01:58:06]-[stowler-local]-at-[pano]-in-[~]
$ uptime
 13:59:15 up 20:25,  1 user,  load average: 0.00, 0.01, 0.05

# ...performed backups before reboot, then:

[02:34:48]-[stowler-local]-at-[pano]-in-[~]
$ sudo shutdown -r now
[sudo] password for stowler-local:
stowler commented 9 years ago

Keith and Rob: I saw pano's lightdm die in front of me while I was working on rama. Here's a little log info...please troubleshoot.

/cc @kmcgregor123456 @rrmm

dmesg says:

[ 1964.547957] traps: lightdm-gtk-gre[2113] trap int3 ip:7f6e0d5c6c13 sp:7fff14395830 error:0
[ 1973.216451] init: lightdm main process (1835) terminated with status 1

...and /var/log/syslog says:

Aug 22 16:21:11 pano kernel: [ 1964.547957] traps: lightdm-gtk-gre[2113] trap int3 ip:7f6e0d5c6c13 sp:7fff14395830 error:0
Aug 22 16:21:19 pano kernel: [ 1973.216451] init: lightdm main process (1835) terminated with status 1

...other recently touched logs in /var/log...

-rw-rw-r--  1 root              utmp         182016 Aug 22 16:14 wtmp
-rw-rw-r--  1 root              utmp   483876826640 Aug 22 16:14 lastlog
-rw-r-----  1 syslog            adm         1556202 Aug 22 16:21 kern.log
-rw-r-----  1 syslog            adm          536007 Aug 22 16:21 auth.log
-rw-r--r--  1 root              root          28078 Aug 22 16:21 Xorg.0.log
-rw-r-----  1 root              adm            1258 Aug 22 16:21 apport.log
-rw-r-----  1 syslog            adm          311259 Aug 22 16:21 syslog

...lightdm logs:

[04:37:32]-[stowler-local]-at-[pano]-in-[/var/log/lightdm]
$ ls -altr
total 76
-rw-------  1 root root     688 Jan 16  2015 x-2-greeter.log.old
-rw-------  1 root root     692 May  8 16:37 x-2-greeter.log
-rw-------  1 root root    1113 May  8 16:37 x-2.log
-rw-------  1 root root    1754 Aug 21 16:32 x-1-greeter.log.old
-rw-------  1 root root    2439 Aug 21 17:31 x-1.log.old
-rw-------  1 root root     137 Aug 22 14:56 x-0-greeter.log.old
-rw-------  1 root root    2439 Aug 22 15:46 x-1.log
-rw-------  1 root root    2439 Aug 22 15:47 x-0.log.old
-rw-------  1 root root   14760 Aug 22 15:47 lightdm.log.old
-rw-------  1 root root    1327 Aug 22 15:47 x-1-greeter.log
drwxrwxr-x 23 root syslog  4096 Aug 22 15:49 ..
drwxr-xr-x  2 root root    4096 Aug 22 15:49 .
-rw-------  1 root root    1754 Aug 22 16:21 x-0-greeter.log
-rw-------  1 root root    3795 Aug 22 16:21 x-0.log
-rw-------  1 root root    6537 Aug 22 16:21 lightdm.log

...and the most recent lines from lightdm.log:

[+1496.45s] DEBUG: User /org/freedesktop/Accounts/User1657021384 changed
[+1496.60s] DEBUG: User /org/freedesktop/Accounts/User1000 changed
[+1496.60s] DEBUG: User /org/freedesktop/Accounts/User1657022678 changed
[+1496.60s] DEBUG: User /org/freedesktop/Accounts/User1657021299 changed
[+1496.60s] DEBUG: User /org/freedesktop/Accounts/User1657093944 changed
[+1496.61s] DEBUG: User /org/freedesktop/Accounts/User1003 changed
[+1496.61s] DEBUG: User /org/freedesktop/Accounts/User1002 changed
[+1496.61s] DEBUG: User /org/freedesktop/Accounts/User1656893601 changed
[+1894.32s] DEBUG: Session pid=2096: Greeter closed communication channel
[+1894.32s] DEBUG: Session pid=2096: Exited with return value 0
[+1894.32s] DEBUG: Seat: Session stopped
[+1894.32s] DEBUG: Seat: Stopping; failed to start a greeter
[+1894.32s] DEBUG: Seat: Stopping
[+1894.32s] DEBUG: Seat: Stopping display server
[+1894.32s] DEBUG: Sending signal 15 to process 1848
[+1894.32s] DEBUG: Seat: Stopping session
[+1894.32s] DEBUG: Session pid=2531: Sending SIGTERM
[+1894.32s] DEBUG: Session pid=2531: Terminated with signal 15
[+1894.32s] DEBUG: Session: Failed during authentication
[+1894.32s] DEBUG: Seat: Session stopped
[+1899.32s] DEBUG: Sending signal 9 to process 1848
[+1899.47s] DEBUG: Process 1848 terminated with signal 9
[+1899.47s] DEBUG: DisplayServer x-0: X server stopped
[+1899.47s] DEBUG: Releasing VT 7
[+1899.47s] DEBUG: DisplayServer x-0: Removing X server authority /var/run/lightdm/root/:0
[+1899.47s] DEBUG: Seat: Display server stopped
[+1899.47s] DEBUG: Seat: Stopped
[+1899.47s] DEBUG: Required seat has stopped
[+1899.47s] DEBUG: Stopping display manager
[+1899.47s] DEBUG: Display manager stopped
[+1899.47s] DEBUG: Stopping daemon
[+1899.47s] DEBUG: Exiting with return value 1
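
For anyone repeating this troubleshooting, a minimal sketch (not from the thread) of a filter that reduces a syslog-style file to lightdm trap and termination messages. The function name is hypothetical, and the pattern is an assumption based only on the two kernel/init lines quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original notes): print only the lines
# where lightdm or its greeter trapped or terminated. Takes file names as
# arguments, or reads stdin when none are given.
lightdm_failures() {
    grep -E 'lightdm.*(trap|terminated)' "$@"
}

# Example against a live system (reading syslog may need elevated rights):
#   lightdm_failures /var/log/syslog /var/log/kern.log
```

The timestamps on the matching lines can then be compared against the `[+…s]` offsets in lightdm.log to line up the greeter crash with the daemon shutdown.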
stowler commented 9 years ago

FYI: will be committing today's (Sunday's) pano config and testing notes here:

https://github.com/CVNRneuroimaging/infrastructure/blob/master/config/notes/20150823-stowlerBircSystemConfig.md

stowler commented 9 years ago

FYI: will be committing today's (Monday's) pano config and testing notes here:

https://github.com/CVNRneuroimaging/infrastructure/blob/master/config/notes/20150824-stowlerBircSystemConfig.md

Opened grid engine ticket here: https://github.com/CVNRneuroimaging/infrastructure/issues/144

stowler commented 9 years ago

FYI: will be committing today's (Tuesday's) pano config and testing notes here:

https://github.com/CVNRneuroimaging/infrastructure/blob/master/config/notes/20150825-stowlerBircSystemConfig.md

stowler commented 9 years ago

Messaged Rob in the console ticket:

Hi Rob. If it's OK with you, later tonight I'm going to take over pano and rama until Thursday morning. I need them for Tuesday and Wednesday data reviews, and am going to start mount, read, and prep tonight.

stowler commented 9 years ago

No action needed, just FYI: this morning the 2TB drive that Keith mounted as /data/panolocal started making a sound akin to the "click of death", but more like intermittent short whines with that same spacing.

No data loss, as everything was backed up to hippoback. Maybe a random hiccup...reformatting as exfat.

(manufacture date 25 May 2012).