larsks closed 3 years ago
Move Ceph to a rack with backup power. @pjd-nu to finalize placement with NU ITS.
Will need to address #309 during the shutdown.
Data Center email:
During this interval, all equipment in the computer room must be powered down.
Estimated Start Time: Tuesday August 10, 2021 03:00 US/Eastern
Estimated End Time: Tuesday August 10, 2021 21:00 US/Eastern
If there are severe storms forecasted for August 10, we may need to postpone the maintenance day by a week, to August 17. We will provide one week of notice if this turns out to be necessary. Future maintenance days will occur in May/June when severe storms are less likely.
The following equipment will stay powered up during the maintenance interval:
MIT fiber network DWDM
UMASSNET DWDM
MGHPCC public internet access
Building wireless network
Key box in the front lobby
Phone service and cell phone repeater
Minimal lighting in the computer room
Communication
Three communication channels will open at 4AM:
Regular Status updates will be posted at portal.mghpcc.org
MGHPCC staff will be responding to messages on the MGHPCC shutdown Slack channel
MGHPCC staff will be present on an open Zoom session
Key events
Midnight – UPS units off line
3:00 AM — Everyone out of the computer room for the shutdown sequence
8:00 AM — People may return to the computer room
3:00 PM — Everyone out of the computer room for the startup sequence
9:00 PM — Resume normal facility operation
COVID Precautions
These will depend on the status of the pandemic, and will be included in future alerts.
@naved001 can we ensure that end user projects will NOT restart by default?
end user projects
do you mean openstack instances (VMs) in a project?
I think the instances will stay in the state users set them to. So, if an instance was off before the shutdown it will remain off, and if it was running at the time of shutdown it will automatically start.
@msdisme ...so we could e.g. explicitly shut down any running instances before powering things off, if you want to ensure they don't come back up automatically. It would then be up to the end user to power their instances back on.
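A rough sketch of what that explicit shutdown could look like with openstacksdk, in case it helps the discussion. This is not a script we have run: the cloud name `moc` is a hypothetical `clouds.yaml` entry, the function names are made up for illustration, and listing with `all_projects=True` assumes admin credentials. It defaults to a dry run and returns the IDs of instances that were running, so there is a record of what was stopped.

```python
def stop_running_instances(conn, dry_run=True):
    """Stop every ACTIVE instance; return the IDs that were running.

    `conn` is expected to be an openstacksdk Connection (or anything with
    the same compute.servers / compute.stop_server methods).
    """
    was_running = []
    for server in conn.compute.servers(all_projects=True):
        # Instances that users already shut off are left untouched.
        if server.status != "ACTIVE":
            continue
        was_running.append(server.id)
        if not dry_run:
            conn.compute.stop_server(server)
    return was_running


def main():
    # Imported here so the helper above can be exercised without the SDK.
    import openstack

    # "moc" is an assumed clouds.yaml entry name, not a confirmed one.
    conn = openstack.connect(cloud="moc")
    for server_id in stop_running_instances(conn, dry_run=True):
        print(server_id)
```

Calling `main()` only prints what would be stopped; passing `dry_run=False` is what actually issues the stop requests. Nothing here restarts instances afterwards, which matches the idea that users power their own instances back on.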
Data Center announcement is in a comment below - MOC buffers outages by a few days to enable resolution of issues that may occur.
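For reference, a quick calculation of how much buffer the MOC window leaves around the facility window, using only the start and end times stated in the two announcements in this thread. The variable names are mine, and all times are treated as US/Eastern wall-clock times.

```python
from datetime import datetime

# MOC outage window, from the email to MOC users.
moc_start = datetime(2021, 8, 9, 9, 0)
moc_end = datetime(2021, 8, 12, 17, 0)

# Facility maintenance window, from the data center email.
dc_start = datetime(2021, 8, 10, 3, 0)
dc_end = datetime(2021, 8, 10, 21, 0)


def hours(delta):
    """Convert a timedelta to hours."""
    return delta.total_seconds() / 3600


lead_in_hours = hours(dc_start - moc_start)   # time to prepare before power-down
facility_hours = hours(dc_end - dc_start)     # time the facility is down
recovery_hours = hours(moc_end - dc_end)      # time to bring services back up

print(lead_in_hours, facility_hours, recovery_hours)  # prints 18.0 18.0 44.0
```

So the plan leaves 18 hours of preparation before the power-down and 44 hours of recovery after the facility resumes normal operation.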
Email sent to MOC users
Hello Mass Open Cloud users,
Due to a scheduled power outage at the MGHPCC data center, MOC services will be unavailable from Monday, August 9, 2021 at 9AM through Thursday, August 12, 2021 at 5PM.
Please shut down your virtual machines, containers, and any bare metal systems by 9 AM on Monday, August 9, so that the Mass Open Cloud team may begin preparing for the outage. If you do not shut them down yourself, you run the risk of losing data.
The MOC has dependencies on several services which also run at the data center. Based on previous experience we recommend not scheduling critical events the week of August 9.
We will notify you when MOC services are available by updating the MOC website at https://massopen.cloud and by sending an email to this distribution list. Once services are back online it will be your responsibility to restart any virtual machines, containers, or other systems.
If you will need access to any MOC hosted data during this outage, please make sure to obtain copies of that data prior to Monday, August 9. During the outage, the data center will be completely without power and access to MOC hosted services will be impossible.
As always, if you have questions feel free to open a ticket at https://support.massopen.cloud. The ticketing system will be available throughout the outage.
Thanks,
Michael Daitzman
PS – If you were forwarded this email from a colleague and you would prefer to receive these notifications directly, you may sign up to the kaizen-users mailing list: https://mail.massopen.cloud/mailman/listinfo/kaizen-users
The MGHPCC will be conducting planned facility maintenance on Tuesday August 10, 2021.