CCI-MOC / ops-issues


MGHPCC shutdown coming up on Aug 10 #279

Closed larsks closed 3 years ago

larsks commented 3 years ago

The data center announcement is in a comment below. The MOC pads the outage window by a few days to allow time to resolve any issues that come up.

Email sent to MOC users

Hello Mass Open Cloud users,

Due to a scheduled power outage at the MGHPCC data center, MOC services will be unavailable from Monday, August 9, 2021 at 9 AM through Thursday, August 12, 2021 at 5 PM.

Please shut down your virtual machines, containers, and any bare metal systems by 9 AM on Monday, August 9, so that the Mass Open Cloud team may begin preparing for the outage. If you do not shut them down yourself, you run the risk of losing data.

The MOC has dependencies on several services which also run at the data center. Based on previous experience we recommend not scheduling critical events the week of August 9.

We will notify you when MOC services are available by updating the MOC website at https://massopen.cloud and by sending an email to this distribution list. Once services are back online it will be your responsibility to restart any virtual machines, containers, or other systems.

If you will need access to any MOC hosted data during this outage, please make sure to obtain copies of that data prior to Monday, August 9. During the outage, the data center will be completely without power and access to MOC hosted services will be impossible.

As always, if you have questions feel free to open a ticket at https://support.massopen.cloud. The ticketing system will be available throughout the outage.

Thanks,

Michael Daitzman

PS – If you were forwarded this email from a colleague and you would prefer to receive these notifications directly, you may sign up to the kaizen-users mailing list: https://mail.massopen.cloud/mailman/listinfo/kaizen-users

The MGHPCC will be conducting planned facility maintenance on Tuesday August 10, 2021.

pjd-nu commented 3 years ago

move Ceph to rack with backup power. @pjd-nu to finalize placement with NU ITS.

msdisme commented 3 years ago

Planned mail being written here - will flag folks for review.

naved001 commented 3 years ago

Will need to address #309 during the shutdown.

msdisme commented 3 years ago

Data Center email:

During this interval, all equipment in the computer room must be powered down.

If there are severe storms forecasted for August 10, we may need to postpone the maintenance day by a week, to August 17. We will provide one week of notice if this turns out to be necessary. Future maintenance days will occur in May/June when severe storms are less likely.

The following equipment will stay powered up during the maintenance interval:

Communication

Key events

COVID Precautions

These will depend on the status of the pandemic, and will be included in future alerts.

msdisme commented 3 years ago

@naved001 can we ensure that end user projects will NOT restart by default?

naved001 commented 3 years ago

end user projects

do you mean openstack instances (VMs) in a project?

I think instances will stay in the state users set them to. So, if an instance was off before the shutdown it will remain off, and if it was running at the time of shutdown it will automatically start.

larsks commented 3 years ago

@msdisme ...so we could e.g. explicitly shut down any running instances before powering things off, if you want to ensure they don't come back up automatically. It would then be up to the end user to power their instances back on.
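As a rough illustration of the "explicitly shut down any running instances" idea, here is a minimal Python sketch. It is not the procedure the MOC team actually used. The function name `stop_running_instances`, the dict shape, and the injected `stop` callable are all hypothetical, chosen so the logic can be tried without a cloud connection. The comments show how it might be wired to openstacksdk, assuming admin credentials and a `clouds.yaml` entry named `moc` (also an assumption).

```python
from typing import Callable, Iterable, List, Mapping


def stop_running_instances(
    servers: Iterable[Mapping[str, str]],
    stop: Callable[[str], None],
) -> List[str]:
    """Call stop() for every server whose status is ACTIVE.

    Returns the IDs that were stopped, so operators have a record of
    which instances were running before the outage.
    """
    stopped = []
    for server in servers:
        if server["status"] == "ACTIVE":
            stop(server["id"])
            stopped.append(server["id"])
    return stopped


# Possible wiring against a real cloud (assumption: openstacksdk is
# installed and a cloud named "moc" exists in clouds.yaml):
#
#   import openstack
#   conn = openstack.connect(cloud="moc")
#   servers = [{"id": s.id, "status": s.status}
#              for s in conn.compute.servers(all_projects=True)]
#   stop_running_instances(servers, conn.compute.stop_server)

if __name__ == "__main__":
    # Dry run with made-up instance records; nothing is contacted.
    demo = [
        {"id": "vm-1", "status": "ACTIVE"},
        {"id": "vm-2", "status": "SHUTOFF"},
        {"id": "vm-3", "status": "ACTIVE"},
    ]
    print(stop_running_instances(demo, lambda server_id: None))
    # prints ['vm-1', 'vm-3']
```

Because the instances are stopped on purpose, they would be in the off state when power is cut, so under the behavior described above they should stay off until their owners start them again.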