geoschem / GCST-internal

Dummy repo for GCST project management

geoschemdata.computecanada.ca is down #15

Closed LiamBindle closed 3 years ago

LiamBindle commented 3 years ago

Killian reported that geoschemdata.computecanada.ca is down.

I will look into this and post updates here.

LiamBindle commented 3 years ago

I have followed up with our contacts at ComputeCanada.

kilicomu commented 3 years ago

Hey @LiamBindle! I'm sure you've got the right URL behind the scenes - it's https://geoschemdata.computecanada.ca/ as opposed to .com

LiamBindle commented 3 years ago

Whoops! Haha, thanks for catching that @kilicomu. That was a typo.

LiamBindle commented 3 years ago

Our contact at ComputeCanada said he recently rebooted the VM. He said it's up and running but slow to respond because there are a number of IPs continuously downloading data. He said the unresponsiveness is consistent with their expected usage.

@kilicomu Are you able to try it now?

I'm still having trouble getting to it, but @Jun-Meng says he's able to see it right now.

Does anyone know if this unresponsiveness has been an issue in the past?

kilicomu commented 3 years ago

@LiamBindle I'm intermittently getting HTTP request timeouts and long (several minute) response times.

I've never noticed this unresponsiveness in the past, and I semi-regularly download varying quantities of data through that URL. There are times when transfer bandwidth fluctuates, sure, but I can't remember not being able to get a response.

Either way, I'm not sure that their "expected usage" being "users unable to get an HTTP response for the top-level directory index" is a reasonable level of expectation...

LiamBindle commented 3 years ago

@kilicomu I agree it isn't a reasonable level of expectation.

For me the server appears to be a lot more responsive this morning. Are you still experiencing issues?

If you are, then I'll push a bit harder and ask if they can investigate further. If you aren't, then I'd suggest we keep an eye on it and raise it if we see it again. Sorry, I know it's a bit unsatisfying, but I don't have much control beyond emailing our point of contact. What do you think?

kilicomu commented 3 years ago

It feels a little better today in that requests don't seem to be timing out and are completing in < 60s. I'm not exactly robustly stress-testing the server, though (I figure that might be putting gas on the fire).
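For anyone who wants to keep an eye on this without adding load, here is a minimal sketch of a single-request probe. It is only an illustration: the `probe` helper and its `opener` parameter are hypothetical names, the 60 s timeout just mirrors the figure mentioned above, and it makes one GET for the top-level index rather than anything resembling a stress test.

```python
import time
import urllib.error
import urllib.request

# URL taken from this thread; everything else here is an assumption.
URL = "https://geoschemdata.computecanada.ca/"


def probe(url, timeout=60.0, opener=urllib.request.urlopen):
    """Make one GET request and return (status, elapsed_seconds).

    status is the HTTP status code, or None if the request timed out
    or the connection failed. `opener` is injectable so the function
    can be exercised without touching the network.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            resp.read(1)  # wait until the first byte of the body arrives
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # server answered, but with an error status
    except OSError:
        # Covers URLError, socket timeouts, and connection failures.
        status = None
    return status, time.monotonic() - start


# Example use (performs a single real request):
# status, elapsed = probe(URL)
# print(f"status={status} elapsed={elapsed:.1f}s")
```

Running the commented-out lines occasionally (for example, from a cron job at a low frequency) would give a rough record of whether requests are timing out or completing, without contributing meaningfully to the load.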

I agree with keeping an eye on it. I can't recall it happening before, so hopefully it's just a transient blip.

LiamBindle commented 3 years ago

Glad it's at least responding now. Yeah, let's keep an eye on it.