geneontology / amigo

AmiGO is the public interface for the Gene Ontology.
http://amigo.geneontology.org
BSD 3-Clause "New" or "Revised" License

Runtime errors today #466

Closed by ukemi 6 years ago

ukemi commented 6 years ago

We are getting a lot of runtime errors today when searching AmiGO2.

kltm commented 6 years ago

It looks like production has been fluttering a bit. Hopefully it has stabilized; it seems to work now anyways. We're trying to get things ticking over here so we can move the systems over.

kltm commented 6 years ago

Well...it may be that we are production at the moment.

@stuartmiyasato Just to check here, it looks like amigo.geneontology.org is now resolving to nakama at Berkeley. Is this a temporary measure for the fluttering or should we be treating it as something more permanent?

stuartmiyasato commented 6 years ago

@kltm I would like it to be more permanent, but you know... :) When it does this, it basically means the production web servers need to be restarted. I'm doing that now. But as you're noticing, I'm spending a lot less time monitoring these systems now -- I have my hands full with other tasks and unfortunately GO is dropping near the bottom of that list now...

kltm commented 6 years ago

@stuartmiyasato No problem and very understandable--we thought that a switch might have happened at some point and we didn't notice or something. For progress, we're just trying to get the indexes independently built now as part of the pipeline and should be able to release you soon. We are finally looking at the timeline in weeks.

stuartmiyasato commented 6 years ago

More relevant to the production behavior... Back a year or two ago when Stanford had a data-center wide power shutdown, we set up DNS for amigo.geneontology.org to have a failover entry. When a healthcheck to the Stanford load balanced address failed, DNS dynamically changed to point amigo.geneontology.org to nakama. After the power maintenance was completed, I just kept that DNS configuration in place in case the Stanford servers ever stopped responding, like what happened earlier today.
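
For anyone reading along, the failover logic described above can be sketched roughly like this. It is only an illustration, not the actual DNS provider configuration: the hostnames `PRIMARY` and `FAILOVER`, the `FailoverRecord` class, and the HTTP-based `http_health_check` are all hypothetical, and the switch back to the primary once it is healthy again is an assumption about how such a setup usually behaves.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable

# Hypothetical placeholder targets; the real addresses are not given in this thread.
PRIMARY = "stanford-lb.example.org"
FAILOVER = "nakama.example.org"


@dataclass
class FailoverRecord:
    """A DNS name with a primary target and a fallback target."""
    name: str
    primary: str
    failover: str

    def resolve(self, is_healthy: Callable[[str], bool]) -> str:
        # Point at the primary while its health check passes,
        # otherwise point at the failover target.
        return self.primary if is_healthy(self.primary) else self.failover


def http_health_check(host: str, timeout: float = 5.0) -> bool:
    """Assumed check: an HTTP GET that must answer with a status below 500."""
    try:
        with urllib.request.urlopen(f"http://{host}/", timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # The server answered; only 5xx responses count as unhealthy here.
        return err.code < 500
    except OSError:
        # Connection refused, DNS failure, timeout, and similar.
        return False
```

In use, a monitor would call `FailoverRecord("amigo.geneontology.org", PRIMARY, FAILOVER).resolve(http_health_check)` on a schedule and update the record whenever the returned target changes. Passing the check in as a function keeps the decision logic testable without touching the network.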

kltm commented 6 years ago

Ooo, nice! I hadn't realized it was an automatic failover. That's good to know in case something like this happens again in the next few weeks. We had a bit of a scramble earlier when we realized we were "live" over here, trying to figure out if this was our world now :)

kltm commented 6 years ago

Remainder from a different world.