ISISComputingGroup / IBEX

Top level repository for IBEX stories

SANS2D: stuck DAE #5023

Open ThomasLohnert opened 4 years ago

ThomasLohnert commented 4 years ago

On Friday night, the DAE on SANS2D became unresponsive. The symptoms included values on the dashboard not updating and users being unable to stop a run. The DAE logs showed NIVisa errors as described here. Seems like it might be the same issue that was observed on TOSCA recently (https://github.com/ISISComputingGroup/IBEX/issues/4851).

I tried restarting SECI initially to see if the end run option would become available, but no luck. I eventually resolved the issue by power cycling the DAE and restarting the VISA server as per the instructions on the wiki linked above. This resulted in losing ~4 hours' worth of user data.

We should also make sure the instructions on the wiki are expanded, e.g. to explain how to restart the VISA server. Minimal instructions with "ask X to find out more" should not be a thing in our troubleshooting pages.

KathrynBaker commented 4 years ago

It’s also worth noting that there is some information on the SharePoint Wiki/Knowledge Base that is appropriate for things like the DAE, as those items were valid under SECI too.

There’s also some detail in a presentation called ISISICP and DAE which is in the ICPDiscussions document store.

The comment regarding asking Freddie should probably be a ticket, as this may not be something we can resolve and it may require reporting/escalating to the DAE team.

ThomasLohnert commented 4 years ago

Good to know. Is there any reason we can't link resources on SharePoint in relevant places in the wiki? Having everything searchable in one place dramatically improves how quickly and easily you can find relevant information, which is important when you have users waiting for you to resolve their issues.

kjwoodsISIS commented 4 years ago

@ThomasLohnert - yes, we should add links to the SharePoint resources in the wiki. I do this regularly (see, for example, the SANS2D tickets - e.g. at the bottom of #4580).

KathrynBaker commented 4 years ago

What @kjwoodsISIS said: the only reason it isn't linked is that no one has done so yet. (It was SECI support and recalling previous discussions that meant I knew to go looking in SharePoint.)

ChrisM-S commented 4 years ago

Just some thoughts, definitely good to do where possible but:

Please do remember not to identify in the text, or make any direct links to, anything which might present a public hacking target to sensitive personal or commercial information, though, even if it is not publicly visible.

(I would think that) things like manuals and instructions should normally be things that could be linked directly, but sometimes it may still be better to name the document and link to the library containing it. Word, PowerPoint and PDF files, for example, may be more fiddly to open off the web on some systems (mobiles?), and it can be harder to retrace your steps in SharePoint if you only have the final item link and not the container. Also, if new versions appear, it may be easier to find the new version if the link lets you see the location of the file in context, and the link will likely be more robust over time.

Can the link checker on the wiki validate links in SharePoint?

FreddieAkeroyd commented 4 years ago

Maybe we should consider having our troubleshooting guide in Teams/SharePoint and then linking to relevant parts of the developer wiki from there where necessary?