microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License
3.03k stars, 399 forks

Enhancement: Show console error output on :19080 monitor page. #909

Open qrli opened 7 years ago

qrli commented 7 years ago

The <SetupEntryPoint> and <EntryPoint> may hit errors that the application cannot log, because they occur before the application initializes its logging component. These errors are only captured by Service Fabric and written to console logs if enabled. While we can remote desktop to the VM and find the log, that is really time-consuming on a slow connection, only to find small errors. This is especially annoying when you have many teams deploying to the same cluster; many people need to remote into the VMs.

It would be far better if the monitor page simply showed the reason (some piece of the stderr output) rather than only the process return code, or provided a download link for the console logs.
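To illustrate the kind of output being asked for, here is a minimal Python sketch, not Service Fabric code. It assumes a hypothetical wrapper that runs an entry point command, keeps the last few lines of its stderr, and builds a one-message summary alongside the return code. The function names `run_and_capture_stderr_tail` and `format_failure` are made up for this example.

```python
import subprocess
from typing import List, Tuple


def run_and_capture_stderr_tail(cmd: List[str], max_lines: int = 20) -> Tuple[int, str]:
    """Run cmd and return (exit code, last max_lines lines of its stderr).

    Hypothetical helper for illustration; Service Fabric does not expose this.
    """
    result = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    lines = result.stderr.splitlines()
    # Guard against max_lines <= 0, where lines[-0:] would return everything.
    tail = lines[-max_lines:] if max_lines > 0 else []
    return result.returncode, "\n".join(tail)


def format_failure(name: str, code: int, stderr_tail: str) -> str:
    """Build a message that carries the reason, not just the return code."""
    message = f"{name} failed with exit code {code}"
    if stderr_tail:
        return f"{message}:\n{stderr_tail}"
    return f"{message} (no stderr output)"
```

A monitor page or health message could then show the text from `format_failure` instead of only "entry point failed".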

mikkelhegn commented 7 years ago

@qrli: seems some of your feedback was cut in editing?

qrli commented 7 years ago

@MikkelHegn Not caused by me, but by GitHub. It seems GitHub has started to filter out <...>...

masnider commented 7 years ago

@qrli Most of these should be showing up as health messages, which are visible in SFX. If they're not, can you give us an example of such an error and the log that helped you figure it out?

qrli commented 7 years ago

@masnider Thanks for looking at this. The health messages include a good amount of info, but not for guest executables or setup entry point scripts. There I can only see that an entry point or setup entry point failed, not why.

E.g., I have a setup entry point script that works locally but fails on Azure. One common cause for us is that we have several clusters created at different times with different versions of ARM and VM setup script, so the environments may be slightly different. Then we have to remote desktop to the cluster (which requires asking for access permission) and look at the captured console error logs. The error may be a missing dependency, an incorrect or expired certificate, a failure to connect to a database or other service, a conflict with another application, etc.

mikkelhegn commented 7 years ago

@qrli: @cwe1ss wrote a good SO response to this here: https://stackoverflow.com/questions/37885213/azure-service-fabric-activation-error

qrli commented 7 years ago

@MikkelHegn That's not a solution but more of an example of this issue. Having to RDP to the VM to see the log is exactly what we want to avoid. When you have many teams, you will surely want to control access to the VMs, to keep somebody from messing up the environment. But for now, each team has to RDP to the VMs just to view the console error logs. It would be much easier if the health message included the console error output instead of just "entry point failed to start".

mikkelhegn commented 7 years ago

@qrli I agree. The intention was to give a pointer to help you with the issue at hand. We know this is an enhancement we need to make to the product / tooling around SF.

craftyhouse commented 3 years ago

@jeffj6123 do you know if we bubble this info up into SFX now?

jeffj6123 commented 3 years ago

@craftyhouse we would likely need to ask hosting how to capture this information and bubble it up.