bcssov / IronyModManager

Mod Manager for Paradox Games. Official Discord: https://discord.gg/t9JmY8KFrV
https://bcssov.github.io/IronyModManager/
MIT License
356 stars 41 forks source link

IMM Reaches 100% on Conflict Solver, and hangs (Linux) #177

Closed lxaphir closed 3 years ago

lxaphir commented 3 years ago

Describe the bug IMM passes all processing steps after Conflict Resolver button is hit, but never makes it off the shaded 100% progress screen. Otherwise it loads Stellaris normally/successfully and will load the mods as expected. Everything works OK except end of Conflict Resolver process.

To Reproduce Need Tumbleweed installation with working Steam and Stellaris, then run IMM. For proper problem replication you would need this system I am working from.

Expected behavior Just need to make it past the 100% screen to conflict resolution panes.

Screenshots If applicable, add screenshots to help explain your problem. http://www.enlightenment.org/ss/e-5fead80b079460.72421673.jpg

Logs It does not crash, is very stable but I can't get clues on the 100% screen hang. I would be glad to examine the logs for this app if I knew the location of the logs. Wasn't able to find them when I looked

Version:

Additional context I believe that Irony is attempting to finalize it's processing using a command normally available in Linux, but which might not be available in this minimalist install. It would be helpful to me to know what Irony is doing when it reaches the end of it's Conflict Resolver processing.

Compression? Copy? Parsing? Can we check logs to see what it is doing?

bcssov commented 3 years ago

Irony is deleting and verifying that files in the patch match the latest data. Checking whether that things like file hashes match, files exist and is dumping files that are "auto" resolved. Basically lots of going on in the irony patch directory.

Since Irony is not crashing there will be no logs and only way is to setup a system using the configuration you mentioned to see why this is happening. Given that I've tested already Irony (including conflict solver) on 4-5 various distros now testing on another one is too much for me. I know it's not what you wanted to hear but I have no time to debug another distro.

lxaphir commented 3 years ago

What is it using to check file hashes?

lxaphir commented 3 years ago

Also, just fyi -- If you read my post I've made it clear that I'm looking for log information. Not really asking you to do anything on your part (like test on a distro), because as a break fix engineer I know it's going to be next to impossible to duplicate the conditions, which is why I said what I said in the issue post. I'd really like to fix this myself. But for that, I do need log information. As break fix engineers we rely on log info to debug and fix things all the time, but it's difficult to do with an application that does not log itself. But very few applications do not log themselves; they are in the minority; probably because in the real world no break fix engineers or software devs can debug the app without knowing what it is doing. So my ask is this; I need this app to log itself. That's so we can fix it and you don't have to. I realize this is the only bug post on the site so far and it's not a shiny feature request, but as the only bug post so far on this project is it possible to get a log.

bcssov commented 3 years ago

Irony is using MetroHash (https://github.com/jandrewrogers/MetroHash) which is a .NET Core native library. No external dependencies are needed as the app itself is self contained and was compiled as such so you don't even need net core installed on Linux. The only dependency that might be needed in Linux is maybe xdg as it asks it to resolve environment variables and launch external processes.

And as I already told you if there are no crashes Irony does not log in depth anything (as many windows apps do actually). What I do log are actual exception logs and I rely on these to fix bug reports. There have been a couple of reported freezes in some Linux distros and all problems with them are tied to upstream issues related to Avalonia and underlying library that Avalonia uses (which you most likely cannot fix yourself). If you really want its trace info you can check the nlog config and set debug info to trace or info which will then make Irony write its output to logs folder. As for Irony logging it will not dump its info via trace as I never implemented an interceptor which dumps trace or info to the logs only exceptions as I mentioned.

Anyway for more info as to what is happening is that Irony is finishing the Syncing Patches in the service layer https://github.com/bcssov/IronyModManager/blob/495ae10ed101d1c36c7bab4d7e9bc7df3ef7ff31/src/IronyModManager.Services/ModPatchCollectionService.cs#L1191-L1206 and ViewModel thread awaiter is signaling via MessageBus that it's done and Irony can skip to the next step which is to pretty much switch the view which in your case is not happening https://github.com/bcssov/IronyModManager/blob/495ae10ed101d1c36c7bab4d7e9bc7df3ef7ff31/src/IronyModManager/ViewModels/Controls/ModHolderControlViewModel.cs#L407-L423 https://github.com/bcssov/IronyModManager/blob/495ae10ed101d1c36c7bab4d7e9bc7df3ef7ff31/src/IronyModManager/ViewModels/MainWindowViewModel.cs#L229-L248 The problem for you lies somewhere in the upstream which does not trigger when I'm asking Avalonia to show the next view.

lxaphir commented 3 years ago

"There have been a couple of reported freezes in some Linux distros and all problems with them are tied to upstream issues related to Avalonia and underlying library that Avalonia uses"

What's the name of the Avalonia library that the freeze is attributed to?

bcssov commented 3 years ago

Don't remember. The speculation that Avalonia is to blame here is merely a speculation; until I get around to it sometime to check it out. Don't know when that will be but I will keep this ticket open.

lxaphir commented 3 years ago

bcssov, here is the nlog config file. How do we set debug info to "trace" or "info"?

https://susepaste.org/29747844

bcssov commented 3 years ago

https://github.com/NLog/NLog/wiki/Configuration-file#log-levels https://github.com/bcssov/IronyModManager/blob/master/src/IronyModManager/nlog.config

Though if nothing is logged let me know in another ticket, I might have broken something in recent updates. Haven't tested trace\info logging in months in release build configuration.

lxaphir commented 3 years ago

Hey bcssov, thanks for the nlog info. I need to let you know that the Merge function works; I'm using the mod it created currently with no problems. The mod size that it created was 1020 mb. Before I move on to nlog activity, I have some strace output for you. The following output was during a 100% message with the window hung. The command was run as follows:

"strace -f --status=failed ./IronyModManager | uniq"

Uniq eliminates duplicates. In this case, with the above qualifiers, strace only outputs system calls that have failed. This is a small sample grab; fyi, there was a very large number of futex errors:

[pid 141] futex(0x7f26b8002970, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=134355571}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 162] futex(0x7f2684004170, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=136920505}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=139049769}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba00, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=143594294}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=145939364}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=148054691}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 141] futex(0x7f26b8002970, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 141] futex(0x7f26b80029c0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=164532211}, FUTEX_BITSET_MATCH_ANY) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba00, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=150743339}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=153067690}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=155201552}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 162] futex(0x7f2684004170, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=157553274}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=159643594}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=161867045}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=163956012}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=166255927}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=168521447}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=170787468}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 138] futex(0x7f2710031600, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=179280507}, FUTEX_BITSET_MATCH_ANY) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba00, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=173473801}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=175652607}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=177978832}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=180085563}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 162] futex(0x7f2684004170, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 138] futex(0x7f2710031600, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=196375489}, FUTEX_BITSET_MATCH_ANY) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba54, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=182573021}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 141] futex(0x7f26b8002970, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=184971732}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) [pid 109] futex(0x564dfc36ba50, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=768, tv_nsec=187152452}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)

I also ran an error summary with strace which reports the total number of errors per call after the program is terminated. Note the large number of recvmsg errors in this summary. Also 18000+ read errors:

adm0@steam0:~/bin/irony> strace -c --summary-columns=error ./IronyModManager strace: Process 289 detached errors syscall


      epoll_wait
63721 recvmsg
   45 lstat
18242 read
  381 futex
      mprotect
      write
  184 access
      madvise
      close
  778 openat
      writev
      mmap
      poll
      fstat
  125 stat
      brk
      munmapin the prior command line.
      ioctl
      getpid
      getrandom
   41 readlink
      ftruncate
      sysinfo
      membarrier
      fcntl
      clone
      pread64
      flock
      rt_sigprocmask
      getdents64
      lseek
    2 recvfrom
      geteuid
      sched_setscheduler
      fadvise64
      fstatfs
    6 rt_sigreturn
      sched_yield
      msync
    2 connect
      socket
      getuid
      uname
      prlimit64
      sched_getaffinity
      pipe2
      rt_sigaction
      sendto
      getgid
      getegid
      tgkill
      memfd_create
      pipe
      sendmsg
      bind
      listen
      getpeername
      execve
      getcwd
      chdir
    2 unlink
      fchmod
      getsid
      sigaltstack
      mknod
      statfs
    1 arch_prctl
      gettid
      set_tid_address
      epoll_ctl
      set_robust_list
      epoll_create1

83530 total
lxaphir commented 3 years ago

Nlog report -- the nlog.config that I have is identical to the one you posted, however I am getting no log files generated in ~/bin/irony (which I assume is the basedir if ironymodmanager is launched from there). If the logger name rules are set to "Error" as they seem to be, we should be getting something.

bcssov commented 3 years ago

It could still be anything from a managed C# code coming from either Irony or Avalonia; or an Avalonia wrapper code in C# around native C++ code.

Nlog report -- the nlog.config that I have is identical to the one you posted, however I am getting no log files generated in ~/bin/irony (which I assume is the basedir if ironymodmanager is launched from there). If the logger name rules are set to "Error" as they seem to be, we should be getting something.

No they are not the same. I linked the log levels Nlog configuration to you to study to see which error log level you want and a development version of Irony nlog conflg (which is different than the release version of Irony).

bcssov commented 3 years ago

How much RAM do you have exactly? On what kind of hardware is Irony running?

Had a friend with some spare time who quickly setup a clean bare bones build with latest updates applied with I believe OpenSUSE Tumbleweed + KDE Plasma in an effort to reproduce the problem. To see whether there's something in .NET Core that does not agree (or Avalonia) with this specific build. His quick test passed with no issues (though I guess that should not surprise you).

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

lxaphir commented 3 years ago

Hey bcssov memory is not the problem. The hardware is an Acer Swift 3 with AMD Ryzen 7 2700U and RX540 graphics. Your app is running by itself and consumes no more than 15% memory while running Conflict Solver. I am including a screenshot of top with the memory consumption:

https://imagebin.ca/v/5nDh5SFWKKsK

I'm running more diagnostics to trace library calls.

lxaphir commented 3 years ago

Btw thanks for the upgrade to the application. I'm running tests on that now. Also, thank you for the tests you and friend conducted on a fresh install of Tumbleweed; that effort is much appreciated.

I will figure this out. It might take a while but having you as occasional consult is invaluable.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

bcssov commented 3 years ago

I plan on converting this ticket into a discussion but cannot at the moment as that function is unavailable. It's some kind of GitHub bug so the feature was disabled at the moment.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

lxaphir commented 3 years ago

Thanks bcssov for trying to get this converted into a discussion, that would be helpful

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.