laurent22 / joplin

Joplin - the secure note taking and to-do app with synchronisation capabilities for Windows, macOS, Linux, Android and iOS.
https://joplinapp.org

Desktop: Joplin Freezing During Syncing and Decrypting On Linux Kernel 5.5+ #2518

Closed: dimyself closed this issue 4 years ago

dimyself commented 4 years ago

I started using Joplin recently, and ever since it has been locking up, freezing, and becoming unresponsive. It seems to get stuck in a syncing loop: it tries to sync, and something prevents the sync from stopping or being cancelled. Even if I manually click Cancel while it's syncing, it just says "cancelling" and keeps spinning but won't stop.

At this point, when I click on any notebooks/notes, nothing happens. The notes don't load, the screen doesn't refresh to the note I click on. The only thing that works at this point is I can click and open menus/settings.

I then have to kill the app and relaunch it.

Is anyone else having issues on Linux with Joplin being buggy and freezing? I'd really like to resolve this so I can use Joplin! I'm not sure whether this is something on my system or whether others on Linux are seeing it too. It's pretty much unusable for me at this point! The bug also happens even if I don't click on Sync: I will come back to Joplin to make a new note, and it will already be in this stuck sync state on its own without me doing anything.

It is a major bug on my system. It happens frequently and I can reproduce it easily.

Environment

Joplin version: Joplin 1.0.179 (prod, linux); Sync Version: 1; Revision: b4e325d (master)
Platform: Arch Linux
OS specifics:

Steps To Reproduce

  1. I can launch Joplin, and to reproduce, I simply click on Sync in the lower left corner a few times, and it gets stuck in the sync loop error/bug.
  2. It happens every time. I have to kill/relaunch at this point.

Describe what you expected to happen:

Logfile

Console shows normal activity before the problem when clicking on a different note: webview_domReady Connect {props: {…}, context: {…}, refs: {…}, updater: {…}, version:

Then when I initiate the bug by clicking on sync several times, nothing shows up in console whatsoever. It only has the last reported event from before the bug.

Here is my updated log.txt file: https://pastebin.com/CDdhuL25

Whenever I initially launch Joplin in debug mode, I get these messages in the console (in case they're important). console.log file: https://pastebin.com/zzjdguTX

bedwardly-down commented 4 years ago

This issue seemed to crop up immediately after upgrading to Linux Kernel 5.5 in Artix Linux (an Arch offshoot). Since you, me and another person are all having similar issues on Arch or Arch based systems, I've decided to do an experiment and see if downgrading to the Linux LTS kernel (5.4.19) might solve the issue since I was also noticing some network issues along with other things since upgrading.

dimyself commented 4 years ago

Here's a video showing the problem if it helps!

https://youtu.be/sdpI4kBIaUY

dimyself commented 4 years ago

@bedwardly-down does the video look similar to your issue?

Did the LTS kernel change anything?

bedwardly-down commented 4 years ago

@dimyself the video looks exactly like what I'm experiencing (minus that console log output). I have debugging turned on and have a Kitty terminal running tail -f $XDG_CONFIG_HOME/joplin-desktop/log.txt just to the left of the bottom part of the console log. Other than that, it's definitely the same freezing issue shown.

[Screenshot: 2020-02-08-193241_1920x1080_scrot]

To know if it's a kernel issue, I'd need to use Joplin extensively on it for a few days since the issue wasn't frequent enough for me to really test it out but enough to be a bit annoying. Ha

Also, my screenshot was from a failed attempt at getting a task done here that involved loading icons in the sidebar for all Notebooks. I got very close to finishing it, but couldn't get it to production level without spending a massive amount more of resources that were wearing thin. Ha

Soltares commented 4 years ago

I thought at first it was just me experiencing this issue, so I was hesitant to file a bug; however, the video and description match what I see as well on Arch 5.5.2-arch1-1 / Joplin 1.0.179-1. I am using a filesystem sync target on a remote system via SMB. I switched to NFSv4, but there was no noticeable impact.

My Windows 10 instance of Joplin is working as expected on the same filesystem target, yet it's lagging behind a bit at version 1.0.175. I've been nervous to upgrade as I use it frequently during work. (Thanks by the way for such a useful tool!!)

Note: 479/479
Folder: 19/19
Resource: 158/158
Tag: 0/0
NoteTag: 0/0
MasterKey: 0/0
Revision: 366/366
Total: 1022/1022

Here is a short clip of what I see. Not much different from the youtube video other than I generally see the issue during my second sync and the Synchronisation Status is blank during the sync.

[Clip: Joplin-Sync-Stuck-2020-02-18-min]

hpfmn commented 4 years ago

I have the same issue also with the newer 1.0.184 version of joplin. I'm not sure if it only happens during sync though. But seems like it is more likely during sync.

bedwardly-down commented 4 years ago

I know that 1.0.184 is definitely not recommended for daily use and has quite a few (albeit small) areas where it can and has broken for me, but are you also on an Arch-based distro, @hpfmn ? My main view is that we all have that one factor in common which means that we all share the same kernel which may not load the same network drivers but more than likely would share the same protocols and whatnot. Ha

hpfmn commented 4 years ago

@bedwardly-down yes, I'm also running Arch. Is there any information on which versions are considered stable and which are considered unstable? 1.0.179 has a text on the GitHub release; is that what says it is stable? I honestly don't think it is related to network drivers ;)

hpfmn commented 4 years ago

What is maybe different for me - if I wait a long time (like several minutes) it does recover and starts to be usable again. But some of the changes I made to the current document are discarded.

bedwardly-down commented 4 years ago

I don't think it's network drivers either. I was referring to the fact that none of us would be running the same network drivers, but all of us would have similar if not the same network protocol implementations built into the kernel. When I say network protocols, I mean how the kernel handles things like https and interfaces with each of the various network drivers to allow them to function. If there was a major change there, it could affect how Joplin handles syncing and could be an upstream bug in whichever module is used to handle it. I wish some Ubuntu / Debian and Fedora users would chime in so that the issue can be classified as a Linux-wide one and not Arch-specific. Ha

Also, the stable ones are linked in the forums and have a Green Release tag on github. All other releases are draft ones not meant to be used as a daily driver but instead a bleeding edge test (minus the Red Pre-release ones; those are for testing but are meant to be a stop gap between main stable releases)

BlissfulTarpon commented 4 years ago

Just wanted to let you know I am experiencing the same thing (#2507). I thought it had something to do with syncthing at first but I can see from reading the thread that you're all using different sync protocols. This is making it really hard to work for a long time because you never know what the program will discard. Reliability is zero at the moment.

I disabled syncthing on both computers and the problem persists.

bedwardly-down commented 4 years ago

Thanks @matcharles. I use Joplin for my budgeting and daily journaling and can definitely say that this issue has caused some similar headaches for me. Luckily, it hasn't been fatal for my uses, but I can definitely see it being terrible.

I'm definitely beginning to think this is an arch / kernel specific bug. Most distros won't be running 5.5 yet unless the user explicitly installs it themselves or it offers some drastically needed features that something like Ubuntu would jump right on board with. @laurent22 , what are the modules you use for syncing in Joplin? I'd like to see if my hunch or something similar is correct.

laurent22 commented 4 years ago

It's mostly node-fetch and sqlite3 that are involved in syncing, but it could also be due to some complicated interaction between Electron and those modules.

If sync status is blank in particular, it could mean that the sqlite database is locked or the app can't read from it for some reason. While it's in this state, are you able to open database.sqlite (with an sqlite browser) in your profile directory?
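
For anyone without an SQLite browser at hand, the same check can be scripted. The following is only a minimal sketch: the helper name probe_database is hypothetical, and the profile path ~/.config/joplin-desktop/database.sqlite is assumed to be the default Linux location. It opens the file read-only with a short timeout and runs a trivial query, so a lock that blocks readers should surface as an error rather than a hang.

```python
import sqlite3
from pathlib import Path

# Assumed default profile location on Linux; adjust for your setup.
DEFAULT_DB = Path.home() / ".config" / "joplin-desktop" / "database.sqlite"


def probe_database(db_path, timeout_s=2.0):
    """Return (ok, detail); ok is True if a trivial read succeeds."""
    # mode=ro opens the file read-only, so the probe cannot modify the profile.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True, timeout=timeout_s)
    except sqlite3.Error as exc:
        return False, f"cannot open: {exc}"
    try:
        (count,) = conn.execute("SELECT count(*) FROM sqlite_master").fetchone()
        return True, f"readable, {count} schema objects"
    except sqlite3.Error as exc:
        # A lock that blocks readers surfaces here, typically as "database is locked".
        return False, f"cannot read: {exc}"
    finally:
        conn.close()
```

Calling probe_database(DEFAULT_DB) while the sync spinner is stuck would report whether the file can still be read from outside the app. A successful read does not rule out contention inside the app itself.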

Soltares commented 4 years ago

The joplin-desktop database.sqlite appears readable while sync is spinning and sync status is blank:

[chris@spire ~]$ sqlite3 ~/.config/joplin-desktop/database.sqlite
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
sqlite> .tables
alarms                 notes                  resources_to_download
deleted_items          notes_fts              revisions            
folders                notes_fts_docsize      settings             
item_changes           notes_fts_segdir       sync_items           
key_values             notes_fts_segments     table_fields         
master_keys            notes_fts_stat         tags                 
migrations             notes_normalized       tags_with_note_count 
note_resources         resource_local_states  version              
note_tags              resources            
sqlite> select count(*) from notes;
479

bedwardly-down commented 4 years ago

Looks like node-fetch has quite a few open http protocol issues that may be similar to what's happening here: https://github.com/node-fetch/node-fetch/issues

I'm out working right now, but my brief 5.4 kernel tests haven't shown any issues so far, though they could still pop up later. If anyone else wants to test that theory too, all you'll need to do is install the linux-lts and linux-lts-headers packages, reconfigure whatever boot loader you're using to boot that kernel, and you should be good to go.

bedwardly-down commented 4 years ago

My previous issue could be related too https://github.com/laurent22/joplin/issues/2490

Soltares commented 4 years ago

I'm technically on the clock too, but use Joplin to work effectively so.. :) I took the plunge and switched kernels from 5.5.2-arch1-1 to 5.4.20-1-lts. I am immediately seeing an improvement. Where it previously took only 2-3 sync attempts to reproduce the issue, I'm now at a dozen or more syncs with no sign of hanging! Looks like you're definitely on to something with the kernel versions.
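
When comparing reports, it may help to classify kernel release strings the same way each time. This is a minimal sketch with hypothetical helper names; the 5.5 threshold is only the working hypothesis from this thread (hangs reported on 5.5.x, none so far on the 5.4 LTS series), not a confirmed boundary.

```python
import re

# Assumption from this thread: hangs were reported on 5.5.x kernels and not on
# the 5.4 LTS series. The threshold is a hypothesis, not an established fact.
FIRST_SUSPECT = (5, 5)


def kernel_version(release):
    """Parse (major, minor) from a release string such as '5.5.2-arch1-1'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_suspect_kernel(release):
    """True if the release is at or above the first series reported as hanging."""
    return kernel_version(release) >= FIRST_SUSPECT
```

Passing platform.release() from the standard library (the same string that uname -r prints) checks the running system.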

bedwardly-down commented 4 years ago

@Soltares I use Joplin mobile during my work since my work is constantly on the move, so I can't test in the field. Thanks for trying it out too and glad it's working for you too. ☺️

bedwardly-down commented 4 years ago

During my test, I synced around 20 times with several test files I created, synced and deleted, and saw no signs of it either.

bedwardly-down commented 4 years ago

@Soltares @laurent22 would either of you know how to test this better so that a bug report can be sent upstream to node-fetch? That's looking like where the issue may lie, since this is looking less and less like a Joplin issue.

bedwardly-down commented 4 years ago

I also found a post on Reddit where Firefox is exhibiting similar freezing issues on Arch with 5.5.4: https://reddit.com/r/archlinux/comments/f5smcx/intermittent_hanging_with_554_kernel_x1c6_intel/

bedwardly-down commented 4 years ago

@dimyself, since you're the one that opened this bug report, I just wanted to inform you that Joplin is still syncing perfectly with no issues whatsoever on the LTS kernel after me leaving my laptop on all day long letting it auto sync while I was out and about. It looks like this is most likely the current solution until the devs upstream get the issue resolved.

yewwayne commented 4 years ago

Having the same issue but I doubt it's network related as I'm only syncing to a local directory and letting Syncthing handle cross-device sync. The issue persists after disabling syncthing.

I'm also thinking it's not specific to the sync function of Joplin. Here's what just happened to me:

  1. Create a new note, then modify it and sync in a cycle with 5-10 second delay between each round.
  2. After 20 rounds or so with some occasional app switching, Joplin UI became unresponsive as in the youtube video, except no sync process was ongoing from what I could tell.
  3. After 2 more minutes the UI suddenly "played back" my mouse clicks in quick succession and became responsive again.
  4. Clicked sync, sync process seemed stuck. After a minute the sync finished. For some reason it reported "updated 4 remote items" even though I've only been modifying one note.
  5. Joplin seems to work fine again.

log.txt of the above: https://pastebin.com/ma7wK4Sr. I noticed 2 instances where SearchEngine: Updated FTS table took over a minute:

2020-02-19 16:28:13: "SearchEngine: Updated FTS table in 83816ms. Inserted: 9. Deleted: 0"
2020-02-19 16:30:22: "SearchEngine: Updated FTS table in 63586ms. Inserted: 1. Deleted: 0"
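
To look for more of these in a longer log.txt, a small filter can help. This is a minimal sketch that assumes every relevant line has exactly the format of the two lines quoted above; the function name and the one-minute threshold are illustrative choices.

```python
import re

# Matches lines like:
# 2020-02-19 16:28:13: "SearchEngine: Updated FTS table in 83816ms. Inserted: 9. Deleted: 0"
FTS_LINE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): '
    r'"SearchEngine: Updated FTS table in (?P<ms>\d+)ms\.'
)


def slow_fts_updates(lines, threshold_ms=60_000):
    """Yield (timestamp, duration_ms) for FTS updates at or above the threshold."""
    for line in lines:
        match = FTS_LINE.match(line)
        if match and int(match.group("ms")) >= threshold_ms:
            yield match.group("ts"), int(match.group("ms"))


if __name__ == "__main__":
    sample = [
        '2020-02-19 16:28:13: "SearchEngine: Updated FTS table in 83816ms. Inserted: 9. Deleted: 0"',
        '2020-02-19 16:30:22: "SearchEngine: Updated FTS table in 63586ms. Inserted: 1. Deleted: 0"',
    ]
    for ts, ms in slow_fts_updates(sample):
        print(f"{ts}: {ms / 1000:.1f}s")
```

Since the function accepts any iterable of lines, an open file handle on log.txt can be passed in directly.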

Kernel: 5.5.4-arch1-1 Joplin version: 1.0.179 (prod, linux); Sync Version: 1; Revision: 66356d8

bedwardly-down commented 4 years ago

@yewwayne thanks for bringing that up. You're not the only one who said the issue showed up with Filesystem sync; I think someone else mentioned it above. Joplin doesn't technically save locally, so it treats Filesystem sync like syncing to any cloud platform. I suspect the URL is just replaced by a local directory string that only gets checked for whether it exists.

If that's the case, the bug is still a network related bug since the module involved with syncing (or in this case a hacky way of saving but not saving locally) would still be acting up. Does that make sense?

hpfmn commented 4 years ago

I managed to catch it today in the development tools: in the debugger I hit the pause button, and it seems to be hanging at this code: https://github.com/laurent22/joplin/blob/e7a56bb2b1df3f6d90e196c406ed20620684db86/ReactNativeClient/lib/models/ItemChange.js#L41

Alas, I couldn't get the backtrace, because the whole Electron thing froze...

bedwardly-down commented 4 years ago

Hmmm. Because that's in ReactNativeClient/lib, if that was the source of the bug and not just a symptom, changing the kernel version wouldn't solve the problem. Everything in that folder is used by all platforms which would mean the bug would occur everywhere as frequently.

Still a nice find, though. :smiley_cat:

hpfmn commented 4 years ago

I don't know if it might be related in any way to this? https://github.com/electron/electron/issues/21415

bedwardly-down commented 4 years ago

I think the bug you linked could be related to some other issues here but probably not this specific one. The person that opened that bug report up was on 5.4 and Slackware, so not the same as the rest of us, but still could lead to more info.

I'm still thinking this is an upstream bug related to node-fetch and not a Joplin one, so if more people could test out the 5.4 kernel suggestion I made, that would definitely help make sure that's a legitimate fix. Thanks

bedwardly-down commented 4 years ago

@hpfmn, I do think you're on to something though. I got a hit here from the dev of a Protonmail frontend that is experiencing similar issues to this one: https://github.com/electron/electron/issues/21415#issuecomment-589242637

@laurent22, if I'm reading the linked issues correctly, it looks like this is an Electron 7 bug that was fixed in 8.

BlissfulTarpon commented 4 years ago

I can't get LTS on this system because I need patches from 5.5, but I just wanted to let you know the situation has become worse today. Before I had a small window where I could enter some data and then save. Now not only do I not have the time to cut and paste, sometimes I can't browse my notes. Totally unusable. I understand from the thread that this isn't Joplin's fault. Just wanted to chime in.

EDIT: @bedwardly-down just upgraded to electron 8, will see if it makes a difference on 5.5

EDIT2: Same thing happens on Electron 8.0.1, unfortunately. [Screenshot: DeepinScreenshot_select-area_20200220185945]

laurent22 commented 4 years ago

If Electron 8 fixes it we can upgrade, but it doesn't look like it does, then? How about the disable-gpu flag they mention in the other thread?

bedwardly-down commented 4 years ago

Thanks for checking on that, @matcharles . I was afraid some people wouldn't be able to use LTS for that reason but at least we're getting somewhere with this, right? Also, could you provide a step by step on how you upgraded to Electron 8 so I could test it out too to see if the issue shows up on the LTS kernel for me? That way, if the project does move to it, we can make sure that kernel version differences won't affect things too much. Thanks. :D

BlissfulTarpon commented 4 years ago

@bedwardly-down I just ran sudo pacman -S electron. I presume it just got added to the repos tonight!

bedwardly-down commented 4 years ago

Of course, @laurent22, if all else fails, are there any alternative modules you would be interested in testing out for the sake of seeing if maybe we can future proof Joplin against further problems caused by how fragmented the node repo is?

bedwardly-down commented 4 years ago

@matcharles, the problem I see with that is it wouldn't affect this project since electron is pulled in as a node module for this project only. I would say that your Electron 8 test shouldn't be accepted for that reason. Electron has been in the Arch repos for a good while now so Electron 8 still may be a possible fix if it doesn't break modules or any of the code here. That's my biggest concern with that, since with other projects, major version releases have a tendency to offer API breaking changes.

BlissfulTarpon commented 4 years ago

No worries! You guys are all way more knowledgeable with this than me, but I did update from 7.1.11-1 -> 8.0.1-1 tonight so I thought it might have something to do with this. Sorry I couldn't be of much help haha!

bedwardly-down commented 4 years ago

No, you were definitely helpful by just attempting it and being forward with what steps you took. By acknowledging that you updated Electron globally but not locally to the project, we can still look at upgrading to 8 as a possible fix for this (and possibly other issues that may arise and cause Joplin resources to be wasted).

hpfmn commented 4 years ago

I built a package with Electron 8.0.1; you can download it here: https://johanneswegener.de/joplin-1.0.179-1-x86_64.pkg.tar.xz

But I needed to update some other libs as well so be cautious if something doesn't work

EDIT: After you downloaded it, you can install it with pacman -U joplin-1.0.179-1-x86_64.pkg.tar.xz while being in the same directory as where you downloaded it.

bedwardly-down commented 4 years ago

I'm not sure if builds like that should be used for testing purposes here. I know that @laurent22 already said, when I built my Debian release for Joplin, that the project can't officially support anything but AppImage for Linux builds. I think that, in that context, making a fork that is specific to upgrades and letting testers build it from source with instructions and specific changes documented would probably be the better option, @hpfmn . Thanks for creating a pacman package, though.

Also, another concern with doing builds like that: depending on how you built it and what libraries you added, it could possibly break testers' systems if it ends up pulling in extra libraries or forces an upgrade on ones already available that aren't supported by other packages. That's a huge part of why I have stopped using AUR except when absolutely necessary.

hpfmn commented 4 years ago

@bedwardly-down yes, I know it is not optimal. I just think that if @matcharles can easily reproduce the behavior, he is the best person to test it. And this is just a quick and dirty solution. My npm/node/electron knowledge is also quite limited ;)

bedwardly-down commented 4 years ago

@matcharles , so your test can be clean, I'm currently getting a stable branch ready with Electron version 8.0.1 for you to test and I will be testing on 5.4 . I did have to add the node-abi-2.15.0 module to get Electron 8 to work, so that could break some things since I'm not sure what the minimum version needs to be to run it yet. This branch will also be testable for others too.

BlissfulTarpon commented 4 years ago

@bedwardly-down Looking forward to testing.

bedwardly-down commented 4 years ago

For anyone that wants to test Electron 8, here you go: https://github.com/bedwardly-down/joplin/tree/bug-tracker-2518

I would highly recommend backing up the $XDG_CONFIG_HOME/joplin-desktop folder and exporting a JEX file of your current notebooks because this could possibly break some things.
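
The folder backup can be scripted if that is easier. This is a minimal sketch with hypothetical helper names; it assumes the profile lives in $XDG_CONFIG_HOME/joplin-desktop, falling back to ~/.config when the variable is unset, and it does not replace the JEX export.

```python
import os
import shutil
import time
from pathlib import Path


def profile_dir():
    """Resolve $XDG_CONFIG_HOME/joplin-desktop, falling back to ~/.config."""
    base = os.environ.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return Path(base) / "joplin-desktop"


def backup_profile(src, dest_root):
    """Copy the profile folder to a new timestamped folder and return its path."""
    src = Path(src)
    if not src.is_dir():
        raise FileNotFoundError(f"profile folder not found: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_root) / f"{src.name}-backup-{stamp}"
    # copytree raises if dest already exists, so an older backup is never overwritten.
    shutil.copytree(src, dest)
    return dest
```

With Joplin closed, so the database is not copied mid-write, backup_profile(profile_dir(), Path.home()) would leave a timestamped copy in the home directory.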

For the test steps:

  1. Create a new Notebook called Test
  2. Create a new note
  3. Synchronize
  4. Delete note
  5. Synchronize
  6. Create new Todo
  7. Synchronize

Repeat these steps multiple times increasing the number of notes/todos by 1 each time until the bug shows up. If after four times of going through these steps the bug still hasn't shown, I say that's a good step in the right direction.
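
To keep testers on the same procedure, the rounds above can be written out as a checklist. This is only a sketch of one reading of the steps: the function name is hypothetical, and it assumes the Test notebook is created once in the first round and that round n uses n notes and n to-dos.

```python
def plan_rounds(rounds=4):
    """Yield (round_number, steps); each round uses one more note and to-do."""
    for n in range(1, rounds + 1):
        steps = []
        if n == 1:
            # Assumption: the notebook only needs creating in the first round.
            steps.append("Create a new notebook called Test")
        steps += [
            f"Create {n} new note(s)",
            "Synchronize",
            f"Delete the {n} note(s)",
            "Synchronize",
            f"Create {n} new to-do(s)",
            "Synchronize",
        ]
        yield n, steps


if __name__ == "__main__":
    for number, steps in plan_rounds():
        print(f"Round {number}")
        for step in steps:
            print(f"  [ ] {step}")
```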

I've got OBS installed and am using screen capture, so I can easily record my tests but any screen recording software should be fine. Having live test results would make these tests more valid.

Here's a screenshot showing Joplin running using that branch

[Screenshot: 2020-02-20-200041_1920x1080_scrot]

bedwardly-down commented 4 years ago

Here's my Electron 8 test. Creating new to-dos seems to break the UI and cause the sidebar to shift to the left. @matcharles and anyone else testing, tell me if you run into the same issue along with any other ones you find. Also, at the beginning of the recording, I show my kernel version and go through the entire build process.

https://youtu.be/arafnpZETHo

[Attachment: log.txt]

BlissfulTarpon commented 4 years ago

Thanks @bedwardly-down . Now for a totally noob question, do I just git clone and then run the install script from your fork?

m-angelov commented 4 years ago

I have the same issue with Joplin (as @Soltares I was a bit hesitant to file the bug, but I'm a bit relieved that it's not just my system).

@bedwardly-down I'm going to test your version and give feedback. @matcharles I've skimmed the install script (Joplin_install_and_update.sh) and it seems that it just sets up the environment and gets the latest AppImage. In the comment you've replied to, @bedwardly-down posted a video where you can see the build process.

bedwardly-down commented 4 years ago

Just clone the repo in a directory where it won't affect anything else and follow the instructions in BUILD.md in the root directory. If that's what you mean by the build script, definitely.

bedwardly-down commented 4 years ago

@m-angelov thanks for being more detailed than me on that. The install script is not how anyone should build and install this. It's used for upgrading to the latest Joplin version and overriding the installed one, which is NOT what you want to do with a test build like this. You want this build to be run separately from the main one you use as your daily driver so you can revert back to that version without harming anything.

Please do give feedback, though, and if you can provide either some screenshots or a brief video showing the whole testing process (like my not so brief one), that would definitely be useful for others.

m-angelov commented 4 years ago

@bedwardly-down at the moment I can't do extensive testing, but here are some initial observations:

bedwardly-down commented 4 years ago

@m-angelov that's good enough for me as long as at least one other person here can validate it. It's possible that my issue could have just been a fluke that requires a full clean and rebuild, but when you are able to (I'm getting ready to go to work myself), if you can screenshot the Note Title bar and the Joplin version at least, that would definitely help.

If you could also enable debug information, adding the log.txt (like I did) would allow anyone to see what's happening there. Also, what kernel are you running? I'm on the LTS one but am going to switch back to 5.5 since that's the one that mat and others are running.

https://joplinapp.org/debugging/

EDIT: The shifting seems to only happen when debugging is enabled, which is a really strange bug. When disabled, everything works like it should (during my quick test, that is)

Debugging on: [Screenshot: 2020-02-21-075656_1920x1080_scrot]

Debugging off: [Screenshot: 2020-02-21-075817_1920x1080_scrot]