Roam-Research / issues

Roam Research - A note-taking tool for networked thought.
https://roamresearch.com/

Borked Daily Note locking browser UI and database loading #47

Closed cori closed 4 years ago

cori commented 4 years ago

I'm not sure how to title this, but this is a bug report to capture the details I sent to the #bug-reports Slack channel.

Here's the report portion of the thread:

cori  23 hours ago
Like others did, I saw some really odd behavior yesterday. I never got to the bottom of it and wasn’t able to clarify “root” causes, but just to get a snapshot in case there are patterns that can be mined, here’s a thread :point_right::skin-tone-2:

cori  23 hours ago
i was working happily on one device (macOs using safari) getting my daily note started. things were behaving fine.

cori  23 hours ago
I moved to my other device (another macOs) and could not get my database to load. At all.

cori  23 hours ago
I tried numerous combinations of Safari, Chrome, Firefox at different times throughout the day, both in normal windows (i.e. polluted by other extensions) and incognito windows (i.e. no extensions). No luck.

cori  23 hours ago
all the while, my database continued to work fine on the original device, even after several reloads. no problems loading it or working in it.

cori  23 hours ago
many times when trying to load the db on the broken device it would freeze the entire browser ui - couldn’t open a dev tools window on that tab (although other tabs worked fine)

cori  23 hours ago
this persisted across browser and os-level restarts

cori  23 hours ago
another database on that account loaded fine

cori  23 hours ago
i eventually went to the working device, exported my database, created a new account and database on the “broken” machine and imported the database there. all worked fine.

cori  23 hours ago
this morning went back to my original database on both machines and it looked just fine - was able to work in my daily note and everything.

cori  23 hours ago
then on my “broken” device I scrolled down on the daily notes page and when Roam tried to load yesterday’s daily note the browser UI (chrome this time, with no other windows open to the db anywhere) locked up again - similarly - not able to scroll or interact with Roam or that tab at all, although other unrelated tabs worked fine

cori  23 hours ago
finally reloaded roam and went to All Pages and deleted yesterday’s daily note and everything seems fine again

cori  23 hours ago
sorry for the long missive, I hope there might be some clue buried in there somewhere

TL;DR (with some inferences)

My April 27th, 2020 Daily Note became corrupted and would not load in any browser (tried Safari, Chrome, Firefox, in both regular and incognito windows), but only on one device. On my working device it continued to work as expected in Safari. I was able to export my whole db from the "working" device and import it into a new account / database on the "broken" device and everything was fine, but in my standard account / database it continued to be broken all that day. By broken I mean that the database wouldn't even load - it just kept spinning on the astrolabe. On the following day my default database loaded fine, but when I scrolled down to where Roam tried to load the previous day's note the browser locked up again. Deleting that note from the All Pages page and then recreating it from the other database allowed me to use the note and the default database.

To Reproduce

No repro that I can find

System Information (always working device):

System Information ("broken" device):

Additional context

I can send the export I used to recreate a working database from the working device, or the content of that day's note's json, and I have an export from the "broken" device from that day, but I'm unsure if it contains the borked node or not. I don't think there's anything super-private in there but I'd still rather not post it publicly.

cori commented 4 years ago

(actually I don't think I have an export from the broken device)

JasonBenn commented 4 years ago

I had the issue with a non-Daily Note page, fortunately. Copy-pasting from #bug-reports in the Slack.


I believe I have a single corrupted page. Navigating to it hangs Chrome - it just loads forever and Chrome becomes unresponsive until I quit the app. No errors in the console. My database is 1MB and this page is about 4000 words with no screenshots. I was editing it from 3 devices. Fortunately, I've been making regular backups: https://github.com/JasonBenn/roam-backup/actions. What's the best way to restore my DB from a .md dump?

jason 2 days ago Hm. The file doesn't exist in my latest backup. Roam, when exporting all files, skipped the corrupted one.

jason 2 days ago By binary searching through my backups, I can determine the file was corrupted between Saturday afternoon at 3:46pm and Sunday morning at 5am.

jason 2 days ago Tried importing the backup file into my current DB. Didn't fix the issue, as I predicted - navigating to it afterwards still hangs my browser.

jason 2 days ago Checked {{orphans}}: found a few blocks, but not from the corrupted page. Also sent to the channel

jason 2 days ago My workaround, for now: find the latest uncorrupted version of the file, import it with a new name, and carefully avoid the landmine that is the old file. I can still search/find/update all my old links and redirect them to the new file. Would love a better fix for this. Highly recommend forking signalnerve/roam-backup and setting up a backup workflow for yourself - though beware, the .json backup doesn't work reliably for even medium-sized databases and my .md backup files are full of errors after reimporting. ¯_(ツ)_/¯
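The binary search through backups mentioned above could look roughly like this in Python. This is only a sketch: `first_bad_backup`, the backup labels, and the `is_bad` check are hypothetical, and it assumes the backups are ordered oldest to newest and that once the page goes bad (corrupted or missing from the export) it stays bad in every later backup.

```python
def first_bad_backup(backups, is_bad):
    """Return the index of the first bad backup, or len(backups) if none is bad.

    Assumes every good backup comes before every bad one.
    """
    lo, hi = 0, len(backups)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(backups[mid]):
            hi = mid          # first bad backup is at mid or earlier
        else:
            lo = mid + 1      # first bad backup is after mid
    return lo


# Hypothetical backup labels, oldest first.
backups = ["fri", "sat-am", "sat-pm", "sun-am", "sun-pm"]
known_bad = {"sun-am", "sun-pm"}  # stand-in for actually inspecting each backup
index = first_bad_backup(backups, lambda name: name in known_bad)
```

The corruption window is then between `backups[index - 1]` and `backups[index]`, and only about log2(n) backups need to be inspected by hand.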

floriancargoet commented 4 years ago

I had this exact issue with today's note. I noticed an empty line that should not have been there, tried deleting it and noticed the app was frozen. I couldn't load my db on any browser since the corrupted note was the first thing Roam tried to display (today's note).

I was able to recover by going directly to the "all pages" URL in a new tab, bypassing today's note. From there, I could delete the corrupted note.

Before doing that, I did a JSON export of the whole DB and inspected it. There's an empty node right where the empty line I tried to delete was.

// here's the parent of the empty node
          {
            "string": "some text",
            "create-email": "florian.cargoet@gmail.com",
            "create-time": 1588246072700,
            "children": [
              {}, // <-- here's the culprit
              {
                "string": "{{[[TODO]]}} some todo",
                "create-email": "florian.cargoet@gmail.com",
                "create-time": 1588246072700,
                "uid": "ExFnSVbDd",
                "edit-time": 1588246072724,
                "edit-email": "florian.cargoet@gmail.com"
              }
            ],
            "uid": "2KlXY1ZvT",
            "edit-time": 1588246072724,
            "edit-email": "florian.cargoet@gmail.com"
          }

These nodes were the result of a copy-paste so I thought the empty node came from the pasted text but I can't find anything weird there (I don't have the exact text but it's text I generate myself from a script and it has always worked before).
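For anyone who wants to check their own export for the same thing, here is a minimal Python sketch that walks an export and reports empty nodes like the one above. It assumes the export is a list of pages whose blocks nest under "children", as in the snippet; the function name `find_empty_blocks`, the "title" key on pages, and the sample data are assumptions for illustration.

```python
def find_empty_blocks(blocks, trail=()):
    """Yield (trail, index) for every empty block ({}) in a nested block list.

    `trail` is a tuple of labels (page title, uid, or text) for the ancestors,
    and `index` is the position of the empty block among its siblings.
    """
    for index, block in enumerate(blocks):
        if not block:
            yield trail, index
            continue
        label = block.get("title") or block.get("uid") or block.get("string", "?")
        yield from find_empty_blocks(block.get("children", []), trail + (label,))


# Hypothetical export shaped like the snippet above.
export = [
    {
        "title": "April 30th, 2020",
        "children": [
            {
                "string": "some text",
                "uid": "2KlXY1ZvT",
                "children": [
                    {},
                    {"string": "{{[[TODO]]}} some todo", "uid": "ExFnSVbDd"},
                ],
            }
        ],
    }
]
found = list(find_empty_blocks(export))
```

With a real export, the data could be loaded with `json.load` and passed in the same way; each result names the parent to inspect.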

cori commented 4 years ago

I'm having another issue with disparate behavior between my two MacBooks, although I'm not certain it's related to the same problem.

Started my day on my 13", with my 15" asleep upstairs. Did my normal Roam Morning Things on there.

Moved upstairs to my 15" after putting my 13" to sleep. Opened Daily Notes and my notes appeared to be there. Ran into an issue with copying a block ref in Safari, so closed that tab and opened Daily Notes in Chrome to verify if the issue was Safari-only (I did this at around 1245UTC just before I posted https://roamresearch.slack.com/archives/CN2L1UUHY/p1588337151430400). Verified the issue, took care of the task I wanted to do that was easier in Chrome, closed my Chrome Roam tab and went back to Safari.

Loaded up my Daily Notes and today's note was empty, and also showed a "future task" for today from a different page as not having been moved to today although I'd moved it to today on my other device (this is that block: https://roamresearch.com/#/app/cori/page/_zNlQQVBZ). Hard-refreshed and the content came back, but on a subsequent load it was empty again.

Closed my tab, went back to my 13" and the content was there still. Put it back to sleep and came back to my 15" which was still blank, and remains so after several refreshes.

I have exports from both machines of today's note - they're not overly private but I'd still rather not upload them in a public issue; let me know if you'd like to see them, but on a cursory glance they don't seem useful. Also, aside from the brief foray into Chrome to debug a copy block ref issue this was all in Safari.

When attempting to import the file from the 13" into the 15" Roam tab I see the following: [image]

I see the same thing both before and after deleting the May 1st, 2020 page on the 15" and with both json and md files. If you'd prefer me to open a new issue for that I can do so.

cori commented 4 years ago

Ok that import error did not occur in Chrome - I'll test some more before creating an issue for that though. Importing json didn't work - "blocks already exist".

filipesilva commented 4 years ago

I don't have a resolution yet, but want to mention the workaround in https://github.com/Roam-Research/issues/issues/19#issuecomment-629191389

When this happens on the daily log, the user won't be able to load their roam at all. I think the same also happens with queries.

They'll have to use https://roamresearch.com/#/app/DATABASE_NAME_HERE/search to delete the problem page to load again.

floriancargoet commented 4 years ago

@filipesilva This is what I did in https://github.com/Roam-Research/issues/issues/47#issuecomment-621914794 but before deleting the problematic page, I did a JSON export so that I could fix the corrupted page manually. I think it's worth mentioning that one can restore some data from the corrupted page.

wismie commented 4 years ago

So ok, joining the crowd here (I am Catherine on Slack), I think my issue is very similar. I got two daily pages "corrupted", basically last Tuesday and today. I noticed that I had a block on last week's page that basically made it stop working entirely. I could not delete (or open) the faulty block and could not load the page. The same happened today on my daily page, but I could not identify a faulty block there (except a reference to last week's page). It is not impossible that the initial corrupted block was self-referencing, but I cannot vouch for that because I cannot access it. On the page there were embeds, blockrefs and queries.

Of course I didn't have a backup (lesson learned) and couldn't find a way to delete the faulty block. The only way was to delete the two pages... and lose the info I had there.

I can also say that between two machines, the green light for sync was OK, but the content was not the same. I was not editing on both at the same time, only had the app open on both.

Edit: I am on Windows 10 and using Chrome exclusively.

filipesilva commented 4 years ago

Heya all,

I've been digging into this problem the last few days and so far I've only found a problem with pasting that sometimes left a bullet that couldn't be interacted with, and a couple of other operations that could cause invalid transactions. But neither caused Roam to freeze. Meanwhile https://github.com/Roam-Research/issues/issues/206 was also fixed recently (just mentioning because @cori also saw this problem).

If someone runs into this again, can you let me know before deleting the problematic page? In the reports I had, the page had already been deleted so I could not see Roam freezing live, and could not reproduce with the partial data I had.

filipesilva commented 4 years ago

Heya all,

Was able to find a live case from the support tickets and see exactly what went wrong. Through a bug in our block operations, a block ended up a child of itself and would be stuck in an infinite render loop. When that page was exported to JSON, the block appeared as {} instead.

I added a fix where rendering stops when it detects recursion and shows the bad block as bright red instead. It should arrive on clients the next time they update. This isn't a full fix because it doesn't address the cause of the infinite render, it just stops it. Page deletion is still needed to get rid of that block, but at least the graph won't be inaccessible.
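To illustrate the general idea of stopping a render on recursion, here is a small Python sketch. It is not Roam's code (which isn't shown in this thread): the `render` function, the uid-to-block dictionary, and the text marker standing in for the bright red block are all assumptions made for the example.

```python
def render(blocks, uid, ancestors=frozenset(), depth=0):
    """Render a block tree as indented text lines, stopping at self-references.

    `blocks` maps a uid to {"string": ..., "children": [child uids]}.
    `ancestors` holds the uids on the path from the root to this block.
    """
    indent = "  " * depth
    if uid in ancestors:
        # The block is its own ancestor: show a marker instead of recursing forever.
        return [f"{indent}!! recursive block {uid} !!"]
    block = blocks[uid]
    lines = [f"{indent}- {block['string']}"]
    for child in block.get("children", []):
        lines.extend(render(blocks, child, ancestors | {uid}, depth + 1))
    return lines


# Hypothetical data: block "b" ended up listed as a child of itself.
blocks = {
    "a": {"string": "parent", "children": ["b"]},
    "b": {"string": "stuck", "children": ["b"]},
}
output = render(blocks, "a")
```

Tracking only the ancestors on the current path, rather than every block seen so far, means a block that legitimately appears in two different places still renders in both.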

Also added a fix for importing JSON exports with the empty blocks, where the import just ignores it instead of failing. I think we'll have to make some kind of health check in the future where we detect cases like this, fix their causes, and fix existing graphs that have those bad data structures.
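As a rough picture of what ignoring empty blocks on import can mean, here is a Python sketch that filters them out of an export-shaped structure before it is used. The function name `drop_empty_blocks` and the sample data are hypothetical, and the real importer may work quite differently.

```python
def drop_empty_blocks(blocks):
    """Return a copy of a nested block list with empty blocks ({}) removed."""
    cleaned = []
    for block in blocks:
        if not block:
            continue  # skip the empty placeholder instead of failing on it
        copy = dict(block)
        if "children" in copy:
            copy["children"] = drop_empty_blocks(copy["children"])
        cleaned.append(copy)
    return cleaned


# Hypothetical page shaped like the JSON export earlier in this thread.
page = {
    "string": "some text",
    "uid": "2KlXY1ZvT",
    "children": [{}, {"string": "{{[[TODO]]}} some todo", "uid": "ExFnSVbDd"}],
}
cleaned_page = drop_empty_blocks([page])[0]
```

The input is left untouched, so the original export can still be kept around for inspection.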

Thanks for the patience all!

pmbauer commented 4 years ago

thanks so much!