Roam-Research / issues

Roam Research - A note-taking tool for networked thought.
https://roamresearch.com/

Bullet deleted while I was entering it, and other intermittent data loss #49

Open tclaiborne opened 4 years ago

tclaiborne commented 4 years ago

I was entering a bullet in my daily notes and while I was typing it, I watched as the entire line was deleted.

Text was:

Need an [[mvp]] for [[Device Foo]]

Both mvp and Device Foo were new pages. Foo is not the actual word used.

As I finished typing Foo, the word disappeared and the text looked like

Need an [[mvp]]

And then a second later, the entire bullet was gone and I was left on an empty bullet.

It looks like the cloud overwrote my local content. The deletions were entire pieces of text not character by character.

I tried to reproduce, but have not seen it happen again, yet.

System info Google Chrome | 83.0.4103.21 (Official Build) beta (64-bit) OS: Chrome OS

mmower commented 4 years ago

I have seen a similar bug several times. For me, the content of bullets I was working on disappears.

Today I had written one bullet with one child. A few minutes later all I had were two empty bullets.

MacBook Pro 15,3 macOS 10.14.6 Chrome: 81.0

ghabs commented 4 years ago

I have also encountered this bug. Either I am typing a new bullet point and the previous one disappears, or I write a bullet point, switch to a non-Roam tab, and then return to Roam to find that the content of the bullet point has already disappeared or to watch it disappear. Here's the error log from the console.

sentry.js:2 ERROR [relemma.fire.link.core:252] - [DEFAULT] - error while applying confirmed tx #error {:message "Nothing found for entity id [:block/uid \"NouH9gUQi\"]", :data {:error :entity-id/missing, :entity-id [:block/uid "NouH9gUQi"]}}

Chrome: 81.0

ghabs commented 4 years ago

Happened again, same error message (definitely related as this time two bullets disappeared, and two errors)

sentry.js:2 ERROR [relemma.fire.link.core:252] - [DEFAULT] - error while applying confirmed tx #error {:message "Nothing found for entity id [:block/uid \"NouH9gUQi\"]", :data {:error :entity-id/missing, :entity-id [:block/uid "NouH9gUQi"]}}

(anonymous) @ sentry.js:2

2sentry.js:2 ERROR [relemma.fire.link.core:252] - [DEFAULT] - error while applying confirmed tx #error {:message "Nothing found for entity id [:block/uid \"Rk8LeqeN4\"]", :data {:error :entity-id/missing, :entity-id [:block/uid "Rk8LeqeN4"]}}

filipesilva commented 4 years ago

Heya all, do you remember if the bullets that disappeared did so right after you alt-tabbed or switched tabs to Roam? For instance, you were doing something in another window, then alt-tabbed to the Roam page, input something, and the bullet disappeared?

mmower commented 4 years ago

Still seeing the "disappearing bullet contents" problem again today. Wrote a bullet, added a child and wrote the child. A few moments later the contents of the parent had vanished leaving an empty bullet. I can't remember whether I tabbed away to another app but it is likely.

ghabs commented 4 years ago

Heya all, do you remember if the bullets that disappeared did so right after you alt-tabbed or switched tabs to Roam? For instance, you were doing something in another window, then alt-tabbed to the Roam page, input something, and the bullet disappeared?

This was the case the first time: I alt-tabbed back to my Roam tab and the previous bullet disappeared. The second time, however, it was not.

ghabs commented 4 years ago

Multiple bullet points disappeared at some point today from my daily notes page.

This significantly impacts the usability of Roam and my confidence in the tool: the notes were old enough, and I didn't notice the loss at the time, so I no longer know what those bullets said.

filipesilva commented 4 years ago

@ghabs can you shoot me a message on Slack with your db name, the time interval for the disappearance, and any details you can tell me about the things that disappeared?

filipesilva commented 4 years ago

Hey all,

We've identified a particular case where transactions (the way we save what you do on Roam) could get out of order if done around the time when the server connection switched between online and offline.

This online/offline switch most commonly happened when alt-tabbing, or when the load on the server was high enough that it disconnected you and then connected back again. It could also happen if your internet connection was intermittent.

We've now pushed a fix for this case. You can get the new version by force refreshing the webpage. In Chrome on Windows, press ctrl+f5; on Mac, press cmd+shift+r. Please let me know if you still see this problem after force refreshing.
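To illustrate the kind of ordering problem described above, here is a minimal Python sketch. It is not Roam's actual code: the `Tx` shape, the per-client `seq` number, and the `OrderedApplier` class are all assumptions made for illustration. It shows how applying an update before the create it depends on produces a "Nothing found for entity id" style failure, and how buffering by sequence number avoids it.

```python
from dataclasses import dataclass, field


class MissingEntity(Exception):
    """Raised when a transaction refers to a block that does not exist."""


@dataclass
class Tx:
    seq: int    # hypothetical per-client sequence number
    op: str     # "create" or "update"
    uid: str    # block uid, e.g. "NouH9gUQi"
    text: str = ""


def apply_tx(db: dict, tx: Tx) -> None:
    """Apply one transaction to a dict that maps block uid to text."""
    if tx.op == "create":
        db[tx.uid] = tx.text
    elif tx.op == "update":
        if tx.uid not in db:
            raise MissingEntity(f"Nothing found for entity id {tx.uid}")
        db[tx.uid] = tx.text
    else:
        raise ValueError(f"unknown op {tx.op!r}")


@dataclass
class OrderedApplier:
    """Buffers transactions and applies them strictly in sequence order."""
    db: dict = field(default_factory=dict)
    next_seq: int = 1
    pending: dict = field(default_factory=dict)

    def receive(self, tx: Tx) -> None:
        self.pending[tx.seq] = tx
        # Drain every transaction that is now contiguous with what was applied.
        while self.next_seq in self.pending:
            apply_tx(self.db, self.pending.pop(self.next_seq))
            self.next_seq += 1


create = Tx(seq=1, op="create", uid="NouH9gUQi", text="Need an [[mvp]]")
update = Tx(seq=2, op="update", uid="NouH9gUQi",
            text="Need an [[mvp]] for [[Device Foo]]")

# Naive application in arrival order fails when the update arrives first.
naive_db = {}
try:
    apply_tx(naive_db, update)
    naive_failed = False
except MissingEntity:
    naive_failed = True

# The ordered applier holds the update until the create has been applied.
applier = OrderedApplier()
applier.receive(update)
applier.receive(create)
```

In this sketch the out-of-order update is simply held back until the missing create arrives. A real client would also need a policy for gaps that never fill, which is not covered here.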

JMHendon commented 4 years ago

Just happened again to me (after force refreshing). Been happening for about 2 days now. Generally, I'll type 3-4 bullets, and then the first 1-2 bullets will disappear. Happens on both my Windows PC and my Mac laptop.

EDIT: I've refreshed many times on both devices, and it's still happening continuously.

filipesilva commented 4 years ago

@JMHendon can you ping me on the slack with your database name, the rough time frame when it happened today, and some of the content of the stuff that disappeared please? I'd like to look into it further.

JMHendon commented 4 years ago

Definitely, although I can't seem to find the slack invite. Can you point me in the right direction?

filipesilva commented 4 years ago

@JMHendon you can find a link to it in here: https://github.com/Roam-Research/issues#other-resources, the "Other Resources" section.

JMHendon commented 4 years ago

Yeah - that one seems to be expired.

filipesilva commented 4 years ago

@JMHendon try this one: https://join.slack.com/t/roamresearch/shared_invite/zt-e2wfa25e-MNVKIcKm1ng63VrrwQ14Dg

I'll update the readme as well, thanks for the heads up!

filipesilva commented 4 years ago

Heya all,

Still investigating overall data loss issues. We haven't been able to reproduce them in isolation at all, which makes it very hard to dissect and fix the underlying issue. We have tests that stress the system by throwing a lot of changes at it and checking that the databases are still consistent and have lost no data, but those never ran into the data loss.

While looking for a consistent reproduction we decided to leverage the existing reports on live. I asked a lot of you for details on data loss incidents and from that extrapolated a way to detect them.

Early this week we pushed code live that used the detection logic to report them automatically to our analytics platform. From those automatic reports we were able to select good samples of odd behaviour in our synchronizing logic that led to changes being synchronized out of order. This class of bug is known as a "race condition".

Even with those samples we weren't able to create an isolated reproduction. But they did lead to a hypothesis where a bit of code could run in different orders depending on the processor availability of a client. This hypothesis showed promise because it'd explain why it might not happen in isolation, but would be more likely to happen in a live environment that was dependent on the processing characteristics of the user's device.

We pushed a change that should eliminate this bug vector in the 0.6.2 update as well, and are monitoring the live data for changes. If this was the cause, we should see a reduction in the analytics events.

TLDR: I think/hope this might be addressed. Force refresh to get the newest version, and keep me posted please.
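The comment above does not say how the detection logic works. As a rough illustration only, one way to flag changes that were synchronized out of order is to track the highest sequence number confirmed so far for each client and report anything that arrives below it. The `(client_id, client_seq)` log format and the function name are hypothetical.

```python
from typing import Iterable, List, Tuple


def find_out_of_order(
    confirmed: Iterable[Tuple[str, int]]
) -> List[Tuple[str, int, int]]:
    """Return (client_id, seq, highest_seq_seen_before) for every late arrival.

    `confirmed` is the order in which transactions were confirmed, as
    (client_id, client_seq) pairs. Both fields are hypothetical.
    """
    highest = {}
    late = []
    for client_id, seq in confirmed:
        seen = highest.get(client_id, 0)
        if seq < seen:
            # Confirmed after a later transaction from the same client.
            late.append((client_id, seq, seen))
        else:
            highest[client_id] = seq
    return late


log = [("a", 1), ("b", 1), ("a", 3), ("a", 2), ("b", 2)]
reports = find_out_of_order(log)
```

Here `reports` contains one entry for client "a", whose transaction 2 was confirmed after transaction 3. A check like this only reports the symptom; it says nothing about which scheduling path caused it.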

filipesilva commented 4 years ago

Hey all, have you seen this sort of stuff happening in the last few days since my last comment?

JMHendon commented 4 years ago

I have not.

ghabs commented 4 years ago

Nope, haven't seen it since

tclaiborne commented 4 years ago

Also have not seen.

arbois commented 4 years ago

this may not be another instance of the same problem but i was working on a page today for several hours, generating probably 30-40 new bullets. when i reloaded the page after closing the browser, all the new bullets—as in all 30-40 of them—had been deleted (the original content from a few days ago is still present). last edit where the new content was still visible was probably around 1900h (GMT+1) and i reloaded the page to find the content deleted probably around 2345h (GMT+1).

system info: Chrome Version 83.0.4103.61 (Official Build) (64-bit) OSX 10.14.1

arbois commented 4 years ago

update: this same roam page now

1) shows post-deletion state but synced status on my machine
2) shows pre-deletion state and synced status on another user's machine

[Screenshots: "Screenshot 2020-05-28 at 01 30 50" and two further images]

filipesilva commented 4 years ago

@arbois that is odd... one thing to keep in mind is that the Last change in server: section is different between the two syncs, and that may hold a clue to what happened.

The first screenshot says the last change was at 28/05/2020, 1:33:44 BST; the second says 27/05/2020, 20:36:19 GMT-4, which is the same as 28/05/2020, 1:36:19 BST, since BST (GMT+1) is 5 hours ahead of GMT-4 at the moment. So the second screenshot saw changes happening some two and a half minutes after the first.
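The timestamp comparison can be double-checked with a few lines of Python. This sketch assumes fixed offsets of UTC+1 for BST and UTC-4 for the other machine; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

BST = timezone(timedelta(hours=1), "BST")        # British Summer Time, UTC+1
GMT_MINUS_4 = timezone(timedelta(hours=-4), "GMT-4")

# "Last change in server" values read from the two screenshots.
first = datetime(2020, 5, 28, 1, 33, 44, tzinfo=BST)
second = datetime(2020, 5, 27, 20, 36, 19, tzinfo=GMT_MINUS_4)

# Express the second timestamp in BST and measure the gap between the two.
second_in_bst = second.astimezone(BST)
gap = second - first

print(second_in_bst.isoformat())
print(gap)
```

Subtracting two timezone-aware datetimes compares the underlying instants, so the gap does not depend on which zone each value was written in.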

Now I don't know if that's an artifact of the way the screenshots were obtained. Maybe when the page was opened on the other user's machine some change was made, intentionally or inadvertently, but it was something interesting that I noticed.

We store some cache locally, and I wonder whether it could be incorrect. Can you try opening the page in a Chrome incognito tab? That should show you the data without any cache.

arbois commented 4 years ago

we captured the screenshots a few minutes apart bc it only occurred to me to ask the other user to screenshot the timestamp on syncstate as i was writing the bug update.

arbois commented 4 years ago

so just to be clear: i had been freaking about the data loss for over an hour and had posted a few requests for info about whether anyone else had managed to roll back successfully in other similar situations. this user then pinged me to say that he was looking at the page in question and it appeared to be correctly populated—at that point, the same page was displaying in different states on our two machines. this state misalignment persisted for at least 20 minutes while he and i jointly freaked out about the possibility of such extended persistence of a state misalignment on roam. i then thought to update the bug report.

filipesilva commented 4 years ago

@arbois can you try the chrome incognito tab please? That should show you what the server believes to be the state without any cache. Then if that's the "right" state then you can clear your chrome browser data to get rid of the cache - but only do this when your sync dot is yellow, otherwise you can lose pending changes.

I think what you're seeing is a case of https://github.com/Roam-Research/issues/issues/265. Still not sure what actually causes it but it seems that the cached state somehow misses things.

A few questions on the topic please:

  • were you working on multiple tabs simultaneously at the time?
  • were you low on disk space?

filipesilva commented 4 years ago

Another question: did you change your computer clock during the period where you lost things?

arbois commented 4 years ago

seems to be working correctly now (tested in chrome incog and opera private), but unclear whether this is because i re-pasted the content in last night and added more content up top (i forget exactly when) after receiving it as an export RTF from the other user. i guess it would be useful to see if there was a collision server side since there would be two different states for the same page after i pasted the old content in.

arbois commented 4 years ago

other data:

filipesilva commented 4 years ago

Ok, thanks for the details. If you pasted over then you would have completely replaced everything there, whatever the state, so that would be a "new" state regardless.

I am not sure how this situation came to be, but will keep an eye out for possible causes.

arbois commented 4 years ago

i pasted it in (and added some content at the top) but the other user saw the previous state (old content without new content at top) for a while even after i made the changes

filipesilva commented 4 years ago

@arbois did this happen in your daily logs or a completely separate page?

arbois commented 4 years ago

this is a page in someone else's db (it's venkatesh rao's art of gig db)
