deepstreamIO / deepstream.io

deepstream.io server
https://deepstreamio.github.io
MIT License
7.14k stars 381 forks source link

Invalid state transition. Details: {"transition":"MERGED","state":"READY"} #1078

Closed savinskiy-konstantin closed 3 years ago

savinskiy-konstantin commented 4 years ago

I've faced with the following error for some records during trying to update record:

Invalid state transition.
Details: {"transition":"MERGED","state":"READY"} 
History: 
    From - to LOADING_OFFLINE via -
    From LOADING_OFFLINE to SUBSCRIBING via 23
    From SUBSCRIBING to READY via 4

StateMachine.transition             @ state-machine.ts:37
RecordCore.onRecordRecovered        @ record-core.ts:767
eval                                @ merge-strategy-service.ts:75
exports.REMOTE_WINS                 @ merge-strategy.ts:8
MergeStrategyService.merge          @ merge-strategy-service.ts:74
RecordCore.recoverRecordFromMessage @ record-core.ts:726
RecordCore.RecordHandler            @ record-core.ts:480
RecordHandler.handle                @ record-handler.ts:411
eval                                @ connection.ts:323
Connection.onMessages               @ connection.ts:297
socket.onmessage                    @ socket-factory.ts:42

Code is simple:

const { DeepstreamClient } = window.DeepstreamClient;
const client = new DeepstreamClient(url, { path: '' } );

client.login((success) => {
  if (success) {
    client.record.getRecord('user').whenReady((record) => {
      record.set('test', 1);
    });
  }
});

There are many following messages in pm2 error logs:

VERSION_EXISTS | open tried to update record user to version 19633 but it already was 19632

Server version: 5.1.1 Client version: 5.1.4

What is the reason for the error? How can the error be fixed?

Thank you

jaime-ez commented 4 years ago

Hi, I can not reproduce the issue based on the code you share. Seems to me you might have the offline functionality enabled and thus the record version from the client and server differ?

yasserf commented 4 years ago

The error itself is due to probably missing a state transition in the record handler, since it is finite and we need to output the states ourselves. That being said, going from ready to merged and skipping merging state seems odd.

My only guess is either you have other code running we are not aware about, since a single write can’t produce more than one conflict, or something managed to enter an infinite loop to produce many conflict logs. Given the simplicity of the code I feel like I would support the former.

Eitherways this issue can’t be resolved on our side without better reporting / logs or best of all, reproducible code.

jaime-ez commented 3 years ago

Closing this for now. Please re-open if issue persists.

savinskiy-konstantin commented 3 years ago

I got back to this problem.

Seems to me you might have the offline functionality enabled and thus the record version from the client and server differ?

offlineEnabled is disabled by default.

We have a simple architecture. The client creates/updates their own record and subscribed to others. But sometimes with some records happens described above issue. We can not detect steps to reproduce.

Can you help with troubleshooting? I see these "broken" records. What I can check to detect what is wrong with this record and why I cannot update the record?

Thanks

yasserf commented 3 years ago

So the error is being thrown due to a lack of a transition here:

https://github.com/deepstreamIO/deepstream.io-client-js/blob/master/src/record/record-core.ts#L933

So my guess is you might be updating the records while the client is offline or if another process/client updates it and there's weird state.

The actual issue seems to be here:

https://github.com/deepstreamIO/deepstream.io-client-js/blob/master/src/record/record-core.ts#L530

Where we are recovering a message but we haven't put the record into a recovering state. We need to transition using this:

https://github.com/deepstreamIO/deepstream.io-client-js/blob/master/src/record/record-core.ts#L956

So basically:

    if (message.action === RECORD_ACTION.VERSION_EXISTS) {
      this.stateMachine.transition(RECORD_ACTION.VERSION_EXISTS)
      this.recoverRecordFromMessage(message as RecordWriteMessage)
      return
    }

Not sure about the side effects, but that should fix this specific issue.

savinskiy-konstantin commented 3 years ago

I've added your code but it does not help. I've gets two errors instead of one.

yasserf commented 3 years ago

Could be because we are using the wrong actions, can you try this line?

  this.stateMachine.transition(RECORD_OFFLINE_ACTIONS.VERSION_EXISTS)
savinskiy-konstantin commented 3 years ago

The error in the console is gone with this line but the record doesn't update anyway.

yasserf commented 3 years ago

awesome. That issue is because your merge strategy is remote wins.

exports.REMOTE_WINS @ merge-strategy.ts:8

You need to set merge strategies to LOCAL_WINS if you expect them to work.

Generally worth looking into your code to see if your updating records when the connection is offline/not yet ready. But eitherways that should be a solution.

savinskiy-konstantin commented 3 years ago

Thanks.

I will review cases when the connection is offline or not yet ready.

What about LOCAL_WINS, it doesn't help.

I cannot set merge strategy for connection how described in docs https://deepstream.io/tutorials/core/datasync/handling-data-conflicts/. Please clarify this for the new client version.

I've tried the following code:

const { DeepstreamClient } = window.DeepstreamClient;
client = new DeepstreamClient(serverUrl, { path: '' } );

client.login({}, function () {
    var currentUserRecord = client.record.getRecord(recordName);
    currentUserRecord.whenReady(function (record) {
        record.setMergeStrategy(function(localValue, localVersion, remoteValue, remoteVersion, callback) {
            callback(null, localValue);
        })
        record.set('test', 1);
    });
});

With this code, I get a looped function call and data on the server does not update

savinskiy-konstantin commented 3 years ago

I've been able to reproduce the issue by following steps.

  1. Code with updating record by interval
    
    var recordName = 'test';
    const { DeepstreamClient } = window.DeepstreamClient;
    client = new DeepstreamClient(serverUrl, { path: '' } );

client.login({}, function () { var currentUserRecord = client.record.getRecord(recordName); currentUserRecord.whenReady(function (record) { var updateRecord = function() { var currentdate = new Date(); var datetime = currentdate.getMinutes() + ":" + currentdate.getSeconds(); record.set('time', currentdate, err => { if (err) { console.log('Record set with error:', err) } else { console.log('Record set without error') } }); } var updateRecordInterval = setInterval(updateRecord, 1000); }); });


2. Disconnect the connection (disable internet, etc.)
3. Then I get the error in the console and cannot update the record anymore

I think this case should be handled by Deepstream Client
yasserf commented 3 years ago

thank you for the code! yeah that should be covered.

I'm currently a bit busy on another deadline, but will pick this up as soon as I have time.

rom-melnik commented 3 years ago

+1

yasserf commented 3 years ago

I will not have time to do this as I'm working more in the capacity of PR reviewer for now. If anyone feels like fixing this in deepstream I'll be happy to review it.

jaime-ez commented 3 years ago

Hi @savinskiy-konstantin I still can't reproduce the issue. This is my setup:

server.js using server v5.2.2

const { Deepstream } = require('@deepstream/server')

const server = new Deepstream({})
server.start()

index.html

<!DOCTYPE html>
<html>
  <head>
    <script src="https://cdn.jsdelivr.net/npm/@deepstream/client@5.2.1/dist/bundle/ds.min.js"></script>
  </head>
  <body>
    <input type="text" />
    <script type="text/javascript">
      window.onload = function () {
        //js goes here
        const { DeepstreamClient } = window.DeepstreamClient
        const client = new DeepstreamClient('ws://192.168.0.12:6020/deepstream')
        const recordName = 'test'
        client.login({ username: '1' }, () => {
          var currentUserRecord = client.record.getRecord(recordName);
          currentUserRecord.whenReady(function (record) {
              var updateRecord = function() {
                  var currentdate = new Date();
                  var datetime = currentdate.getMinutes() + ":" + currentdate.getSeconds();
                  record.set('time', currentdate, err => {
                      if (err) {
                          console.log('Record set with error:', err)
                      } else {
                          console.log('Record set without error')
                      }
                  });
              }
          var updateRecordInterval = setInterval(updateRecord, 1000);
          })
        })
      }
    </script>
  </body>
</html>

Then I go to index.html on the browser, and see the record without error on the console. Disable wifi, no logs to the console, and if enabled in less than a minute, it reconnects and logs success. else it will log client offline, and after reconnecting continues to log success.

Please let me know which steps you took in order to reproduce the error, cheers

max-rudnev commented 3 years ago

Hello, everyone.

Faced with the same (a similar) issue. Managed to reproduce it using the code from the @savinskiy-konstantin's comment, but:

  1. with a smaller interval of 10 ms;
  2. and running the code in two browser tabs.

Once the code is ran in the second tab and some messages are sent, the following error occurs: Screenshot_1

Server version: 5.2.3. Client version: 5.2.1.

jaime-ez commented 3 years ago

I think the issue is here: https://github.com/deepstreamIO/deepstream.io-client-js/blob/master/src/record/record-core.ts#L845 the stateMachine is already in ready state, we already recovered the record, changin it to:

    if (utils.deepEquals(oldValue, remoteData)) {
      if (this.stateMachine.state !== RECORD_STATE.READY) {
        this.stateMachine.transition(RECORD_OFFLINE_ACTIONS.MERGED)
      }
      return
    }

has worked for me.

Can you check if it works for you ?

max-rudnev commented 3 years ago

Hello, everyone.

@jaime-ez, thank you for your response. Unfortunately, your solution doesn't help in my case. It is still reproducible with the modification.

Also, in my case, the error occurs after transiting the state on this line: https://github.com/deepstreamIO/deepstream.io-client-js/blob/master/src/record/record-core.ts#L851.

jaime-ez commented 3 years ago

I was testing based on the case you mentioned. I think it's weird you're seing the error on the line you mention, since the onRecordRecovered function should be called once, sync the record, and then subsequent calls should exit on the line I mentioned before since the record would be already recovered...can you add the if block I proposed where you see the error ocurring and see what happens?

max-rudnev commented 3 years ago

can you add the if block I proposed where you see the error ocurring and see what happens?

Retested the case. The error occurs without either of the try blocks. It doesn't occur with both the try blocks only: Screenshot_1

jaime-ez commented 3 years ago

Well then put the if block once on line 844 (before both cases tested) and that should cover all out of sync cases. Let us know how it goes

max-rudnev commented 3 years ago

I've retested the case. Now the following error occurs: Screenshot_1 Screenshot_2 when transiting the state on this line: Screenshot_3

yasserf commented 3 years ago

i'll take a look at this this weekend, seems quite a few people are running into it

yasserf commented 3 years ago

So using the code samples above I found the issue if we were transitioning from ready to merged which is invalid. So since the data is already the same (meaning nothing actually changed) I just put a guard to say if it isn't in the merging state then just ignore it.

Hopefully that fixes it, in 5.2.2

max-rudnev commented 3 years ago

@yasserf, thank you for your work. Please notice that in my case the error also occurs in case of non-equal data (please see the comment). Therefore, this case is still reproducible with 5.2.2.

yasserf commented 3 years ago

Can you please add a code segement that replicates the issue (eg: https://github.com/deepstreamIO/deepstream.io-client-js/blob/master/bugs/1078-mergeconflict.ts) and using that we can put in a fix. Thanks

max-rudnev commented 3 years ago

I reproduce the issue on your code with a smaller interval of 10 ms and running it in two browser tabs.

The full code:


<!DOCTYPE html>
<html>
    <head>
        <script src="/js/ds.js"></script>
    </head>
    <body>
        <script type="text/javascript">
          const test = () => {
            var recordName = 'test';

            const { DeepstreamClient } = window.DeepstreamClient;
            client = new DeepstreamClient('wss://<hostname>:6020/', { path: '' });

            client.login({}, function () {
              const currentUserRecord = client.record.getRecord(recordName)
              currentUserRecord.whenReady(function (record) {
                var updateRecord = function () {
                  currentUserRecord.set({data: Math.random()}, err => {
                    if (err) {
                      console.log('Record set with error:', err)
                    } else {
                      console.log('Record set without error')
                    }
                  });
                }
                setInterval(updateRecord, 10);
              });
            });
          };

          if (window.addEventListener) {
            window.addEventListener('load', test, false);
          } else if (window.attachEvent) {
            window.attachEvent('onload', test);
          }
        </script>
    </body>
</html>
jaime-ez commented 3 years ago

Ok then, I think I got it:

Well then put the if block once on line 844 (before both cases tested) and that should cover all out of sync cases. Let us know how it goes

This is correct for eliminating the incorrect state transition that originated this issue: {"transition":"MERGED","state":"READY"}

Now for the second error described by @max-rudnev {"transition":"MERGED","state":"RESUBSCRIBING"} the thing is that the resubscription to the record coming back from the disconnection is handled via the bulkSubscriptionService here

Since the default subscriptionInterval is 100 ms, thus if we are updating with an interval less than 100 ms will have a "race condition" that generates that invalid state transition. This can be fixed by initialising the client with the option subscriptionInterval: 0.

Please test and let us knwo if it works.[

yasserf commented 3 years ago

hey!

I put in a fix in 5.2.3 that basically stops invalid merge conflicts from happening. However as correctly stated by @jaime-ez, the issue is due to a race condition we don't have much control with. If your updating things at a rate of 10milliseconds a second with an async merge conflict handler we just drop merge conflicts that happen concurrently. Coming up with a better solution is overkill + it will record on the next update anyways.

Hopefully that solves this issue. It will log a warning so that you can be aware how often it happens. My advice would be not to send updates quicker than a single screen refresh frame (or ideally a few) if you want to avoid that behavior.

jaime-ez commented 3 years ago

Closing for now, please reopen if other cases arise.