eclipse-emfcloud / emfcloud-modelserver

Modelserver component
Other
43 stars 21 forks source link

FileModelWatcher seems to live forever #192

Closed eneufeld closed 2 years ago

eneufeld commented 2 years ago

I'm encountering weird error logs in my applications running the modelserver lately. The logs look like this:

root ERROR ModelServerBackendContribution: Exception in thread "Thread-15" Exception in thread "Thread-14" Exception in thread "Thread-27" Exception in thread "Thread-21" 
root ERROR ModelServerBackendContribution: java.lang.IllegalStateException: Cannot modify resource set without a write transaction
    at org.eclipse.emf.transaction.impl.TransactionChangeRecorder.assertWriting(TransactionChangeRecorder.java:349)
root ERROR ModelServerBackendContribution: 
    at org.eclipse.emf.transaction.impl.TransactionChangeRecorder.appendNotification(TransactionChangeRecorder.java:303)
    at org.eclipse.emf.transaction.impl.TransactionChangeRecorder.processResourceNotification(TransactionChangeRecorder.java:273)
    at org.eclipse.emf.transaction.impl.TransactionChangeRecorder.notifyChanged(TransactionChangeRecorder.java:239)
root ERROR ModelServerBackendContribution: 
    at org.eclipse.emf.common.notify.impl.BasicNotifierImpl.eNotify(BasicNotifierImpl.java:424)
    at org.eclipse.emf.common.notify.impl.NotifyingListImpl.dispatchNotification(NotifyingListImpl.java:261)

root ERROR ModelServerBackendContribution:  at org.eclipse.emf.common.notify.impl.NotifyingListImpl.clear(NotifyingListImpl.java:1099)
    at org.eclipse.emf.ecore.resource.impl.ResourceImpl.doUnload(ResourceImpl.java:1699)
    at org.eclipse.emf.ecore.xmi.impl.XMLResourceImpl.doUnload(XMLResourceImpl.java:720)

root ERROR ModelServerBackendContribution:  at org.eclipse.emf.ecore.resource.impl.ResourceImpl.unload(ResourceImpl.java:1721)
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultModelResourceManager.closeResource(DefaultModelResourceManager.java:300)
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultModelRepository.closeModel(DefaultModelRepository.java:157)
root ERROR ModelServerBackendContribution: 
    at org.eclipse.emfcloud.modelserver.emf.common.watchers.ReconcilingStrategy$AlwaysReload.reconcileModel(ReconcilingStrategy.java:60)

root ERROR ModelServerBackendContribution:  at org.eclipse.emfcloud.modelserver.emf.common.watchers.AbstractModelWatcher.reconcile(AbstractModelWatcher.java:52)
root ERROR ModelServerBackendContribution: 
    at org.eclipse.emfcloud.modelserver.emf.common.watchers.FileModelWatcher.handleEvent(FileModelWatcher.java:167)
    at org.eclipse.emfcloud.modelserver.emf.common.watchers.FileModelWatcher.run(FileModelWatcher.java:129)
    at java.base/java.lang.Thread.run(Thread.java:829)

Debugging the ModelServer I see this behavior: Whenever I refresh the page the FileModelWatcher.Factory is called for the model resources and creates a new ModelWatcher. Did I miss a destruction call somewhere or do I need to configure something differently?

@vhemery do you have an idea?

vhemery commented 2 years ago

About the closure, the FileModelWatcher should be stopped by the adapter in org.eclipse.emfcloud.modelserver.emf.common.watchers.AbstractModelWatcher.addStopAdapter() method. You can try adding checkpoints there, to check whether :

About the log stack itself, EMF transaction is complaining that we are modifying a resource set while not in a transaction. The org.eclipse.emfcloud.modelserver.emf.common.DefaultModelResourceManager.closeResource(String) method seems to be in cause, as it does not perform the closing and reloading in a command. (My custom implementation, which was using a transactional editing domain before the default one, does wrap it in a command. I think I implemented it before noticing the model server was going full transactional.)

So I think we should just wrap https://github.com/eclipse-emfcloud/emfcloud-modelserver/blob/5808b7e7fb0b17a3eda8ab115ae34b6f4cdfebbd/bundles/org.eclipse.emfcloud.modelserver.emf/src/org/eclipse/emfcloud/modelserver/emf/common/DefaultModelResourceManager.java#L300 to https://github.com/eclipse-emfcloud/emfcloud-modelserver/blob/5808b7e7fb0b17a3eda8ab115ae34b6f4cdfebbd/bundles/org.eclipse.emfcloud.modelserver.emf/src/org/eclipse/emfcloud/modelserver/emf/common/DefaultModelResourceManager.java#L312 in a command execution.

Then, the close shouldn't fail, hence the resource is correctly closed, and finally the FileModelWatcher is stopped naturally.

More generally, we should ensure in DefaultModelResourceManager (and other implementations), that no operation ever modifies the resource set without a transaction / being executed in a command.

vhemery commented 2 years ago

Since I did the test with the client (and not my own), I'm also surprised I didn't encountered this on the test client... It should have failed there too... 🤔

eneufeld commented 2 years ago

Thank you for the fast reply. I try debugging with your pointers. The error happens for me when I refresh the client at least once and trigger a save on a resource.

vhemery commented 2 years ago

What do you mean by "refresh the client" ? Is it modifying the file and resource reloading is triggerred by FileModelWatcher ? Or refreshing the UI frontend ? I suspect the resource is loaded twice, in which case the save of the second resource would trigger the FileModelWatcher reloading of the first resource...

eneufeld commented 2 years ago

By refresh the client I meant refreshing the UI frontend by reloading the webpage.

Yes your suspicion makes sense. I will debug now and let's see what I find.

eneufeld commented 2 years ago

So I added the breakpoints you suggested. And refreshing the webpage (eg pressing F5) does not remove the watcher.

vhemery commented 2 years ago

Yes, but does it load another resource ? This should not be an issue if the same resource is still used. But that is an issue if we load the same resource twice.

Nevertheless, the close method is still bugged and should be fixed as suggested in my first comment. (But that's a good revealer for an eventual major leak)

eneufeld commented 2 years ago

I think I found the missing part, the resources were not unloaded when a new configuration was set. See the linked PR.

vhemery commented 2 years ago

I think I found the missing part, the resources were not unloaded when a new configuration was set. See the linked PR.

That should solve the leak (which was the hardest part to detect). 👍 But the close method should still be fixed to use a command... or the same issue could raise again in a legit context, where we only want to update the resource after an external change.

eneufeld commented 2 years ago

good point, I improve this.