letsfindaway / OpenBoard

I'm using this fork to contribute features and fixes to the upstream project. In order to create good pull requests, I'm rebasing my feature branches, squashing and reordering commits, etc. If you fork this repository be aware that my development branches may rewrite history without prior notice.
http://openboard.ch/
GNU General Public License v3.0
9 stars 0 forks source link

[Bug] Asynchronous loading of adjacent pages fails #171

Closed letsfindaway closed 2 weeks ago

letsfindaway commented 6 months ago

Describe the bug

When opening a document with multiple pages from the document view, you can see the following message in the log output:

QObject::connect: Cannot queue arguments of type 'std::shared_ptr<UBDocumentProxy>'
(Make sure 'std::shared_ptr<UBDocumentProxy>' is registered using qRegisterMetaType().)

The reason is that adjacent pages are asynchronously loaded using the UBPersistenceWorker. When it is finished, it emits https://github.com/letsfindaway/OpenBoard/blob/0e69a0a03248cbb2a5914fdbbc04772869c5836a/src/core/UBPersistenceWorker.cpp#L81 The second parameter to this signal is of type std::shared_ptr<UBDocumentProxy>, which is not known to the meta type system and can therefore not be sent over a queued connection.

To Reproduce

  1. Start OpenBoard, create a multi-page document and close it again
  2. Start OpenBoard a second time
  3. Go to document mode and select the previously created document
  4. Click on "Opening Board"

Expected behavior

The first (or the selected) document page should appear. Adjacent pages should be loaded asynchronously to the scene cache.

Actual behavior

The first (or the selected) document page appears as expected. But loading adjacent pages fails and is broken in several aspects:

Additional resources

The following are just some assumptions, but might provide background for further analysis.

Context

letsfindaway commented 6 months ago

Registering the metatype solves the issue with the signal:

qRegisterMetaType<std::shared_ptr<UBDocumentProxy>>("std::shared_ptr<UBDocumentProxy>");

But I would also like to use the loaded text for creating the scene. Are there any known reasons why this was not done and instead the loaded text explicitly marked as Q_UNUSED?

The line in the code comes from commit 7dea264ef25dd9e42195fa22c05a7e35a3896d5b, where @kaamui restored Watsaig's work on persistence, activating the UBPersistenceWorker.

letsfindaway commented 6 months ago

Scene identification by index is unsafe

Note also that in this asynchronous case the scene index does not uniquely identify the scene. In the (unlikely) case that the user inserted another scene before the worker thread is ready and emitted the signal, the scene indexes have shifted by one and the index delivered in the signal no longer identifies that scene.

We should think about another scene identification which can be created for each index after reading the metadata (and for each newly created scene) and which is unique and does not depend on the index of the scene.

As a scene is a UBItem and the UUID is also stored in the SVG document, would it be better to identify them by their UUID, at least in this case? But we don't know the UUID before we have loaded the scene. One approach could be to "peek" into the document file, reading the first two lines as we know that the UUID is in the second line with the <svg> element.

Note that a scene index is perfect in synchronous use cases, so I don't propose to change that in general. It is just not sufficient in an asynchronous use case.

kaamui commented 6 months ago
  • For some unknown reason however loadDocumentScene is used, which again reads the scene from disk. Why not using

I remember that I had to do this change to prevent data loss, but can't remember the use cases.

letsfindaway commented 6 months ago
  • For some unknown reason however loadDocumentScene is used, which again reads the scene from disk. Why not using

I remember that I had to do this change to prevent data loss, but can't remember the use cases.

One could argue that after reading the file content once in the worker thread, it is most probably cached, so that subsequent reads are faster. In this case it is not necessary to send the data over the signal. The trigger using the proxy and scene index is sufficient.

Then the above mentioned problem with the scene index in asynchronous scenarios vanishes. The worst thing which can happen is that the wrong page is cached.

The above-mentioned commit where this code was added occurred before we switched to shared_ptr for the proxy. So at this time the signal was delivered. To restore this situation I would propose to

letsfindaway commented 6 months ago

... or we just remove all that stuff. See https://github.com/letsfindaway/OpenBoard/issues/163#issuecomment-1985840017. The creation of the thumbnails loads all document scenes anyway. So there is no benefit from trying to load adjacent pages in a worker thread.

If we would like to improve this, then we have to use the static pixmaps of the pages to initially fill the board thumbnails and only replace them with the live view when a scene becomes active.

letsfindaway commented 6 months ago

To decide how to proceed it is first important to ask what the problem actually is. It it just a log message we want to get rid of? Or is it a performance issue in networked environments? If yes, is it actually the problem to load all the SVG files over the network? what about the images, videos and other objects, which are currently not addressed by asynchronous loading? This is currently not clear to me, but the solution depends on the answer to that question. It could be

My current view on this:

To come back to the first sentence of this comment: This all is only valid if the assumption is correct that network data transfer is a bottleneck and affects user experience.

letsfindaway commented 5 months ago

The issue https://github.com/OpenBoard-org/OpenBoard/issues/918 opens another perspective to this issue - and increases importance. With this huge document provided by the issue author you can see that the performance problem not only occurs in networked environments, but also with local data. Creating ten thousands of stroke objects requires some time. All the documents together have more than 340000 strokes!

grep "<polygon" *.svg |wc -l
346529

So my conclusion is now:

Modify thumbnail building and use the pixmaps in a first step and replace them by the scene only when a page is active on the board. The pixmaps are probably loaded anyway for the document mode.

The author of that bug also mentions:

Also it eats up all my memory which is why it locks up.

This moves #135 into my focus. To avoid such situations where OpenBoard uses all available memory we have to improve the retention policy of the scene cache. So additionally of what I wrote above: