Closed tinevez closed 5 years ago
@tinevez There is a problem with exporting/importing/exporting/importing/... to TrackMate/MaMuT. See https://github.com/fiji/TrackMate3/pull/106/commits/4cbc7548fb4d498ceebb8d173b94386fa126af3d for a demonstration.
Feature names get more and more inflated, e.g.,
`Track N spots` → `Track_N_spots` → `TrackMate_Spot_features_Track_N_spots` → `TrackMate_Spot_features_TrackMate_Spot_features_Track_N_spots` → ...
The solution should probably be re-exporting the `TrackMateImportedFeatures` with their imported names. Then there will be a name collision between `TrackMateImportedFeatures:Track_N_Spots` and the export name of the Mastodon `Track N spots` feature. I would resolve that by letting the Mastodon feature take precedence, i.e., the `TrackMateImportedFeatures:Track_N_Spots` is only re-exported if `Track N spots` is not in the `FeatureModel`.
Ok I will work on this.
The failed tests on Travis are probably all due to AWT being used. I modified one of the "erroring" tests to explicitly catch all exceptions and print to stdout. https://github.com/fiji/TrackMate3/pull/106/commits/ec34e21ebb4482c7f9f0d5530ac06f04cae9ebb6
Travis says:
Running org.mastodon.feature.update.UpdateStackSerializationSeriesTest
WARNING: Creating property map for a collection/pool that does not manage PropertyMaps!
WARNING: Creating property map for a collection/pool that does not manage PropertyMaps!
WARNING: Creating property map for a collection/pool that does not manage PropertyMaps!
WARNING: Creating property map for a collection/pool that does not manage PropertyMaps!
Jul 18, 2019 11:21:37 AM java.util.prefs.FileSystemPreferences$1 run
INFO: Created user preferences directory.
TrackScheme style file /home/travis/.mastodon/trackschemestyles.yaml not found. Using builtin styles.
Bdv style file /home/travis/.mastodon/rendersettings.yaml not found. Using builtin styles.
ColorMap file /home/travis/.mastodon/colormaps.yaml not found. Using builtin colormaps.
Feature color mode file /home/travis/.mastodon/colormodes.yaml not found. Using builtin styles.
Keymap list file /home/travis/.mastodon/keymaps//keymaps.yaml not found. Using builtin styles.
java.awt.HeadlessException:
No X11 DISPLAY variable was set, but this program performed an operation which requires it.
at java.awt.GraphicsEnvironment.checkHeadless(GraphicsEnvironment.java:204)
at java.awt.Window.<init>(Window.java:536)
at java.awt.Frame.<init>(Frame.java:420)
at java.awt.Frame.<init>(Frame.java:385)
at javax.swing.SwingUtilities$SharedOwnerFrame.<init>(SwingUtilities.java:1763)
at javax.swing.SwingUtilities.getSharedOwnerFrame(SwingUtilities.java:1838)
at javax.swing.JDialog.<init>(JDialog.java:272)
at javax.swing.JDialog.<init>(JDialog.java:206)
at org.mastodon.revised.mamut.TgmmImportDialog.<init>(TgmmImportDialog.java:62)
at org.mastodon.revised.mamut.ProjectManager.<init>(ProjectManager.java:112)
at org.mastodon.revised.mamut.WindowManager.<init>(WindowManager.java:163)
at org.mastodon.feature.update.UpdateStackSerializationSeriesTest.createProjectWithPendingChanges(UpdateStackSerializationSeriesTest.java:96)
at org.mastodon.feature.update.UpdateStackSerializationSeriesTest.test(UpdateStackSerializationSeriesTest.java:75)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:59)
at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:115)
at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:102)
at org.apache.maven.surefire.Surefire.run(Surefire.java:180)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:350)
at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1021)
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.043 sec <<< FAILURE!
A simple workaround is disabling these tests on Travis, something like this: https://github.com/bigdataviewer/bigdataviewer-vistools/blob/d6479aef7e4994e94b68d9c3aee6f160a188d6e4/src/test/java/bdv/util/BdvHandlePanelGarbageCollectionTest.java#L22
A better fix would be to make `ProjectManager` etc. able to run headless. I'm not sure how difficult that is. Maybe for now we just create an issue for it?
I would be in favor of not disabling the tests. If we disable them, we fix the Travis error, but nothing good comes of it.
True, but in principle that can be done after merging this PR. (The problem is not in anything touched by this PR.)
Alright @tpietzsch this should be fixed now. Sorry it took me so long.
Thanks! I fixed the tests to run headless.
Dear Tobias,
Here is my latest attempt at getting feature serialization right. For several of the projects that use Mastodon and that I could witness or pilot, it turned out to be an important part of the scientific workflows we want to address.
I try to put this PR description in a shape that can be recycled for the documentation or the Materials and Methods of the future paper. In its current shape, though, it is addressed to you.
Feature serialization and incremental updates.
Mastodon's central ambition is to harness the possibly very large data generated when analyzing large images. The Mastodon feature framework offers numerical (and non-numerical) data defined for the objects of the Mastodon-app model, and features can be defined and created by third-party developers. The framework should live up to Mastodon's ambition and facilitate harnessing large data too.
Feature serialization.
Some features can take a very long time to compute on large models, e.g. the spot Gaussian-filtered intensity: roughly speaking, a batch of 1000 spots takes about a second to compute. The time invested in computing these feature values should not be lost when the model is reloaded in a later session, so the Mastodon-app must offer serialization of feature values along with the model.
I remember that in my first attempt you did not like that the feature computers or the feature classes themselves carried information and/or methods related to serialization. To design the current solution, I simply adapted the design we already have for feature computers.
Feature serializers.
The central interface for classes that can de/serialize a feature is `org.mastodon.feature.io.FeatureSerializer`. It is typed with the feature type and the feature target type (vertex or edge). And it is a `SciJavaPlugin`, because of course we want feature serializers to be discoverable by a specialized service, as for feature computers. The interface defines 3 methods: one returns the spec of the feature we can de/serialize, and there are also de/serialization methods based on object streams.
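Since the actual interface is not reproduced here, a minimal self-contained sketch of such a contract may help; all names and types below are illustrative stand-ins, not Mastodon's real API.

```java
import java.io.*;
import java.util.*;

// Hypothetical sketch (names are mine, not Mastodon's): a serializer knows
// which feature it handles and can round-trip it through object streams.
public class FeatureSerializerSketch {

    /** A toy "feature": a map from object id to a double value. */
    public static class ToyFeature {
        public final Map<Integer, Double> values = new HashMap<>();
    }

    /** Simplified serializer contract, typed on the exact feature class. */
    public interface Serializer<F> {
        String featureKey(); // the spec of the feature we can de/serialize
        void serialize(F feature, ObjectOutputStream oos) throws IOException;
        F deserialize(ObjectInputStream ois) throws IOException, ClassNotFoundException;
    }

    /** One concrete serializer per feature class, returning the exact type. */
    public static class ToyFeatureSerializer implements Serializer<ToyFeature> {
        @Override
        public String featureKey() {
            return "Toy feature";
        }

        @Override
        public void serialize(ToyFeature f, ObjectOutputStream oos) throws IOException {
            oos.writeObject(new HashMap<>(f.values));
        }

        @Override
        @SuppressWarnings("unchecked")
        public ToyFeature deserialize(ObjectInputStream ois) throws IOException, ClassNotFoundException {
            ToyFeature f = new ToyFeature(); // exact feature type, not a super-class
            f.values.putAll((Map<Integer, Double>) ois.readObject());
            return f;
        }
    }

    /** Serialize then deserialize through an in-memory buffer. */
    public static ToyFeature roundTrip(ToyFeature f) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                new ToyFeatureSerializer().serialize(f, oos);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return new ToyFeatureSerializer().deserialize(ois);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The round-trip helper only shows the shape of the contract: a key identifying the spec, plus stream-based de/serialization.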
Note that the `deserialize` method returns the exact feature type, and not a super-class. We want serializers to produce an exact feature of the right class, so we will need one feature serializer for every feature we define. This means we are not able to write generic serializers for generic features.
Here is how a serializer looks for a simple feature like `SpotNLinksFeature`: nothing special; we reuse the property map serializers you made for the model serialization. Note however that the serializer must have access to the said property map for serialization (`default` visibility of the final field `map`) and to a constructor that accepts a property map for deserialization (also `default` visibility). Note the `@Plugin` annotation; this is how the feature serializer service will pick it up.
This way we resemble the `FeatureComputer` framework a lot. The cool thing is that a feature does not have to know it is serializable, and I could make most of the Mastodon-app features serializable without modifying the feature classes. Now a feature with a computer and a serializer looks like this in Eclipse:
The feature serializer service.
Again, we emulate what we have for feature computation, but simpler. There is a `FeatureSerializationService` interface that has a default implementation in `DefaultFeatureSerializationService`. Both of them are generic in terms of target object class. The interface defines a single method that returns a feature serializer for a given feature specification. That's it.
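As a hedged sketch of what such a lookup amounts to (illustrative names again; in Mastodon, discovery goes through SciJava plugin discovery rather than the manual registration used here):

```java
import java.util.*;

// Hypothetical sketch of a feature-serialization lookup service:
// it maps a feature specification (here just a String key) to its serializer.
public class SerializationServiceSketch {

    public interface Serializer {
        String featureKey();
    }

    private final Map<String, Serializer> byKey = new HashMap<>();

    /** In Mastodon discovery would go through SciJava plugins; here we register by hand. */
    public void register(Serializer s) {
        byKey.put(s.featureKey(), s);
    }

    /** The single lookup method: a serializer for a given feature spec, or null. */
    public Serializer serializerFor(String featureKey) {
        return byKey.get(featureKey);
    }
}
```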
Feature serialization in Mastodon-app.
Now, let's orchestrate serialization and deserialization of features for the Mastodon-app, along with the `Model` serialization.
Serialization.
I modified the `ProjectWriter` interface in the `MamutProject` class so that it now has a new method that returns a new output stream for the feature with the specified key. In practice, both the folder and the zip versions of the writer create a folder `features` and store feature data in files named after the feature key, with a `.raw` extension appended. For instance, after serialization you will find the following in a `.mastodon` zip file or in a project folder:
The actual serialization logic lives in the class `MamutRawFeatureModelIO` and requires: a `Context` (to get the feature serialization service), the `FeatureModel`, the `GraphToFileIdMap< Spot, Link > idmap` generated when serializing the model, and the `ProjectWriter` (used to generate the output streams for each feature). Because of these arguments, this method is called in the `ProjectManager` class, in the `saveProject(File projectRoot)` method.
Also, because we need the `GraphToFileIdMap< Spot, Link >`, I changed the `model.saveRaw( writer )` method to return it instead of `void`. I could not find a way to call the feature serialization logic directly in this method, mainly because we need the SciJava `Context`, which is not a field of the `Model` class. Another possibility would be to pass the `FeatureSerializationService` to the `saveRaw()` method.
Deserialization.
The deserialization happens in a similar way, but has extra logic. First we need to know which features to deserialize, so the `ProjectReader` interface has a new method that returns the keys of the features saved in the project. From each of these keys, the `ProjectReader` can generate an input stream. The deserialization logic also lives in the `MamutRawFeatureModelIO` class and is also called from the `ProjectManager` class. Here is how, roughly: we get the `FeatureSerializationService`; we use the `FeatureSpecsService` to retrieve feature specs from feature keys; we get the `FeatureSerializer` for each spec; depending on whether it is a `Spot` feature or an `Edge` feature, we pass the correct `FileIdToObjectMap< O >` instance to the serializer; and we store the deserialized feature in the `FeatureModel`. That's it.
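The on-disk layout described above, with one `features/<key>.raw` entry per feature, can be illustrated with a self-contained sketch; `java.util.zip` stands in for the project writer/reader, and the keys are purely illustrative.

```java
import java.io.*;
import java.util.*;
import java.util.zip.*;

// Hypothetical sketch of the layout: one features/<key>.raw entry per
// feature inside the project zip (keys here are illustrative).
public class FeatureZipSketch {

    /** Write one zip entry per feature key, with the given raw payloads. */
    public static byte[] writeProject(Map<String, byte[]> features) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> e : features.entrySet()) {
                zos.putNextEntry(new ZipEntry("features/" + e.getKey() + ".raw"));
                zos.write(e.getValue());
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** List the feature keys stored in the project, as a reader would. */
    public static List<String> listFeatureKeys(byte[] project) {
        List<String> keys = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(project))) {
            for (ZipEntry e; (e = zis.getNextEntry()) != null; ) {
                String name = e.getName();
                if (name.startsWith("features/") && name.endsWith(".raw"))
                    keys.add(name.substring("features/".length(), name.length() - ".raw".length()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return keys;
    }
}
```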
Example.
The file `org.mastodon.mamut.feature.SerializeFeatureExample` in `src/test/java` gives an example of serialization / deserialization of features.
Incremental updates during feature computation.
Serializing features is a good first step toward conveniently harnessing large data. Thanks to serialization, the time spent on feature computation is not lost when we save and reload the data. However, it is not enough on its own.
Mastodon is not limited to being a viewer of large data; it allows editing the model at any scale. You can run the detection and linking algorithms and create a large amount of data, but you can also do single-spot editing and, for instance, change the position of one spot within a model made of several millions of them.
The possibility to edit single vertices or edges - point-wise editing - in Mastodon creates a challenge for feature computation. Contrary to TrackMate, the Mastodon-app does not keep the features in sync with point-wise editing: feature computation must be triggered manually by the user. This is a choice we made based on our experience with TrackMate, which becomes much less responsive when editing large models. Therefore, as soon as we make an edit, the feature values become invalid until the user recomputes them. This is fine, but if the model is very large, we then need to spend a large amount of time on feature computation, while in reality possibly only a small number of objects have changed. The feature incremental update mechanism aims at solving this problem.
To rely on incremental updates, a feature computer has to declare a dependency on a special feature that returns the collection of vertices or edges that changed since the last time the feature was computed. This way, it can process only these objects and not the full model.
I give some details of the incremental update mechanism below, in reverse order of how the pieces work:
- Using incremental updates in a feature computer.
- Incremental update and feature serialization.
- How the update stack objects are built when the user edits the model and triggers feature computation.
- How the Mastodon-app wires listeners on model changes to the update stacks to properly build them.
The first two paragraphs are good enough for developers who want to implement their own feature computer based on incremental updates. The following two give information about how the incremental update mechanism works in the Mastodon-app.
Using incremental updates in a feature computer.
Roughly speaking, a feature computer that supports incremental updates looks like the following. It has a dependency on a special input, declared with the SciJava `@Parameter` annotation:
This `update` contains the collection of spots that changed since the last feature computation. A similar class exists for links. To get the changes for our feature, we need to call:
where `FEATURE_SPEC` is the feature specification object. If the value of `changes` is `null`, then the feature value must be recomputed for the full model. If not, the changes can be retrieved and used for computation. The `Update` class has two main public methods:
The `get()` method returns the collection of objects that were modified (created, moved or edited). The `getNeighbors()` method returns the collection of the neighbors of the objects that were modified. For instance, if you move a spot, it appears in `get()` of the spot update, and its links appear in `getNeighbors()` of the link update.
It is up to the feature computer to decide how to use these collections to recompute values. For instance, a generic feature computer that uses the incremental computation mechanism for spots would work like this:
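Since the real computer class is not shown here, a minimal stand-alone sketch of the pattern may help; plain integer ids stand in for spots, and a toy `Update` class mimics the `get()`/`getNeighbors()` pair. None of these names are Mastodon's actual API.

```java
import java.util.*;

// Hypothetical sketch of a feature computer using incremental updates:
// if no recorded changes exist (null), recompute everything; otherwise
// recompute only the changed objects and their neighbors.
public class IncrementalComputerSketch {

    /** Changes since the last computation of a given feature. */
    public static class Update {
        private final Set<Integer> changed;
        private final Set<Integer> neighbors;

        public Update(Set<Integer> changed, Set<Integer> neighbors) {
            this.changed = changed;
            this.neighbors = neighbors;
        }

        public Set<Integer> get() { return changed; }            // created / moved / edited
        public Set<Integer> getNeighbors() { return neighbors; } // neighbors of the above
    }

    /** Feature values: object id -> value. Kept across computations. */
    public final Map<Integer, Double> output = new HashMap<>();

    private final List<Integer> allObjects;

    public IncrementalComputerSketch(List<Integer> allObjects) {
        this.allObjects = allObjects;
    }

    /** Recompute values, using the incremental update when available. */
    public void run(Update changes) {
        Collection<Integer> toProcess;
        if (changes == null) {
            toProcess = allObjects; // full recomputation
        } else {
            Set<Integer> ids = new HashSet<>(changes.get());
            ids.addAll(changes.getNeighbors());
            toProcess = ids;        // only what changed
        }
        for (int id : toProcess)
            output.put(id, compute(id));
    }

    private double compute(int id) {
        return 2.0 * id; // stand-in for real feature math
    }
}
```

The key design point this sketch illustrates: the `output` map survives across `run()` calls, so an incremental pass overwrites only the stale entries.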
The `SpotGaussFilteredIntensityFeatureComputer` is an example of such a feature computer (its logic is a bit more complicated, because it sorts the spots per frame before computation).
Incremental update and feature serialization.
Note that feature computers that use incremental feature updates must always operate on the same feature instance; otherwise the feature exposed in the feature model would only contain values for the last incremental update.
So special care must be taken when using incremental updates on features that can be serialized. The feature deserialization will produce a new instance of the feature, and the feature computer has a `#createOutput()` method that can do so too, resulting in a conflict. For this reason, it is wise for the feature computer's `#createOutput()` method to check whether an instance of the desired feature already exists in the feature model. For instance, in the `SpotGaussFilteredIntensityFeatureComputer` we have:
The same caution must be applied to features that are not computed (with a `FeatureComputer`) but updated elsewhere by other processes. For instance, the `DetectionQualityFeature` (in `mastodon-tracking`) keeps track of the quality value of the spots that are created by a `SpotDetectorOp`. It is serializable and therefore has a static method that works similarly:
Building an incremental update stack.
In this paragraph we explain how the `SpotUpdateStack` and `LinkUpdateStack` are built as feature computation happens. We assume that they are wired to listeners that update them with model changes; how this is done is described in the next paragraph.
A difficulty in returning the right changes to a feature computer is that in the Mastodon-app the user is free to select which features they want computed. So the feature model can be up-to-date for some features and not for others. When we call `#changesFor( FEATURE_SPEC )` in the feature computer, the update stack must return the collection of spots that were changed or added in the model since the last time the feature with spec `FEATURE_SPEC` was calculated. For instance, consider a model made of only two spots `s1` and `s2`, for which two features named `A` and `B` can be computed, both using the incremental update mechanism:
The update stack is the core component of the incremental update mechanism. It is made of a stack of update items. Each update item works like a map, with a collection of feature specs as key and, as value, a collection of graph objects (vertices for the `SpotUpdateStack`) that were modified or added since those features were calculated. In reality it is more a `Pair` than a `Map`, but I use this vocabulary here.
In the example described in the drawing above, at t0 the features are not computed. The update stack is initialized with a single item with an empty key and an empty value.
At t1, the user triggers a computation of both features `A` and `B`. Because none of the feature specs for `A` and `B` can be found in the update stack, the `#changesFor( FEATURE_SPEC )` method returns `null`, which triggers a computation of the feature values for all the spots of the model.
At t2, once the computation is finished, a new update item is pushed on the stack. It is initialized with an empty collection for its value, and the specs of the features that were calculated (`A` and `B`) are stored as its key.
At t3 the user moves the spot `s1`. Because listeners are wired to the update stack, `s1` is added to the value collection of the top item of the stack, that is, the one with the `A` and `B` specs as key that was created after the `A` and `B` computation. Both features `A` and `B` are marked as not up-to-date.
At t4 the user triggers the computation of feature `A` only. Since the feature computer for `A` uses incremental feature updates, it queries the changes for its feature. A call to `#changesFor( A )` iterates the stack from the top: the key of the top item contains `A`, so iteration stops and the method returns the value of this item. This value contains `s1`, so the feature computer recomputes the feature value only for `s1`.
At t5, after computation, a new update item is pushed on the update stack. Since we computed only `A`, its key contains only the `A` spec. As before, the item is initialized with an empty collection as value. `A` is marked up-to-date.
At t6 the user moves the spot `s2`. As before, it is added to the value collection of the top item of the stack; this time, it is the one with the `A` spec as key.
Now at t7 the user wants to compute feature `B`, which has not been up-to-date since t2. Again, the feature computer for `B` uses incremental feature computation. The call to `#changesFor( B )` does the following: the key of the top item does not contain the `B` spec, so we move down to the next item; the key of that item does contain the `B` spec, so we stop there. Its value contains `s1`. We collect the changes of the items above it, which contain `s2`, and concatenate them with it. We end up with a collection made of `{ s1, s2 }`. The feature computer then recomputes `B` only for `s1` and `s2`, which makes it up-to-date.
After this computation (t8) a new update item is pushed on the update stack, with the `B` spec as key.
And it goes on like this. If, after the steps exemplified here, the user recomputed all features, the changes for `B` would be empty, and the changes for `A` would be built by iterating down to the second item of the stack, which contains only `s2`.
The stack itself has a limited capacity: it can store 10 update items, after which the oldest items are discarded. This results in triggering a full computation for 'forgotten' feature updates.
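The bookkeeping walked through above can be sketched as a small stand-alone class. String keys stand in for feature specs and object ids; this is my illustration of the idea, not Mastodon's actual `SpotUpdateStack`.

```java
import java.util.*;

// Hypothetical sketch of the update-stack bookkeeping described above.
// Each item pairs the feature keys computed at that point (its key) with
// the set of objects modified afterwards (its value); changesFor() walks
// the stack from the top, concatenating changes until it finds the feature.
public class UpdateStackSketch {

    private static class Item {
        final Set<String> features;                  // features computed when pushed
        final Set<String> changes = new HashSet<>(); // objects modified since then

        Item(Set<String> features) {
            this.features = features;
        }
    }

    private static final int CAPACITY = 10;
    private final Deque<Item> stack = new ArrayDeque<>();

    public UpdateStackSketch() {
        stack.push(new Item(new HashSet<>())); // t0: nothing computed yet
    }

    /** Called by graph listeners when an object is created, moved or edited. */
    public void addChange(String objectId) {
        stack.peek().changes.add(objectId);
    }

    /** Called after features were computed: push a fresh item keyed by them. */
    public void commit(Set<String> computedFeatures) {
        stack.push(new Item(computedFeatures));
        while (stack.size() > CAPACITY)
            stack.removeLast(); // forget the oldest updates
    }

    /**
     * Changes since the given feature was last computed, or null if the
     * feature is nowhere in the stack (full recomputation needed).
     */
    public Set<String> changesFor(String feature) {
        Set<String> collected = new HashSet<>();
        for (Item item : stack) { // iterates from top to bottom
            collected.addAll(item.changes);
            if (item.features.contains(feature))
                return collected;
        }
        return null;
    }

    /** Convenience for building small key sets. */
    public static Set<String> setOf(String... xs) {
        return new HashSet<>(Arrays.asList(xs));
    }
}
```

Replaying the t0-t8 scenario against this sketch reproduces the behavior described in the text: `null` at t1, `{ s1 }` at t4, and `{ s1, s2 }` at t7.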
Registering the incremental update in feature computation.
The following describes how we provide the changes to the feature computers, specifically for the Mastodon-app (`Spot` and `Link`).
When the `MamutFeatureComputerService` receives the `Model` instance to operate on, the update stacks are created first. They are created via the static methods `getOrCreate()`, and we will explain why later. We then add several listeners to the graph. These listeners are defined in the `GraphFeatureUpdateListeners` class and feed changes of the graph to the two update stacks. For instance, the one that listens to vertex properties is as follows:
This ensures that the two update stacks receive the changes.
Also, the update stack class that is the super class of the spot and link update stacks implements `Feature< O >`:
This will be important for serialization, as we will see below. It also has the advantage that we do not have to do anything special to provide it to feature computers. Since it is a feature, it will be provided to feature computers that declare it as a dependency, like any other feature.
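As a hedged, self-contained sketch of this wiring (a toy graph and a toy stack; Mastodon's real listener interfaces differ):

```java
import java.util.*;

// Hypothetical sketch of wiring model listeners to an update stack:
// every vertex-property change is forwarded to the stack's pending set.
public class ListenerWiringSketch {

    /** Minimal stand-in for the update stack: just collects changed ids. */
    public static class Stack {
        public final Set<Integer> pending = new HashSet<>();
    }

    public interface VertexPropertyListener {
        void propertyChanged(int vertexId);
    }

    /** Minimal graph that notifies listeners on edits. */
    public static class Graph {
        private final List<VertexPropertyListener> listeners = new ArrayList<>();

        public void addVertexPropertyListener(VertexPropertyListener l) {
            listeners.add(l);
        }

        /** An edit that changes a vertex property, e.g. moving a spot. */
        public void moveVertex(int vertexId) {
            for (VertexPropertyListener l : listeners)
                l.propertyChanged(vertexId);
        }
    }

    /** Wire the stack to the graph, as the computer service would. */
    public static void wire(Graph graph, Stack stack) {
        graph.addVertexPropertyListener(stack.pending::add);
    }
}
```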
Serialization of the incremental update state.
The incremental update mechanism brings new programming challenges when combined with feature serialization.
Indeed, feature computation is triggered by the user on demand, so the feature values might not be up-to-date with the model when we serialize them. In such cases, the features would have to be recomputed for the whole model after reloading. This voids the advantage brought by incremental feature computation: we lose the benefit of recomputing features for just the spots and links that were modified, and the time spent on computing is lost after saving the model.
The solution is evident: we have to serialize the update stacks along with the model and the feature values. This is what is done currently when the model is saved. We give here details about how this happens.
Update stack objects are features.
`SpotUpdateStack` and `LinkUpdateStack` both inherit from `UpdateStack< O >`, which implements `Feature< O >`. So these objects are features. They have however 0 feature projections, cannot be used in a feature color mode, and are not displayed in the data table. They are features because this makes it very convenient to:
- store them somewhere that makes sense in the application: they are declared in the `FeatureModel` and can be retrieved from it by consumers that need to play with the `FeatureModel` anyway;
- provide them as an input to feature computers, using `@Parameter private SpotUpdateStack stack;` as seen above; the feature computation service will handle them with no added logic;
- serialize them along with the other features, which is what we will see now: both have matching serializers that inherit from `UpdateStackSerializer< F, O >`.
We could give them meaningful projections. For instance - and that would be helpful for debugging - we could have one projection per update item in the stack that returns an `int` indicating whether a spot or link is marked as changed, as a neighbor of a change, or as not changed. But for now, they are stowaway features of the feature model.
Serialization of update stack objects.
Because they are features, we just need to provide an implementation of `FeatureSerializer` for them, which will be handled automatically by the feature serialization service described in the first section of this document. For convenience, there exists an abstract class `UpdateStackSerializer`:
It handles the serialization of any subclass. The `UpdateStack` has a single field to serialize: the stack of update items itself. So, for each update item, we serialize in order:
- the `FeatureSpec`s used as the key of the update item;
- two `RefSet`s, one for the objects directly changed and one for the object neighbors of direct changes.
A project saved with update stacks looks like this on disk:
Deserialization of update stack objects.
Deserialization happens in reverse, but because concrete serializers have to return a new instance of the right `UpdateStack` implementation, `UpdateStackSerializer` is abstract and instead offers a method that returns a new stack of update items, which can be used to instantiate the right class.
In the update items we use `FeatureSpec`s as keys. Notice that we do not serialize nor deserialize the true class of each `FeatureSpec`, but a generic one made of the right fields (key, info, multiplicity, …). Since incremental updates only use the `#equals()` method of `FeatureSpec`, which is based on some of its fields, this is OK. But because this is only true for incremental updates, the `FeatureSpec` serialization only has package visibility.
JUnit tests.
In `org.mastodon.feature.update` of `src/test/java`, there are two JUnit tests that serialize a model with pending changes for incremental feature computation and check for proper computation after reloading the model.