Open jburel opened 4 years ago
omero-marshal
we load a lot of extra JSON object.details that we don't need. I started looking at optimised querying of OMERO via projection to improve performance, loading only the bare minimum values.
For example, loading 1104 ROIs (11169 Points) from a local copy of this Image: http://idr.openmicroscopy.org/webclient/?show=image-9846159 using the webgateway/get_rois_json/ in the old viewer took about 4 seconds for the query using the ROI Service.
Using the query service to load minimal coordinates via a projection reduced this to about 1 second:
query = """
    select shape.id,
           shape.roi.id,
           shape.x,
           shape.y,
           shape.theZ,
           shape.theT
    from Shape shape
    join shape.roi roi
    where roi.image.id=:id"""
params = omero.sys.ParametersI()
params.addId(imageId)
points = conn.getQueryService().projection(query, params, conn.SERVICE_OPTS)
rois = {}
for p in points:
    [id, roi_id, x, y, z, t] = p
    roi_id = str(unwrap(roi_id))
    if roi_id not in rois:
        rois[roi_id] = []
    rois[roi_id].append({
        'id': unwrap(id),
        'x': unwrap(x),
        'y': unwrap(y),
        'z': unwrap(z),
        't': unwrap(t)})
return {'rois': rois}
Looking at reducing the size of JSON that omero_marshal provides:
The full 1104 ROIs with Shapes for image above is 8.3 MB of JSON at /api/v0/m/rois/?image=2822&limit=5000
If we don't marshal the details of each object (comment out these lines from the base Encoder class)
        # if hasattr(obj, 'details') and obj.details is not None:
        #     encoder = self.ctx.get_encoder(obj.details.__class__)
        #     v['omero:details'] = encoder.encode(obj.details)
then the JSON goes down to 765 kB and the time to load (testing on a local dev server) goes from about 4 (or 5) seconds to about 3 seconds.
That seems a small difference in time for such a large difference in size. It would be good to check against a remote server.
I haven't looked into implementing it, but my suggestion would be that a cache is kept of the objects which have been encoded (minimally those that are in the details) and that only the first instance is encoded, while all others are referred to by @id. That of course is likely independent of the speed issue.
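A minimal sketch of the "encode once, then refer by @id" idea, written as a post-processing pass over already-marshalled JSON rather than as a change to omero-marshal itself. The function name `dedupe_by_id` is hypothetical, and the sketch assumes that repeated objects carry both `@type` and `@id` keys; objects without an `@id` are always copied in full.

```python
def dedupe_by_id(node, seen=None):
    """Return a copy of `node` in which repeated objects become @id references.

    Hypothetical helper: the first dict seen for a given (@type, @id) pair is
    kept in full, later ones are reduced to {"@type": ..., "@id": ...}.
    """
    if seen is None:
        seen = set()
    if isinstance(node, list):
        return [dedupe_by_id(item, seen) for item in node]
    if not isinstance(node, dict):
        return node
    key = (node.get("@type"), node.get("@id"))
    if None not in key:
        if key in seen:
            # Already encoded once: emit only a reference.
            return {"@type": key[0], "@id": key[1]}
        seen.add(key)
    # Objects without an @id (or seen for the first time) are copied in full.
    return {k: dedupe_by_id(v, seen) for k, v in node.items()}


# Usage with made-up data: two shapes sharing the same owner.
owner = {"@type": "Experimenter", "@id": 2, "UserName": "user-1"}
shapes = [
    {"@type": "Point", "@id": i, "omero:details": {"owner": dict(owner)}}
    for i in (1, 2)
]
deduped = dedupe_by_id(shapes)
```

In this example the second shape's owner shrinks to a reference, while the `omero:details` wrapper itself is untouched because it has no `@id`.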
Using the projection query above, and marshalling similar to omero_marshal (without details) results in 1.7 MB of JSON in just over 1 second and allows iviewer to display all 1104 ROIs but they are not 'working/interactive'. Need to work out what omero:details iviewer needs.
Using the projection query above and adding 'fake' details (see https://github.com/will-moore/omero-web/commit/5506e7f045119637202d3c4c5fa7e9927fee0cb2) allows full display of ROIs, loading 5.2 MB of JSON in 1.35 secs.
So, we're not saving a huge amount of JSON data and marshalling, but we're saving quite a bit of time loading Points coordinates via projections instead of loading full Shape objects.
Loading permissions via the projection call (using map query) slows the call down to around 9.5 seconds. See https://github.com/will-moore/omero-web/commit/8b71e941991a327f0fc8e42c6407864a39d0ccfa. This is even slower than loading whole Shape objects, so we need an alternative for loading permissions.
@joshmoore The problem with caching encoded omero:details is that the details and permissions don't have IDs.
e.g. see https://docs.openmicroscopy.org/omero/5.6.0/developers/json-api.html#list-projects
But the limiting factor currently seems to be the query itself, not the overall size of JSON, although marshalling might be part of the problem.
I wonder if we want to treat the 'browsing analysis results' with many ROIs that we don't want to edit as a different use-case from 'manually editing ROIs' where the numbers loaded are small and we care about permissions of each one?
If we can just say "Over 1000 ROIs, everything is read-only" and then we can avoid loading permissions?
> If we can just say "Over 1000 ROIs, everything is read-only" and then we can avoid loading permissions?
Running tools like TrackMate or ilastik on the type of data I am looking at will easily generate that many ROIs, so we cannot assume that we will be in a read-only mode.
> So, we're not saving a huge amount of JSON data and marshalling, but we're saving quite a bit of time loading Points coordinates via projections instead of loading full Shape objects.
What about other shapes? For TrackMate I use ellipses so that we can preserve the shapes between TrackMate and OME.
But do you want to edit your TrackMate ROIs in iviewer? What about the IDR use-case?
Maybe an alternative is to load the owner.id of each Shape and then calculate the permissions based on group perms?
I'll give that a try.
I expect I can easily load radiusX and radiusY for ellipses, or other coordinates for rectangle, line, arrow without much performance hit. Not so sure about points for polygon/polyline.
I'm also not sure how to know what type of shape it is with projection queries. It's easy enough to work out most types based on the coordinates you have, but I don't know about polygon vs polyline.
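To illustrate working out the type from the coordinates you have, here is a rough sketch. It assumes each projected row has already been unwrapped into a dict keyed by column name (x, y, radiusX, radiusY, width, height, x1, y1, x2, y2, points); the function name `infer_shape_type` is hypothetical. As noted above, polygon and polyline cannot be told apart this way, and other shape types may share the same columns, so this is only a heuristic.

```python
def infer_shape_type(row):
    """Guess a shape type from which projected columns are non-null.

    `row` is a plain dict of unwrapped values (hypothetical layout).
    Returns a type name, or None if no rule matches.
    """
    def has(*names):
        return all(row.get(name) is not None for name in names)

    if has("points"):
        # Both types store a points string; coordinates alone can't decide.
        return "Polygon or Polyline"
    if has("x1", "y1", "x2", "y2"):
        return "Line"
    if has("x", "y", "radiusX", "radiusY"):
        return "Ellipse"
    if has("x", "y", "width", "height"):
        return "Rectangle"
    if has("x", "y"):
        return "Point"
    return None


# Usage with made-up rows.
ellipse_type = infer_shape_type({"x": 10.0, "y": 5.0, "radiusX": 2.0, "radiusY": 3.0})
point_type = infer_shape_type({"x": 10.0, "y": 5.0, "radiusX": None})
```

The order of the checks matters: the more specific column sets are tested before the bare x/y case that would otherwise match everything.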
I can see us deleting the ROIs that are false positive for example. IDR is a read-only server. Let's not confuse the two.
OK, https://github.com/will-moore/omero-web/commit/113c96b9902998908b19ed8fc6657cad51d4eba8 explores the possibility of inferring permissions from a shape's owner and group. This doesn't take into account whether the user is an admin or group owner (could be added) but it should allow editing of Shapes. Will look at support for other shapes...
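For discussion, the inference could look roughly like the sketch below. This is not the code from the commit above; the function name `can_edit` and its arguments are hypothetical. It assumes the group permissions are available as the usual six-character string (e.g. "rwr---" for read-only, "rwrw--" for read-write) and that only a read-write group lets members edit each other's data. The admin and group-owner flags are the "could be added" part.

```python
def can_edit(owner_id, user_id, group_perms, is_admin=False, is_group_owner=False):
    """Infer whether `user_id` may edit a shape owned by `owner_id`.

    Simplified, assumed rules (not the server's full permission logic):
    - admins and group owners can edit everything in the group
    - users can always edit their own shapes
    - other members can edit only if the group-write flag is set
    """
    if is_admin or is_group_owner:
        return True
    if owner_id == user_id:
        return True
    # Index 3 of e.g. "rwrw--" is the group-write flag.
    return len(group_perms) > 3 and group_perms[3] == "w"


# Usage with made-up IDs.
own_shape = can_edit(owner_id=2, user_id=2, group_perms="rwr---")
other_read_only = can_edit(owner_id=2, user_id=3, group_perms="rwr---")
other_read_write = can_edit(owner_id=2, user_id=3, group_perms="rwrw--")
```

Since this only needs the owner id of each Shape plus one permissions string per group, it avoids loading permissions per object, at the cost of possibly diverging from what the server would actually allow.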
Ah - I realised above (https://github.com/ome/omero-iviewer/issues/335#issuecomment-680100167) when testing how fast the JSON API loads all 1104 ROIs & Shapes with /api/v0/m/rois/?image=2822&limit=5000 that we were only loading 500 ROIs because of the omero.web.api.max_limit. Having set this to 5000, I see it takes:
Without any details (commenting out of omero-marshal as above)
Loading whole Shape objects but doing a 'lite' marshal (with simple inferring of perms as before) takes:
Loading whole Shape objects, using omero-marshal to marshal all except omero:details (inferred by user-id and group-id as before - https://github.com/will-moore/omero-web/commit/5a8e4928ddb9e92f06e1c3b7e3de317b2479f045) takes:
@jburel It's probably time we discussed this investigation so far and decided on what direction to take (if any). There seem to be some slight improvements we can make in loading speed, with potentially less-accurate permissions if we calculate them client-side (could be improved with more logic). If we can load more ROIs up-front (instead of paging) then the experience is better (as long as they don't take too long to load at the start). So we could consider raising the "do-pagination" threshold. Another option we've not explored yet is loading by pagination, but proactively loading all the pages at the start (up to some limit). That could improve the time to showing something, while still ending up with all ROIs in hand.
I guess, to decide on anything we need a good test suite of different sizes of images with different numbers of ROIs & Shapes. Maybe just deciding on those numbers and how we're going to create them is a good next step?
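A small sketch of the "paginate, but proactively load all pages up to some limit" option mentioned above. The names `iter_pages` and `fetch_page` are hypothetical: `fetch_page(offset, limit)` stands in for whatever call returns one page of ROIs, and the default page size and cap are only illustrative. Yielding each page as it arrives lets the caller display the first page immediately while the rest keep loading.

```python
def iter_pages(fetch_page, page_size=500, max_items=5000):
    """Yield pages of ROIs until the data runs out or max_items is reached.

    `fetch_page(offset, limit)` is a caller-supplied function (hypothetical)
    returning a list of at most `limit` items starting at `offset`.
    """
    loaded = 0
    offset = 0
    while loaded < max_items:
        # Never request more than the remaining allowance.
        page = fetch_page(offset, min(page_size, max_items - loaded))
        if page:
            yield page
        loaded += len(page)
        if len(page) < page_size:
            # Short page: either the last page or the cap was reached.
            break
        offset += len(page)


# Usage with a fake in-memory data source of 1104 items.
fake_rois = list(range(1104))

def fake_fetch(offset, limit):
    return fake_rois[offset:offset + limit]

page_sizes = [len(page) for page in iter_pages(fake_fetch)]
capped = [roi for page in iter_pages(fake_fetch, max_items=600) for roi in page]
```

With real requests the same loop would apply, but each `fetch_page` call would then be bounded by the server-side maximum page size as well.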
Agree we need to discuss the evolution of the ROI section(s). I have other use cases I want to discuss too.
Improvement discussed on 03/09 with @joshmoore @will-moore @sbesson @pwalczysko.
Look at permissions loading in omero-marshal
This issue has been mentioned on Image.sc Forum. There might be relevant details there:
https://forum.image.sc/t/no-mask-found-error-in-omero-rois/49746/5
The tool is totally unusable when an image has a large number of ROIs with several shapes.
Basically there is no way to follow one ROI over time when playing the movie. cc @pwalczysko @will-moore