ome / omero-scripts

Core OMERO Scripts
https://pypi.org/project/omero-scripts/
12 stars 32 forks source link

batch export ROIs #139

Closed will-moore closed 6 years ago

will-moore commented 6 years ago

See https://trello.com/c/L6ue1oKe/115-rfe-roi-analysis

Provides ability to export ROI intensities from many images in a single csv file. To test:

cc @waxenegger @pwalczysko

screen shot 2018-01-30 at 23 23 51
jburel commented 6 years ago

Could you add an integration test for that new script? This will be automatically tested by travis

jburel commented 6 years ago

We need to unify the naming of columns otherwise it will become difficult for external to use any of the files for further analysis. iviewer followed what insight had in place One point of discussion for iviewer was the ability to open correctly in excel, we need to also review that https://trello.com/c/Vrn2uD21/1-csv-format-discussion

jburel commented 6 years ago

As mentioned on a previous PR, I am not a big fan of the print statement in various places of the scripts. If we want to introduce info for debugging let's use a logger with the ability to turn on/off those statements

jburel commented 6 years ago

Ideally iviewer should use that script when available (will first need to check in iviewer if the script is installed otherwise fall back on existing code) Area and length are also included in iviewer

will-moore commented 6 years ago

I'm not sure it makes sense for iviewer to use this script unless there are so many ROIs that the existing approach starts to take too long (many seconds). The overhead of starting the script, pinging for completion, creation of a File Annotation etc is not worth it in most cases, especially if you're only getting ROIs from a single image. Maybe need to decide if there's a reasonable cut-off (number of shapes) above which we need to use the script?

will-moore commented 6 years ago

I'm not sure at the moment how iviewer is calculating the area and length, but they're not coming from the server - probably ol3 geometry is providing them? Also, I wasn't sure what length means when applied to a Rectangle or Ellipse? Circumference maybe? Is that really useful? I thought about adding coordinates to the csv, e.g, x, y, width, height, radiusx, radiusy, x1, y1, x2, y2, would cover most except polygon & polyline.

jburel commented 6 years ago

area and length are calculated client side in iviewer. For iviewer using the script, I agree that we will need to check the overhead. This is more to centralise so we know that everything is in synch

jburel commented 6 years ago

@will-moore the script is currently tested on CI master.

jburel commented 6 years ago

workflows branch is old, probably b/c we did not tag every time we did not every time Let me push force push and updated version

will-moore commented 6 years ago

Hmmm - I think some messages crossed in the post there. I just rebased back to workflows but travis is failing quite badly...

waxenegger commented 6 years ago

@jburel iviewer is using the very same call (getShapeStatsRestricted). Why change? I'd have to handle a file annotation as a result, would have to iterate over it yet again and add area/length via shape lookup. Furthermore the script needs to take the selected shape ids as input as well.

jburel commented 6 years ago

Have you tested with script with large amount of ROIS/shapes?

jburel commented 6 years ago

Discussed today @will-moore will compare the output of the script with external tools e.g. ImageJ

will-moore commented 6 years ago

Comparison between ImageJ and this script shows good correlation between Rectangle but some differences with Ellipse, due to differences in deciding which points are included. Excell compared with ImageJ ROI manager:

screen shot 2018-02-01 at 13 45 18

will-moore commented 6 years ago

Exported ROIs from a Dataset of 51 images, with lots of ROIs and Shapes. Many shapes without Z/T set and "propagated" through all Z/T sections, resulting in a csv with 6098 rows. So it seems to handle lots of ROIs/Shapes OK.

joshmoore commented 6 years ago

@will-moore : re -k -- :+1: but this file is slated for deletion, so be sure to open it against the upstream as well.

will-moore commented 6 years ago

@josh Currently I'm writing and running tests locally using

$ cd scripts
$ python setup.py test -t test/integration -k test_batch_roi_export

Will this still work when setup.py is deleted? If not, that would be a real pain.

joshmoore commented 6 years ago

Will this still work when setup.py is deleted?

This repository will need to import it from upstream. See the discussion in https://github.com/openmicroscopy/openmicroscopy/pull/5636

will-moore commented 6 years ago

@joshmoore Sorry - I not sure how that discussion answers my question? Sebastien seems to be suggesting that things move down to e.g. -> scripts, but you're saying that setup.py is moving 'upstream'? So, will this look like:

$ cd scripts
$ cp .omero/setup.py ./
$ python setup.py test -t test/integration -k test_batch_roi_export
joshmoore commented 6 years ago

Sorry, I was unclear. Whatever repository that PyTest lives in, this repository will need to make it a dependency. So the class will be removed from setup.py but will still be importable.

pwalczysko commented 6 years ago

Bug: Created a couple of ROIs (using Insight, ROIs with and without propagation) on an image http://10.0.51.135:32775/web/webclient/?show=image-28 (login as user-4). Into the input box of the channels entered 1,2,3,4 I know that this is a valid input, because the script ran with this exact input on another image previously. But I got on this image

WARNING:omero.gateway:ApiUsageException on <class 'omero.gateway.OmeroGatewaySafeCallWrapper'> to <f061b02f-9a3c-4cff-b39b-4b468b1dd645omero.api.IRoi> getShapeStatsRestricted(([145L], 20L, 0L, []), {})
Traceback (most recent call last):
  File "/home/omero/workspace/OMERO-server/OMERO.server/lib/python/omero/gateway/__init__.py", line 4598, in __call__
    return self.f(*args, **kwargs)
  File "/home/omero/workspace/OMERO-server/OMERO.server/lib/python/omero_api_IRoi_ice.py", line 826, in getShapeStatsRestricted
    return _M_omero.api.IRoi._op_getShapeStatsRestricted.invoke(self, ((shapeIdList, zForUnattached, tForUnattached, channels), _ctx))
ApiUsageException: exception ::omero::ApiUsageException
{
    serverStackTrace = ome.conditions.ApiUsageException: Provide a non empty array of channels.
...

Full stack trace see team/pwalczysko/ROI-export

Edit: The problem was that my delimitor (should have been comma) was wrong when putting the channels in. This is strict - there is some help as question mark, but when you make a typo, you got no warning.

pwalczysko commented 6 years ago

Discussing with @bramalingam Re channel input

pwalczysko commented 6 years ago

So checking All planes just gives all planes results for the shapes which are not bound to a z, t. But actually, as a naive user, I think the gut feeling would be that it should give all planes irrespective of whether the shape is bound to z, t or not. Am I now expected to go into iviewer (note, I used insight, so I even cannot do it) and unset all the z, t on the planes if I want the values from all planes ? Especially when I do not have an idea about the unbound and bound subtleties, I will get really no chance to figure it out ?

jburel commented 6 years ago
pwalczysko commented 6 years ago

RFE: Attach the csv to the Dataset when Dataset is used as an input. (atm, the csv is attached to the last image (which in my case did not even have any ROIs on it)

pwalczysko commented 6 years ago

@jburel

This could become tricky if we allow all those options

I would not allow all options - just name of the channel.

The checkboxes approach will not really scale if you have a large number of them

But the present approach does not scale either, even worse, I would have to write out 1,2,3,...50 and not make a single typo.

Edit: @bramalingam suggests a dropdown where the all channels is first option would be a solution..

pwalczysko commented 6 years ago

So the script behaves as expected on ROIs drawn in insight just couple of RFEs or points to discuss

  1. https://github.com/ome/scripts/pull/139#issuecomment-364214317
  2. https://github.com/ome/scripts/pull/139#issuecomment-364215966
  3. https://github.com/ome/scripts/pull/139#issuecomment-364222091

I could not use iviewer, because it is not installed on workflows devspace.

pwalczysko commented 6 years ago

I do not see the test_batch_ROI_export in the last successful build of the devspace https://10.0.51.135:32768/job/OMERO-test-integration/12/testReport/OmeroPy.test.integration.scriptstest.test_export_scripts/TestExportScripts/ ?

jburel commented 6 years ago

@pwalczysko you won't, the tests are run by travis only

test/integration/test_export_scripts.py::TestExportScripts::test_batch_roi_export[True] PASSED [ 16%] test/integration/test_export_scripts.py::TestExportScripts::test_batch_roi_export[False] PASSED [ 20%]

jburel commented 6 years ago

@will-moore what is the status of the script? This will not be included in the 5.4.4 release

will-moore commented 6 years ago

Needs some discussion about options for choosing channels and which planes to export.

The "All Planes" option doesn't exist in iviewer itself, so we could just remove this and NOT export pixel intensities for shapes without Z/T set, as iviewer does? This would only be an issue if e.g. you wanted to get a FRAP (intensity over Time) plot from a single shape. Using Insight, this would still be possible by propagating shapes across Z/Time which we should try to support in iviewer soon too.

I think that entering channel Names is going to be even more error-prone than entering indexes. Even tiny typos etc would cause problems. If the default value was 1,2,3,4 then this would make it clearer to the user the expected format, and if they changed nothing then up to 4 channels would be exported. However, it's tricky for the script to specify this as a default when using the scripts.List parameter because we can't set default as "1, 2, 3, 4" or even [1, 2, 3, 4] since we get this error on upload

  File "/Users/wmoore/Desktop/OMERO/openmicroscopy/dist/lib/python/omero/scripts.py", line 273, in ofType
    % (unwrap(obj), unwrap(self.prototype.val[0])))
ValueError: ofType values doesn't match default value: 0 <> 1

But this could be achieved by using a scripts.String parameter instead of a list, but then we don't get the (i) hint in the UI to "Add items to list, separated by ,"

pwalczysko commented 6 years ago

However, it's tricky for the script to specify this as a default when using the scripts.List parameter because we can't set default as "1, 2, 3, 4" or even [1, 2, 3, 4] since we get this error on upload

File "/Users/wmoore/Desktop/OMERO/openmicroscopy/dist/lib/python/omero/scripts.py", line 273, in ofType % (unwrap(obj), unwrap(self.prototype.val[0]))) ValueError: ofType values doesn't match default value: 0 <> 1

But this could be achieved by using a scripts.String parameter instead of a list, but then we don't get the (i) hint in the UI to "Add items to list, separated by ,"

Couldn't we somehow get both the 1,2,3,4 as default as well as the info hint ?

joshmoore commented 6 years ago

I've probably missed a few steps here, but I would expect that a given list could be used as the default based on the slice. There might be a limitation with the wrapper library which might be a fairly simple issue to either fix or workaround.

will-moore commented 6 years ago

I'll remove b78401b92d60b3ab03d6940e8b0e94d39dd51e69 from the git history since we don't need it for this PR, but might be useful in future for adding a summary of ROI intensities for an image as a Map Annotation on each Image.

will-moore commented 6 years ago

So, it turns out to be my mistake, using integers in the default [1,2,3,4] when I should have used [1L, 2L, 3L, 4L]

will-moore commented 6 years ago

However, this needs a fix in webclient to be displayed correctly: https://github.com/openmicroscopy/openmicroscopy/pull/5676

will-moore commented 6 years ago

In order to test on web-dev-merge, I'll rebase this against develop again...

rgozim commented 6 years ago

Tested working on web-dev-merge.

pwalczysko commented 6 years ago

The script still attaches the result just to the last image in the dataset. Even though the Dataset was selected. I do not find it fully justified. The last image in my dataset does not even have any ROIs.

Edit: The script attaches the csv to all images in the whole dataset, whether they have ROIs or not. I would appreciate if the csv could be attached to the Dataset, if the Dataset was passed to the script.

pwalczysko commented 6 years ago

screen shot 2018-04-27 at 14 21 14

The script does not respect the association with Channels. This means, that although my ROIs are associated with channels, the script will export the results for all ROIs on all channels. That is not justified. Strangely, iviewer does the same thing, but I never liked it. The values for a particular T and Z are exported here only for those ROIs which are associated with this T and Z. Why are channels different ? Although I have selected all channels, I should get only appropriate ROIs for appropriate channels, not All ROIs for All Channels.

pwalczysko commented 6 years ago

The script successfully crunched a dataset with approx 10000 ROIs. The Script dialog is largely improved, no need to type 1,2.. and the tooltip is present too.

The typos indicated by @jburel and @mtbc seem to be fixed too.

The only two issues left are

  1. https://github.com/ome/scripts/pull/139#issuecomment-384967301 (attach csv to Dataset)
  2. https://github.com/ome/scripts/pull/139#issuecomment-384967301 (respect ROI association with Channels)
mtbc commented 6 years ago

Out of curiosity, did you test from idr-experimental? I have been trying to figure out what a long-running transaction could have been that did a bunch of stuff like,

select rois0_.image as image60_1_, rois0_.id as
 id1_, rois0_.id as id126_0_, rois0_.description as descript2_126_0_, rois0_.cre
ation_id as creation6_126_0_, rois0_.external_id as external7_126_0_, rois0_.gro
up_id as group8_126_0_, rois0_.owner_id as owner9_126_0_, rois0_.permissions as 
permissi3_126_0_, rois0_.update_id as update10_126_0_, rois0_.image as image126_
0_, rois0_.name as name126_0_, rois0_.source as source126_0_, rois0_.version as 
version126_0_ from roi rois0_ where rois0_.image=$1

(And since 14:31:29 there's another.)

pwalczysko commented 6 years ago

Out of curiosity, did you test from idr-experimental?

No, I did not.

mtbc commented 6 years ago

Ah, thanks, sorry for the noise then, something else must be doing long-running ROI-related query sessions!

will-moore commented 6 years ago

@pwalczysko The other day I was thinking of the use case where e.g. I want to know how much of Channel 2 intensity is inside the Nucleus (Channel 1 ROI) so I'd want to be able to export intensities for any / all channels for a shape, regardless of which channel it is linked to. At the time I couldn't remember whether this script or iviewer would support that use case but it seems that it currently does.

I'll link the csv to Dataset if it was run on a Dataset....

pwalczysko commented 6 years ago

I'll link the csv to Dataset if it was run on a Dataset....

This works now.

After a conversation with @will-moore regarding the prep/usage of this script for User's meeting, it seems that there will be some more changes incoming here:

Happy for this to be merged and the above points to be done in a follow-up PR

jburel commented 6 years ago

From last week discussion, the script for the meeting will go under https://github.com/ome/training-scripts We can later on decide to "promote" it to this repo with various options (for meeting the target is map annotations for the script)