ome / omero-scripts

Core OMERO Scripts
https://pypi.org/project/omero-scripts/

Batch_Image_Export supports SPW + Project #174

Closed will-moore closed 4 years ago

will-moore commented 4 years ago

Similar to #156, this adds SPW (and Project) support to Batch_Image_Export.py. I needed this recently to export a lot of images from a local copy of idr0002 (1 plate at a time).

To test:

manics commented 4 years ago

Is there a way to limit the filesize used by tempfiles? https://github.com/ome/omero-web/issues/118 is an example of the sort of problems we're already running into on production.

will-moore commented 4 years ago

I don't know about any file size limit coming from Django (couldn't see anything obvious with a quick search) but we should be able to teach our code to limit the size of file we attempt to write. But this script code doesn't try to create a temp file on the web server (like you're seeing at https://github.com/ome/omero-web/issues/118).

jburel commented 4 years ago

I have selected 2 datasets (user-1), one with 2252 images and one with 4000 images. The changes are potentially dangerous, since one could go to the web UI, select all the projects or screens, and attempt a massive export, potentially bringing the server to its knees.

jburel commented 4 years ago

[Screenshot 2020-03-31 at 12 55 00]

will-moore commented 4 years ago

So, what do you suggest? As you showed above, it's already possible to select "too much" data without any of the changes in this PR.

The SPW support I've added here is simply some code that I found useful, when preparing some test data (e.g. export from idr0002 then re-import to create https://merge-ci.openmicroscopy.org/web/webclient/?show=plate-16105 user-3) so better to merge it in than to keep a local copy?

jburel commented 4 years ago

I agree that it is dangerous as it is. I do not dispute that it can be useful in some cases, but allowing Project, Screen and Plate makes the problem even more obvious. The limit will certainly apply when only images are selected. I need to look at the code in more detail to see what a suitable limit could be.

pwalczysko commented 4 years ago

Tested this script on merge-ci on

Exported ome-tiffs. It is fairly slow, but I suppose this is to be expected. Works fine. I guess the discussion here is going to focus on the limitation of the server overload danger though.

joshmoore commented 4 years ago

Are there any simple checks that we could put into place?

will-moore commented 4 years ago

You could set a maximum number of images you're allowed to export. Since all the containers (Project, Dataset, Screen, Plate) result in a list of images, you could just truncate the list at some cutoff. The only question is what the cutoff should be and how it is configured. It seems overkill to add a server setting for this use-case. Could test with a bunch of numbers and pick something that works for us, e.g. 1000 images?
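
For illustration, the truncation idea above could look roughly like the sketch below. It is not code from this PR: `MAX_IMAGES` and `truncate_images` are hypothetical names, the value 1000 is just the example figure mentioned above, and the input is assumed to be the flat list of images already collected from the selected containers.

```python
# Hypothetical cutoff; 1000 is only the example value floated in the discussion.
MAX_IMAGES = 1000


def truncate_images(images, max_images=MAX_IMAGES):
    """Keep at most max_images items and report how many were dropped.

    `images` is assumed to be the flat list of images gathered from the
    selected Projects, Datasets, Screens or Plates.
    """
    images = list(images)
    kept = images[:max_images]
    dropped = len(images) - len(kept)
    return kept, dropped


if __name__ == "__main__":
    kept, dropped = truncate_images(range(6252))
    # A real script would probably surface this in its output message.
    print(f"Exporting {len(kept)} images, skipping {dropped}")
```

Returning the dropped count lets the script tell the user that the export was cut short rather than silently ignoring part of the selection.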

jburel commented 4 years ago

The number of images is arbitrary. If the user selects 1000 "fish", we are in trouble.
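
One reading of this objection is that image count is a poor proxy for data volume, so a limit on estimated pixel data might be closer to what is needed. The sketch below illustrates that under assumptions: `ImageDims`, `estimated_bytes` and `select_within_budget` are hypothetical names, the dimensions are assumed to be available from image metadata before any export starts, and the estimate covers raw uncompressed pixels only, not the size of the exported files.

```python
from dataclasses import dataclass


@dataclass
class ImageDims:
    """Hypothetical stand-in for the dimensions read from image metadata."""
    size_x: int
    size_y: int
    size_z: int = 1
    size_c: int = 1
    size_t: int = 1
    bytes_per_pixel: int = 2  # e.g. 16-bit pixels


def estimated_bytes(dims):
    """Raw pixel volume of one image, ignoring compression and metadata."""
    return (dims.size_x * dims.size_y * dims.size_z
            * dims.size_c * dims.size_t * dims.bytes_per_pixel)


def select_within_budget(images, max_bytes):
    """Take images in order until the next one would exceed max_bytes."""
    selected = []
    total = 0
    for dims in images:
        size = estimated_bytes(dims)
        if total + size > max_bytes:
            break
        selected.append(dims)
        total += size
    return selected, total


if __name__ == "__main__":
    small = ImageDims(512, 512)
    big = ImageDims(4096, 4096, size_z=50, size_c=3)
    selected, total = select_within_budget([small, small, big, small], 1 << 30)
    print(len(selected), total)
```

Stopping at the first image that does not fit keeps the behaviour predictable; skipping oversized images and continuing would be an alternative, and the budget value itself would still have to be chosen somehow.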

will-moore commented 4 years ago

Closing for now since there's no easy solution and this was only needed for a one-off prep of dummy data for parade demo.