ome / omero-blitz

Gradle project containing Ice remoting code for OMERO
https://www.openmicroscopy.org/omero
GNU General Public License v2.0
0 stars 15 forks source link

Have DuplicateI able to handle underlying files #100

Closed mtbc closed 3 years ago

mtbc commented 3 years ago

In addressing https://github.com/ome/design/issues/20 extend DuplicateI so that it can now handle various underlying files, allowing fuller duplication of images among other things.

Functional testing:

Code review: I aimed for minimal changes to the repository implementations. A larger review of API and its usage could follow.

Subsequent PRs could include configurability for:

pwalczysko commented 3 years ago

Bug

(env562) pwalczysko@ls31618~$ omero login -u user-2
Server: [localhost:4064]
Password:
Created session for user-2@localhost:4064. Idle timeout: 10 min. Current group: private-2
(env562) pwalczysko@ls31618~$ 
(env562) pwalczysko@ls31618~$ 
(env562) pwalczysko@ls31618~$ omero duplicate Dataset:1
Using session for user-2@localhost:4064. Idle timeout: 10 min. Current group: private-2
omero.cmd.Duplicate Dataset:1 failed: 'graph-fail'
    message=Currently logged into group 1. Cannot alter object in group 3
    stacktrace=ome.conditions.SecurityViolation: Currently logged into group 1. Cannot alter object in group 3
    at ome.security.basic.OmeroInterceptor.managedGroup(OmeroInterceptor.java:1080)

Possibly just a problem of the cli plugin ? I guess this will be exposed though as the feature is definitely going to be used in cross-group contexts.

Edit: No, this is a bug. I cannot work around this easily. Note that the group with id 1 is the user group in omero. I cannot log in to a better group to "fix" this and thus it seems to be a blocker.

Edit 2: Excerpt from Blitz log

``` 2020-09-01 16:58:27,476 INFO [ ome.services.util.ServiceHandler] (l.Server-8) Meth: interface ome.api.JobHandle.setStatus 2020-09-01 16:58:27,476 INFO [ ome.services.util.ServiceHandler] (l.Server-8) Args: [Error] 2020-09-01 16:58:27,477 INFO [ ome.security.basic.BasicEventContext] (l.Server-8) cctx: group=0 2020-09-01 16:58:27,482 INFO [ ome.security.basic.EventHandler] (l.Server-8) Auth: user=0,group=0,event=10055(User),sess=55ecabb2-744a-465a-8c70-73967a9246c0 2020-09-01 16:58:27,485 INFO [ ome.security.basic.CurrentDetails] (l.Server-8) Adding log:UPDATE,class ome.model.jobs.ParseJob,868 2020-09-01 16:58:27,489 INFO [ org.perf4j.TimingLogger] (l.Server-8) start[1598975907476] time[12] tag[omero.call.success.ome.services.JobBean.setStatus] 2020-09-01 16:58:27,489 INFO [ ome.services.util.ServiceHandler] (l.Server-8) Rslt: Running 2020-09-01 16:58:27,490 INFO [ ome.services.util.ServiceHandler] (l.Server-6) Meth: interface ome.api.StatefulServiceInterface.close 2020-09-01 16:58:27,490 INFO [ ome.services.util.ServiceHandler] (l.Server-6) Args: () 2020-09-01 16:58:27,490 INFO [ ome.security.basic.BasicEventContext] (l.Server-6) cctx: group=0 2020-09-01 16:58:27,491 INFO [ ome.security.basic.EventHandler] (l.Server-6) Auth: user=0,group=0,event=null(User),sess=55ecabb2-744a-465a-8c70-73967a9246c0 2020-09-01 16:58:27,491 INFO [ org.perf4j.TimingLogger] (l.Server-6) start[1598975907490] time[1] tag[omero.call.success.ome.services.JobBean.close] 2020-09-01 16:58:27,491 INFO [ ome.services.util.ServiceHandler] (l.Server-6) Rslt: null ```
mtbc commented 3 years ago

Does it work if you first log into that group with -g?

pwalczysko commented 3 years ago

Does it work if you first log into that group with -g?

No, please see my edit of https://github.com/ome/omero-blitz/pull/100#issuecomment-684973243

mtbc commented 3 years ago

Very odd. I wonder why I never ran into this locally. I'll investigate. Could you provide the full stack trace around the error from Blitz.log, ideally something mentioning DuplicateI somewhere!

pwalczysko commented 3 years ago

Following workflow circumvented the error in https://github.com/ome/omero-blitz/pull/100#issuecomment-684973243

  1. Use OMERO.insight to import as user-2 into the group private-2 (the user and group which ran into problems in comment https://github.com/ome/omero-blitz/pull/100#issuecomment-684973243)
  2. After a successful import, try duplicating the freshly imported Dataset, see below -success
(env562) pwalczysko@ls31618~$ omero duplicate Dataset:303
Previously logged in to localhost:4064 as user-2
Server: [localhost:4064]
Username: [user-2]
Password:
Created session for user-2@localhost:4064. Idle timeout: 10 min. Current group: private-2
omero.cmd.Duplicate Dataset:303 ok
(env562) pwalczysko@ls31618~$ 

This is not fixing the problem, but gives some insights hopefully.

Edit:

When trying to duplicate the "old" dataset (id 1, as in comment https://github.com/ome/omero-blitz/pull/100#issuecomment-684973243 , I get the same error as in comment https://github.com/ome/omero-blitz/pull/100#issuecomment-684973243 - thus, the import fixed only the freshly imported dataset behaviour, not the old dataset duplication

mtbc commented 3 years ago

Interesting question is why Dataset:1 fails (apparently it still does) but Dataset:303 works. I hadn't seen that error before.

mtbc commented 3 years ago

Tomorrow I'll make an artifact with #98 included to try to get more information.

Update: Done.

mtbc commented 3 years ago

It looks as though the user group is returned by CurrentDetails.getGroup() at that moment.

pwalczysko commented 3 years ago

The problem with the Dataset:1 duplication and error reporting the group to be wrong https://github.com/ome/omero-blitz/pull/100#issuecomment-684973243 is fixed now.

pwalczysko commented 3 years ago

For now, tested duplication of P/D/I in one-owner hierarchies. All gets duplicated, also ROIs and attachments. The images when deleted, do not influence the copies, both when the original is deleted or the copy.

Brilliiant.

Will try to test more complex setups like

tomorrow.

pwalczysko commented 3 years ago

User has to be logged into the correct group to attempt a duplicate using the CLI plugin. Is that a feature of the CLI plugin or of DuplicateI ?

RA group:

Simple members

RO group:

Simple members

P group

Simple members

Subsequent moves of the duplicated objects into another group by duplicators was successful as expected.

Do not see a mistake in that.

mtbc commented 3 years ago

User has to be logged into the correct group ...

Very probably DuplicateI has always been like this, have now noted the issue in #103, thank you.

Checking the DB and filesystem:

If you use the web client right-pane for checking where in the managed repository an image's files are, you should see from there that the duplicate is not where the original is. But if you do du -s /OMERO/ManagedRepository/ you should find that the space used does not greatly increase from before to after duplication even for large files (at least in comparison to reimporting) and that an ls -l on one of the original's files has the first number, after the permissions, increment as a result of the duplication. (The number counts hard links, kind of.)

Whereas, you should see du -s /OMERO/Files/ grow from duplicating attachments or whatever by roughly the size of the attachments.

In psql compare the output of SELECT path, name, repo FROM pixels WHERE image = 12345; (some image ID) for the original and the duplicate to make sure the path looks the same except for being in a new managed repository folder that corresponds to the change you saw in web above.

pwalczysko commented 3 years ago
(env562) pwalczysko@ls31618~$ omero duplicate Image:562,563 --report
Using session for user-3@localhost:4064. Idle timeout: 10 min. Current group: read-annotate-1
omero.cmd.Duplicate Image:562,563 failed: 'graph-fail'
failed: within Fileset[508] may not duplicate Image[562] without Image[560] also
Steps: 14
Elapsed time: 0.394 secs.
Flags: [FAILURE, CANCELLED]

Duplicating a MIF - probably also precedes this PR - it would be nice to have a "duplicate that MIF, pick the images as necessary" option (kind of Move All, as implemented for MiFs in web). The workflow is tedious also because it does not tell you all the images necessary, instead leaves you to re-run the command as many times as there are images inside the MIF, always reporting the failure on the next subsequent image only, it is up to the user to add them all into one single command.

pwalczysko commented 3 years ago

I can see that the duplicates are in folders near to the original, but with subfolder with naming derived by date, such as.

(base) pwalczysko@ls31618~/var/561/ManagedRepository/user-3_4/2020-09/03/17-48-43.124$ ls -lah
total 4216
drwxr-xr-x   4 pwalczysko  1133848969   128B Sep  3 17:48 .
drwxr-xr-x  22 pwalczysko  1133848969   704B Sep  3 18:28 ..
-rw-r--r--   2 pwalczysko  1133848969   2.1M Sep  3 17:48 F8wk_8291_F1.vsi
drwxr-xr-x   5 pwalczysko  1133848969   160B Sep  3 17:48 _F8wk_8291_F1_

for original and

(base) pwalczysko@ls31618~/var/561/ManagedRepository/user-3_4/2020-09/04/12-07-10.972$ ls -lah
total 4216
drwxr-xr-x  4 pwalczysko  1133848969   128B Sep  4 12:07 .
drwxr-xr-x  4 pwalczysko  1133848969   128B Sep  4 12:07 ..
-rw-r--r--  2 pwalczysko  1133848969   2.1M Sep  3 17:48 F8wk_8291_F1.vsi
drwxr-xr-x  5 pwalczysko  1133848969   160B Sep  4 12:07 _F8wk_8291_F1_

for duplicate.

Indeed, the number of hard links increased by one after the duplication. I could not detect any increase in the size of ManagedRepo at all.

(base) pwalczysko@ls31618~$ du -s ~/var/561/ManagedRepository/
2600096 /Users/pwalczysko/var/561/ManagedRepository/
(base) pwalczysko@ls31618~$ du -s -h ~/var/561/ManagedRepository/
1.2G    /Users/pwalczysko/var/561/ManagedRepository/
(base) pwalczysko@ls31618~$ du -s -h ~/var/561/ManagedRepository/
1.2G    /Users/pwalczysko/var/561/ManagedRepository/
(base) pwalczysko@ls31618~$ du -s ~/var/561/ManagedRepository/
2600096 /Users/pwalczysko/var/561/ManagedRepository/
(base) pwalczysko@ls31618~$ 
pwalczysko commented 3 years ago
561=# SELECT path, name, repo FROM pixels WHERE image = 562; 
               path               |       name       |                 repo                 
----------------------------------+------------------+--------------------------------------
 user-3_4/2020-09/03/17-48-43.124 | F8wk_8291_F1.vsi | 65d22ab6-4e09-4269-85cc-56fc9a963a07
(1 row)

for original

561=# SELECT path, name, repo FROM pixels WHERE image = 592; 
               path               |       name       |                 repo                 
----------------------------------+------------------+--------------------------------------
 user-3_4/2020-09/04/12-07-10.972 | F8wk_8291_F1.vsi | 65d22ab6-4e09-4269-85cc-56fc9a963a07
(1 row)

for duplicate

These paths are fairly similar, as indicated above. They match the report of the webclient and also they match what I found in ManagedRepo in https://github.com/ome/omero-blitz/pull/100#issuecomment-687087858

mtbc commented 3 years ago

failed: within Fileset[508] may not duplicate Image[562] without Image[560] also

The workflow is tedious also because it does not tell you all the images necessary, instead leaves you to re-run the command as many times as there are images inside the MIF, always reporting the failure on the next subsequent image only, it is up to the user to add them all into one single command.

It should work to duplicate Fileset[508] instead of trying to list all the images separately.

pwalczysko commented 3 years ago

It should work to duplicate Fileset[508] instead of trying to list all the images separately.

Great, thank you. I think an obvious RFE (for the CLI client ?) would be to have the hint you just gave me in the output of the error - do you have an idea where should I create an issue or what needs to be changed in code to get such helpful hint ?

mtbc commented 3 years ago

This is a new aspect of the broader https://trello.com/c/F4cAE0k3/358-graphexception-subclasses - between the CLI, QA and UX boards, Trello captures many of the user-facing issues with graphs behavior.

(Basically, far better to use the type system to communicate helpful information in a structured way that clients can easily interpret. Faffing with these textual messages generated from deep in the server is never going to be the route to happy users; it's a bug if they have to see them at all because they're not being shown anything better!)

pwalczysko commented 3 years ago

Duplicating in-place images principally works as expected.

  1. Duplicate of one image with ca 2000 ROIs returns within 4 minutes on merge-ci. The new image is in expected place and is highlighted as being in-place imported. The paths on server is user-3_454/pool-1-thread-4/2020-09/09/09-12-13.745/B4_C1.tif
  2. Duplicate of a dataset with 3 images and 20 ROIs returns within less than 10 sec. The Dataset is at expected place, the ROIs are okay and owned by duplicator (user-3)
  3. Duplicate of a whole Dataset. with some 35000 ROIs on 15 images did not return overnight and failed probably because of lost connection.
  4. Duplicate of two images with some 5400 ROIs between them returns within some 10-20 minutes on merge-ci
pwalczysko commented 3 years ago

Considering Ad 3 above, some option of --exclude rois on the duplicate command might come handy in these situations, for reducing the waiting times. I suppose though this can be done using the --exclude annotations flag though, but obviously this would be another case for its usage.

mtbc commented 3 years ago

Suggested further testing:

  1. Retest independence (moving, deleting, etc.) of original and duplicate images and attachments.
  2. Retest path and name from originalfile and pixels.
  3. See that copied files (attachments, tables, thumbnails, etc.) are now reported at INFO level in the Blitz log, e.g.,

    file bytes copied: 2,512 into FILES, 2,740 into THUMBS

pwalczysko commented 3 years ago
(env562) pwalczysko@ls31618~$ omero duplicate Dataset:4420 --report
Previous session expired for user-3 on merge-ci-devspace.openmicroscopy.org:4064
Server: [merge-ci-devspace.openmicroscopy.org:4064]
Username: [user-3]user-9
Password:
Created session for user-9@merge-ci-devspace.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: read-only-1
omero.cmd.Duplicate Dataset:4420 failed: 'graph-fail'
failed: may not duplicate Channel[184479] without LogicalChannel[139477] in a read-only group
Steps: 14
Elapsed time: 6.268 secs.
Flags: [FAILURE, CANCELLED]
(env562) pwalczysko@ls31618~$ 

@mtbc would you please know why is this behavour occurring ?

Edit:

The images which perform this behaviour are

(env562) pwalczysko@ls31618~$ omero duplicate Image:64123,64121 --report
Using session for user-9@merge-ci-devspace.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: read-only-1
omero.cmd.Duplicate Image:64121,64123 failed: 'graph-fail'
failed: may not duplicate Channel[184479] without LogicalChannel[139477] in a read-only group
Steps: 14
Elapsed time: 1.154 secs.
Flags: [FAILURE, CANCELLED]
(env562) pwalczysko@ls31618~$ 

They might be a product of a script, but I thought the one of them is a product of another. They elicit the error even singly, additionally to erroring when passed into one command as above.

mtbc commented 3 years ago

Is there perhaps an image that was projected from and put in a different dataset or somesuch? My guess is very much that there is some "intimately linked" image not included in the duplication. Does chgrp --dry-run also fail?

mtbc commented 3 years ago

Maybe try it with images that you know for sure has one being the product of the other and no other images are involved?

pwalczysko commented 3 years ago
(env562) pwalczysko@ls31618~$ omero chgrp read-annotate-1 Image:64123 --report --dry-run
Using session for user-9@merge-ci-devspace.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: read-only-1
omero.cmd.Chgrp2 Image:64123 failed: 'graph-fail'
failed: may not move Image[64123] while Image[96752] remains as they share Instrument[5203]

It is probably as you suspect. The chgrp on the image shows a more friendly message though. I have verified that this info for chgrp is also "piecemeal", but we discussed that already. Nevertheless, the chgrp has the edge of talking about missing image and passing me the id of the one which is linked

pwalczysko commented 3 years ago

Retest as per https://github.com/ome/omero-blitz/pull/100#issuecomment-691717256

Duplication of an image with tags, attachments, comments, ratings and KVPs went as expected. The subsequent move into aoother group showed that the entities are independent from the original.

Checking of paths

For the original

OMERO-server=# SELECT path, name, repo FROM pixels WHERE image = 64122; 
                                path                                 |      name
       |                 repo                 
---------------------------------------------------------------------+----------
-------+--------------------------------------
 user-3_454/Blitz-0-Ice.ThreadPool.Server-28/2019-07/17/15-35-49.320 | FURA2-2CH
-T.tif | e0141a37-c22a-46fa-86cd-6433f5a104c5
(1 row)

For the copy

OMERO-server=# SELECT path, name, repo FROM pixels WHERE image =  143965; 
                        path                        |      name       |         
        repo                 
----------------------------------------------------+-----------------+---------
-----------------------------
 user-9_460/pool-1-thread-2/2020-09/14/12-46-20.547 | FURA2-2CH-T.tif | e0141a37
-c22a-46fa-86cd-6433f5a104c5
(1 row)

This matches with the report of OMERO.web RHP.

The report in the Blitz log seems to be okay too, see below (did copying on one image with one FileAttachment on two occasions, the timestamp matches except merge-ci seems to be on GMT, not BST.

(.venv3) bash-4.2$ cat /home/omero/workspace/OMERO-server/OMERO.server/var/log/Blitz-0.log | grep "file bytes copied"
2020-09-14 12:46:23,431 INFO  [             omero.cmd.graphs.DuplicateI] (1-thread-2) file bytes copied: 35,331,558 into FILES, 2,010 into THUMBS
2020-09-14 14:11:59,594 INFO  [             omero.cmd.graphs.DuplicateI] (1-thread-5) file bytes copied: 35,331,558 into FILES, 2,010 into THUMBS
pwalczysko commented 3 years ago
(env562) pwalczysko@ls31618~$ omero login -g read-only-1
Previously logged in to merge-ci-devspace.openmicroscopy.org:4064 as user-8
Server: [merge-ci-devspace.openmicroscopy.org:4064]
Username: [user-8]
Password:
Created session for user-8@merge-ci-devspace.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: read-only-1
(env562) pwalczysko@ls31618~$ 
(env562) pwalczysko@ls31618~$ 
(env562) pwalczysko@ls31618~$ omero duplicate Screen:2954 --report
Using session for user-8@merge-ci-devspace.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: read-only-1
omero.cmd.Duplicate Screen:2954 failed: 'graph-fail'
failed: failed to duplicate file from /home/omero/omero-server-data/Files/Dir-239/239804 to /home/omero/omero-server-data/Files/Dir-539/539020: java.nio.file.NoSuchFileException: /home/omero/omero-server-data/Files/Dir-539/539020
Steps: 14
Elapsed time: 29.27 secs.
Flags: [FAILURE, CANCELLED]
(env562) pwalczysko@ls31618~$ 

Interestingly, the above screen duplicates fine in case the owner of the read-only-1 group does it.

Note though that this is probably not a principial problem with plates, see

(env562) pwalczysko@ls31618~$ omero duplicate Plate:4256 --report
Using session for user-8@merge-ci-devspace.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: read-only-1
omero.cmd.Duplicate Plate:4256 ok
Steps: 14
Elapsed time: 20.757 secs.
Flags: []
Duplicates
  Detector:25056
  DetectorSettings:25256
  ImagingEnvironment:33604-33653
  Instrument:26256
  Objective:26854
  ObjectiveSettings:4903
  ImageAnnotationLink:30077
  LongAnnotation:58056
  Channel:369936-369985
  Image:143999-144048
  LogicalChannel:205322
  OriginalFile:539109-539210
  Pixels:143147-143196
  PlaneInfo:382994-383043
  ChannelBinding:335080-335129
  QuantumDef:133744-133793
  RenderingDef:133744-133793
  Thumbnail:133193-133242
  Experiment:24104
  Fileset:85661
  FilesetEntry:386237-386337
  FilesetJobLink:332251-332255
  IndexingJob:358371
  JobOriginalFileLink:115001
  MetadataImportJob:358374
  PixelDataJob:358372
  ThumbnailGenerationJob:358375
  UploadJob:358373
  Plate:24357
  PlateAcquisition:24254
  Well:54336-54345
  WellSample:81036-81085
  StatsInfo:312743-312792
(env562) pwalczysko@ls31618~$ 
mtbc commented 3 years ago

I'll push a commit that may fix this.

pwalczysko commented 3 years ago

The retest of the bug https://github.com/ome/omero-blitz/pull/100#issuecomment-692114899 after last commit is passing

``` Created session for user-8@merge-ci-devspace.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: read-only-1 (env562) pwalczysko@ls31618~$ (env562) pwalczysko@ls31618~$ (env562) pwalczysko@ls31618~$ omero duplicate Screen:2954 --report Using session for user-8@merge-ci-devspace.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: read-only-1 omero.cmd.Duplicate Screen:2954 ok Steps: 14 Elapsed time: 36.63 secs. Flags: [] Duplicates Instrument:26301 FileAnnotation:58101-58105 PlateAnnotationLink:21551-21553 ScreenAnnotationLink:701,702 Channel:370001-370032 Image:144051-144066 LogicalChannel:205351 OriginalFile:539272-539416 Pixels:143201-143216 ChannelBinding:335151-335182 QuantumDef:133801-133816 RenderingDef:133801-133816 Thumbnail:133251-133266 Experiment:24151 Fileset:85701 FilesetEntry:386351-386489 FilesetJobLink:332301-332305 IndexingJob:358422 JobOriginalFileLink:115091 MetadataImportJob:358425 PixelDataJob:358423 ThumbnailGenerationJob:358421 UploadJob:358424 Ellipse:424951-424976,424978-424989,424991,424993,424995-424997,424999-425010,425012-425041,425043-425052,425054-425071,425073-425123,425125-425127,425129-425134 Mask:424977,424990,424992,424994,424998,425011,425042,425053,425072,425124,425128 Roi:409001-409184 Plate:24401 PlateAcquisition:24301 Screen:24151 ScreenPlateLink:24101 Well:54351-54366 WellSample:81101-81116 StatsInfo:312821-312852 (env562) pwalczysko@ls31618~$ ```

The plate looks okay also in webclient, and was successfully moved to another group.

mtbc commented 3 years ago

Thank you for all the testing.

mtbc commented 3 years ago

Alas, the error message in https://github.com/ome/omero-blitz/pull/100#issuecomment-692025803 is not as easily improved as I had initially hoped, it's very much a matter of where in the model subgraph an issue was detected, by machinery that behaves rather agnostically of the domain model. We know users (partly due to clients) probably know their image ID better than their channel IDs but the graphs machinery doesn't favor one over the other.

It's easy to find the associated image ID given the channel ID, e.g., omero hql "SELECT pixels.image.id FROM Channel WHERE id = 1234", but, in generating the error message, the graphs machinery has only a small subgraph in hand, nothing stretching enough hops to the image, so the message can't reference it. This comes partly back to https://github.com/ome/omero-blitz/pull/100#issuecomment-688199249 in that a gateway or client could then perform the helpful followup query.