German-BioImaging / omero-autotag

Extensions to OMERO.web to enhance tagging
GNU Affero General Public License v3.0

Use image name instead of filename by choice #2

Closed pwalczysko closed 2 months ago

pwalczysko commented 9 years ago

Import test_images_good/lei/leica-original, or go to trout and find the data under project 5.0 data under any user. Go to Auto Tag. The central pane does not load properly (no tickable boxes appear); see screenshot (screen shot 2015-03-31 at 16 19 37).

dpwrussell commented 9 years ago

This actually has nothing to do with the length of the filename. The component of the full filename that is not Path or Extension is, e.g., 050118. I made the decision to exclude all tokens that are solely numerical unless that token already has a tag. The reason for this is that otherwise huge tables get drawn for any set of files that have a sequence number. There isn't really a way around that.
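The filtering rule described above can be sketched roughly as follows. This is only an illustration, not the actual auto-tag code: the function name `candidate_tokens`, the separator set, and the example file name `050118.lei` are all assumptions made for the sketch.

```python
import os
import re


def candidate_tokens(filename, existing_tags=frozenset()):
    """Tokenize the basename (no path, no extension) and drop tokens that
    are purely numeric, unless a tag with that value already exists.

    Hypothetical helper; the separator set below is an assumption.
    """
    stem = os.path.splitext(os.path.basename(filename))[0]
    tokens = [t for t in re.split(r"[_\-\s.]+", stem) if t]
    return [t for t in tokens if not t.isdigit() or t in existing_tags]


# A purely numeric name yields no candidate tokens at all ...
print(candidate_tokens("leica-original/050118.lei"))
# ... unless a tag with that value already exists.
print(candidate_tokens("leica-original/050118.lei", {"050118"}))
# Sequence numbers are dropped while descriptive tokens are kept.
print(candidate_tokens("exp_001_DAPI.tif"))
```

With a rule like this, a file whose name is only a number produces an empty token list, which would be consistent with the empty central pane in the original report.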

carandraug commented 8 years ago

Hi Douglas, I think there is another take on this issue. For files that have a concept of image name different from the filename (typical for formats that have a series of images, e.g., volocity), auto-tag should use the image name instead of the filename.

This is the case we are facing now. We have an image file like so:

$ ls -R data/PE_SpiningDisc/Venus_Screen/20160511_199/
data/PE_SpiningDisc/Venus_Screen/20160511_199/:
20160511_199.mvd2  Data  desktop.ini  Folder.ico

data/PE_SpiningDisc/Venus_Screen/20160511_199/Data:
10.aiix  18.aiix  25.aiix  32.aiix  39.aisf  50.aiix  58.aiix  64.aisf  72.aiix  79.aisf  8.aiix
10.aisf  18.aisf  25.aisf  32.aisf  3.aiix   50.aisf  58.aisf  65.aiix  72.aisf  7.aiix   8.aisf
11.atsf  19.aiix  26.atsf  33.aiix  3.aisf   51.atsf  59.aiix  65.aisf  73.aiix  7.aisf   9.aiix
12.aiix  19.aisf  27.aiix  33.aisf  40.aiix  52.aiix  59.aisf  66.atsf  73.aisf  80.aiix  9.aisf
12.aisf  1.atsf   27.aisf  34.aiix  40.aisf  52.aisf  5.aiix   67.aiix  74.aiix  80.aisf
13.aiix  20.aiix  28.aiix  34.aisf  46.atsf  53.aiix  5.aisf   67.aisf  74.aisf  81.atsf
13.aisf  20.aisf  28.aisf  35.aiix  47.aiix  53.aisf  60.aiix  68.aiix  75.aiix  82.aiix
14.aiix  21.atsf  29.aiix  35.aisf  47.aisf  54.aiix  60.aisf  68.aisf  75.aisf  82.aisf
14.aisf  22.aiix  29.aisf  36.atsf  48.aiix  54.aisf  61.atsf  69.aiix  76.atsf  83.aiix
15.aiix  22.aisf  2.aiix   37.aiix  48.aisf  55.aiix  62.aiix  69.aisf  77.aiix  83.aisf
15.aisf  23.aiix  2.aisf   37.aisf  49.aiix  55.aisf  62.aisf  6.atsf   77.aisf  84.aiix
16.atsf  23.aisf  30.aiix  38.aiix  49.aisf  56.atsf  63.aiix  70.aiix  78.aiix  84.aisf
17.aiix  24.aiix  30.aisf  38.aisf  4.aiix   57.aiix  63.aisf  70.aisf  78.aisf  85.aiix
17.aisf  24.aisf  31.atsf  39.aiix  4.aisf   57.aisf  64.aiix  71.atsf  79.aiix  85.aisf

While importing the image to omero, you only select "20160511_199.mvd2" (which then picks up all the files within Data/). These files are actually 16 images, and each one has a name of its own that is different from any of the filenames. These are their names in omero (bioformats reads their names automatically):

                                    name                                     
-----------------------------------------------------------------------------
 20160511_199.mvd2 [20160406_199_NMJ_larvae1_mRNA647_HRP405_DAPI]
 20160511_199.mvd2 [20160406_199_NMJ_larvae1_mRNA647_HRP405_DAPI 2]
 20160511_199.mvd2 [20160406_199_NMJ_larvae1_mRNA647_HRP405_DAPI]
 20160511_199.mvd2 [20160406_199_nerve_larvae1_mRNA647_HRP405_DAPI]
 20160511_199.mvd2 [20160406_199_OL_larvae1_mRNA647_HRP405_DAPI]
 20160511_199.mvd2 [20160406_199_nerve_larvae1_mRNA647_HRP405_DAPI]
 20160511_199.mvd2 [20160406_199_nerve_larvae1_mRNA647_HRP405_DAPI]
 20160511_199.mvd2 [20160406_199_NMJ_larvae1_mRNA647_HRP405_DAPI 2]
 20160511_199.mvd2 [20160406_199_OL_larvae2_mRNA647_HRP405_DAPI]
 20160511_199.mvd2 [20160406_199_OL_larvae2_mRNA647_HRP405_DAPI 2]
 20160511_199.mvd2 [20160406_199_nerve_larvae2_mRNA647_HRP405_DAPI 3]
 20160511_199.mvd2 [20160406_199_OL_larvae3_mRNA647_HRP405_DAPI]
 20160511_199.mvd2 [20160406_199_OL_larvae3_mRNA647_HRP405_DAPI]
 20160511_199.mvd2 [20160406_199_abdominalVNC_larvae3_mRNA647_HRP405_DAPI]
 20160511_199.mvd2 [20160406_199_abdominalVNC_larvae3_mRNA647_HRP405_DAPI 2]
 20160511_199.mvd2 [20160406_199_abdominalVNC_larvae3_mRNA647_HRP405_DAPI 3]
(16 rows)

Ideally, auto-tag would pick up the image name which has abdominal, larvae3, HRP405, etc. But instead, it only sees the path up to the file, and then the image ID.
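As a rough sketch of what picking up the image name could look like: the names listed above follow a `file [series name]` layout, so the bracketed part can be split off and tokenized. Whether every reader produces names in that layout is an assumption, and `split_image_name` and `series_tokens` are hypothetical helpers, not part of auto-tag or OMERO.

```python
import re

# Matches "<file> [<series name>]" as seen in the listing above.
# Assumption: the series name is the last bracketed part of the image name.
SERIES_RE = re.compile(r"^(?P<file>.*?)\s*\[(?P<series>.*)\]\s*$")


def split_image_name(image_name):
    """Return (file part, series part); series is None if no brackets."""
    m = SERIES_RE.match(image_name)
    if m is None:
        return image_name, None
    return m.group("file"), m.group("series")


def series_tokens(image_name):
    """Tokenize only the series part of the image name."""
    _, series = split_image_name(image_name)
    if series is None:
        return []
    return [t for t in re.split(r"[_\s]+", series) if t]


name = "20160511_199.mvd2 [20160406_199_abdominalVNC_larvae3_mRNA647_HRP405_DAPI 2]"
print(split_image_name(name))
print(series_tokens(name))
```

The tokens recovered this way include `abdominalVNC`, `larvae3` and `HRP405`, which are the ones that never appear in the file path.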

I believe this is the real issue behind the original report, so I'm commenting here instead of opening a new one.

carandraug commented 8 years ago

I will just add the current workaround people have been using here. They open the image in imagej (which opens the whole dataset in one go with the right filename), save the images as individual ome.tiffs, and then import those in omero. This is all so that auto-tag can be used (at the cost of losing the original files with the raw metadata, which is obviously not ideal).

dpwrussell commented 8 years ago

I could potentially add this as a choice in the new version without a huge amount of difficulty. It would have to be a global though, so if you had a mixed dataset it would not be that useful.

Alternately I could make it that instead of a choice, it added the parsing of the image name to the parsing of the filename. Would that work better? Then it would work in mixed datasets.

Are you running OMERO 5.2 and Webtagging 2.0?

carandraug commented 8 years ago

> I could potentially add this as a choice in the new version without a huge amount of difficulty. It would have to be a global though, so if you had a mixed dataset it would not be that useful.
>
> Alternately I could make it that instead of a choice, it added the parsing of the image name to the parsing of the filename. Would that work better? Then it would work in mixed datasets.

I think the latter would work better. But why is it parsing the file name instead of the image name by default? In the case where one image is one file, the image name matches the file name, so whether auto-tag parses one or the other won't make a difference (unless the user changed the image name himself, but then there's still a case to be made to have auto-tag pick up those changes instead of the original filename).

So maybe auto-tag should be parsing the image name by default and only parse the filename by selecting the "Path" option?

> Are you running OMERO 5.2 and Webtagging 2.0?

omero 5.2.2 and webtagging 1.3.0. We are planning on upgrading but after the ome users meeting.

mtbc commented 8 years ago

> In the case where one image is one file, the image name matches the file name ...

That's how it happens to be for many readers but I don't think there's any promise or guarantee of this anywhere so I'd be doubtful about relying on it. Also, you probably do want it using the actual filename so it can get useful metadata from the path, as the names of containing directories may be significant.

carandraug commented 8 years ago

>> In the case where one image is one file, the image name matches the file name ...
>
> That's how it happens to be for many readers but I don't think there's any promise or guarantee of this anywhere so I'd be doubtful about relying on it.

Maybe so. But we should be able to rely on omero picking the best image name possible, whatever that is.

In the typical case of one file equalling one image, there is often no concept of image name, which is why users put the image name in the filename. And it's the image name, not the filename, that has the information most useful for auto-tag. That's my view at least; this depends on the use case.

> Also, you probably do want it using the actual filename so it can get useful metadata from the path, as the names of containing directories may be significant.

At the moment, auto-tag does not use the path by default, only the basename part. There is an option to include the rest of the filepath. This suggests that auto-tag treats the path as less important. So why not use only the image name by default? The basename plus path can still be parsed for tags by using the existing "path" option.

dpwrussell commented 8 years ago

@mtbc Thanks, that's what I was going to say, but you got there first.

@carandraug

Actually, auto-tag does now use the path by default specifically because it (or some of it at least) is of the same importance as the name. In 2.0 there is no distinction between tokens derived from path, extension or filename.

I don't disagree that adding support for the image name is a good idea. It might be that after some testing we just use both and that's that, but we'll have to see how that goes. At the least, I should be able to add functionality to optionally use the image name. The work that needs to go into this is very similar to what has to be done to enable a customised separator option (#49), so I would implement this at the same time as that.
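One way the "just use both" idea could look, together with a configurable separator, is sketched below. This is an assumption-laden illustration and not the Webtagging 2.0 implementation: `tokenize`, `combined_tokens`, the default separator set, and the shortened example path are all hypothetical.

```python
import re

# Assumed default separators; a customised separator option would replace these.
DEFAULT_SEPARATORS = "_-. /\\[]"


def tokenize(text, separators=DEFAULT_SEPARATORS):
    """Split text on any run of the given separator characters."""
    if not separators:
        return [text] if text else []
    pattern = "[" + re.escape(separators) + "]+"
    return [t for t in re.split(pattern, text) if t]


def combined_tokens(file_path, image_name=None, separators=DEFAULT_SEPARATORS):
    """Merge tokens from the file path and the image name, with no
    distinction by origin, keeping first-seen order and dropping duplicates."""
    seen = []
    for source in (file_path, image_name):
        if not source:
            continue
        for token in tokenize(source, separators):
            if token not in seen:
                seen.append(token)
    return seen


print(combined_tokens(
    "Venus_Screen/20160511_199/20160511_199.mvd2",
    "20160511_199.mvd2 [20160406_199_OL_larvae2_mRNA647_HRP405_DAPI 2]",
))
# A custom separator changes how the same text is split.
print(tokenize("a-b_c", separators="_"))
```

Because tokens from both sources end up in one list, a mixed dataset would not need a global switch: images without a distinct image name simply contribute no extra tokens.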