s-u / tiff

Read and write TIFF images in R

Better handling of "Unknown field with tag" warning #2

Open richierocks opened 8 years ago

richierocks commented 8 years ago

When calling readTIFF I quite often get lots of "Unknown field with tag" warnings, which aren't very useful. (Try unzipping the attachment and calling readTIFF("farouq.tiff"); you should see 17 warnings.)

farouq.zip

I realise that I can suppressWarnings when I read in the file, but I don't want to do this in general since I might miss other useful warnings.

Ideally, it would be nice to have these tags read and stored as attributes; failing that it would be nice to be able to turn off these warnings (without having to suppress all warnings).

More discussion about these warnings in this SO question.

s-u commented 8 years ago

I'd love to support reading or ignoring the tags, but, unfortunately, the underlying libtiff library doesn't support that - AFAICS. Those warnings are created directly by the TIFFReadDirectory function in libtiff and R/tiff has no control over them - there is not even a way to find out the semantic meaning - it's essentially a random string passed to R. The libtiff documentation simply states that unknown tags are ignored with a warning, so I don't see a way to change the behavior or get the tags :(.

rorynolan commented 5 years ago

Hi @richierocks The ijtiff package (available on CRAN at https://cran.r-project.org/package=ijtiff) ignores "Unknown field with tag . . ." warnings internally. Maybe you could try that. Disclaimer: I'm the maintainer of ijtiff and it uses a lot of code that I found in the tiff package.

s-u commented 3 years ago

I see ijtiff just greps through the warnings and ignores those containing the text "Unknown field with tag". That seems like a real hack to me (quite fragile - imagine if libtiff finally adds localisation) that is only masking the problem.
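For reference, the kind of message filtering described above can be sketched in base R as follows. This is a minimal illustration, not ijtiff's actual code: `muffle_matching_warnings` and `fake_read` are hypothetical names, and `fake_read` stands in for `readTIFF` so the example runs without a TIFF file. It also shows the fragility: the match depends entirely on the exact warning text.

```r
# Hypothetical helper (not part of tiff or ijtiff): evaluate `expr` and
# muffle only warnings whose message contains `pattern`.
muffle_matching_warnings <- function(expr, pattern = "Unknown field with tag") {
  withCallingHandlers(
    expr,
    warning = function(w) {
      # Literal substring match on the message text; this breaks if the
      # wording of the warning ever changes (e.g. through localisation).
      if (grepl(pattern, conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Stand-in for readTIFF(): emits one "unknown tag" warning and one other warning.
fake_read <- function() {
  warning("TIFFReadDirectory: Unknown field with tag 34665 (0x8769) encountered")
  warning("some other problem")
  "image data"
}

# Only the non-matching warning reaches the user.
img <- muffle_matching_warnings(fake_read())

# With the tiff package and the attached file, the call would look like:
# img <- muffle_matching_warnings(tiff::readTIFF("farouq.tiff"))
```

Unlike a blanket `suppressWarnings`, any warning that does not match the pattern still propagates as usual.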

The underlying issue is much more complex: TIFF images can have many interesting tags, but each tag has its own definition of the binary content. Unfortunately, that makes it impossible to write a general solution.

So a better approach would be to support at least the most common tags. The sample image above includes EXIF information, which accounts for quite a few of the tags. It is surprising that out-of-the-box libtiff doesn't support EXIF, but it does provide ways to add new tags. So the proposal would be to register commonly used tags that can be useful, and perhaps even provide an R-level API to allow the user to query a particular tag, which may be non-standard (even if that is a bit dangerous).

Some links to lists of defined tags that could be used as a basis: https://www.loc.gov/preservation/digital/formats/content/tiff_tags.shtml https://docs.microsoft.com/en-us/windows/win32/gdiplus/-gdiplus-constant-property-item-descriptions and libtiff's own list: http://www.libtiff.org/man/TIFFGetField.3t.html

rorynolan commented 3 years ago

I'm the author of ijtiff and I agree that it is hacky. It's designed to cope with a hacky way that ImageJ encodes channel information sometimes. This is useful to many, but tiff is certainly the purists' package. With regard to ignoring all "Unknown field with tag" warnings, this is what novice users want most of the time. Otherwise, this warning gives them the impression that something has gone wrong during reading, when probably it's returning everything they care about. With regard to the tags, I think the tiff package as is does a great job supporting several. ijtiff supports these too (by stealing tiff code). If someone wants another tag supported, I would be happy to add it to tiff and ijtiff via pull request. But there are too many tags to support them all when most will never be used by an R user. It's too much work for not enough payoff.