Studying the TIFF file standard #237

Open liujiusheng opened 2 years ago

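A point to stress first

Number base and data type are concepts at two different levels. A value of a given data type can be written in binary, octal, decimal, or hexadecimal, so the relationship between the two is many-to-many.

In Go, byte is an alias for the uint8 data type, and that is what gets written into a TIFF file.

A 0x prefix marks a number as hexadecimal, 0o as octal, and 0b as binary.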
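
As a small illustration of the point above, the following Node.js sketch writes the same value in four bases and stores it in a Uint8Array. The variable names are only for illustration.

```javascript
// The same uint8 value written in four bases; every literal is the number 73.
const bases = [0b01001001, 0o111, 73, 0x49];
const allSame = bases.every((v) => v === 73);
console.log(allSame); // true

// A Uint8Array stores the value itself, not the notation used to write it.
const bytes = Uint8Array.from(bases);

// toString(radix) only changes how the stored value is displayed.
const shown = [2, 8, 10, 16].map((radix) => bytes[0].toString(radix));
console.log(shown); // [ '1001001', '111', '73', '49' ]
```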

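Main text

After a long search online, I was surprised to find that nobody explains the TIFF image format really clearly, and I could not find official documentation that is easy to follow.

Specification and reference links: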

https://developer.adobe.com/content/dam/udp/en/open/standards/tiff/TIFF6.pdf

https://blog.csdn.net/han_jiang_xue/article/details/8266207

https://www.loc.gov/preservation/digital/formats/content/tiff_tags.shtml

https://www.cnblogs.com/MetaWang/p/10024243.html

https://blog.csdn.net/weixin_45847421/article/details/115189496

Byte order, which confuses many people, and what II and MM actually mean:

https://www.ruanyifeng.com/blog/2016/11/byte-order.html

In TIFF, the place where the actual pixel values are stored appears to be called a "strip":

https://www.likecs.com/show-205267332.html

ASCII code table:

https://baike.baidu.com/item/ASCII/309296?fr=aladdin

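Imagery and DEM data are mostly stored as GeoTIFF files, and GeoTIFF is an extension of TIFF, so learning spatial interpolation and generating GeoTIFF output requires learning the TIFF file format first.

What Go ultimately writes to the file is byte values; what Node.js ultimately writes is a Uint8Array.

For source code, this project is worth studying: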

https://github.com/geotiffjs/geotiff.js

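The image below is really a table shown in hexadecimal. It holds a series of markers, and you have to look each code up to know what it stands for. For example, hexadecimal 49 is the ASCII code for I; 2a00 corresponds to 42 (the two bytes 2a 00 read as a little-endian 16-bit integer); and decimal 42 is also the ASCII code for *.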
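
The three lookups described above can be checked in Node.js. This is a minimal sketch that assumes the first four bytes of a little-endian TIFF file (49 49 2a 00); the variable names are illustrative.

```javascript
// First four bytes of a little-endian TIFF file, as they appear in a hex dump.
const header = Buffer.from([0x49, 0x49, 0x2a, 0x00]);

// 0x49 is the ASCII code of "I", so the first two bytes read as "II".
const byteOrderMark = header.toString("ascii", 0, 2);

// The bytes 2a 00 read as a little-endian 16-bit integer give 0x002a = 42.
const magic = header.readUInt16LE(2);

// Decimal 42 is also the ASCII code of "*".
const star = String.fromCharCode(42);

console.log(byteOrderMark, magic, star); // II 42 *
```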

image

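What do II and MM in the Byte Order field mean? Do they say whether bytes are read from right to left or from left to right? According to the TIFF 6.0 specification, II (49 49) means little-endian order, where the least significant byte of a multi-byte number comes first, and MM (4D 4D) means big-endian order, where the most significant byte comes first.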
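
A minimal sketch of how a reader might handle both byte orders when parsing the 8-byte TIFF header. The helper name readTiffHeader and the two sample byte arrays are assumptions made for illustration, not code from any library.

```javascript
// Hypothetical helper: parse the 8-byte TIFF header from a Uint8Array.
function readTiffHeader(bytes) {
  const mark = String.fromCharCode(bytes[0], bytes[1]);
  if (mark !== "II" && mark !== "MM") {
    throw new Error(`not a TIFF byte-order mark: ${mark}`);
  }
  // "II" selects little-endian reads, "MM" big-endian reads.
  const littleEndian = mark === "II";
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const magic = view.getUint16(2, littleEndian);
  if (magic !== 42) {
    throw new Error(`unexpected magic number: ${magic}`);
  }
  return { mark, littleEndian, magic, firstIfdOffset: view.getUint32(4, littleEndian) };
}

// The same header content (magic 42, first IFD at offset 8) in both byte orders.
const intel = readTiffHeader(Uint8Array.from([0x49, 0x49, 0x2a, 0x00, 0x08, 0x00, 0x00, 0x00]));
const motorola = readTiffHeader(Uint8Array.from([0x4d, 0x4d, 0x00, 0x2a, 0x00, 0x00, 0x00, 0x08]));
console.log(intel.firstIfdOffset, motorola.firstIfdOffset); // 8 8
```

Passing the littleEndian flag to every DataView read keeps the rest of a parser independent of the file's byte order.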

A "tag" is what we would call an attribute.

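The following is a translation:

Introduction

Because it is flexible and extensible, the TIFF format has long been used in digital imaging.

A TIFF file is defined by the tags in its file header and its image file directories.

Tags can indicate the basic geometry of the image, define how the image data is arranged, and define elements such as whether one or another image compression option is used.

The 1992 TIFF 6.0 specification defines only two classes of tags: baseline and extended.

The specification also established a procedure for further extension, which led to two additional tag classes: private and private IFD.

Private and private IFD tags have been codified in several additional specifications: TIFF/EP (ISO 12234-2, 2001), TIFF/IT (ISO 12639, 2004), DNG, and EXIF_2_2, which are discussed in more detail below.

Although this website's DNG format description (at the time of writing) covers version 1.1 (DNG_1_1; published by Adobe in 2005), the tag list below includes the new tags added in DNG versions 1.2, 1.3, and 1.4 (2012).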

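A TIFF file is generally said to consist of three parts: the image file header (IFH), the image file directory (IFD), and the image data. Note that DE normally abbreviates directory entry, one of the 12-byte records inside an IFD, rather than the image data itself.

The image data part can in turn be divided into the image's attribute data and its pixel data.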
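
To make the header and directory layout concrete, here is a minimal sketch that builds a tiny synthetic buffer and reads it back. It assumes the TIFF 6.0 layout (8-byte header, then an IFD made of a 2-byte entry count, 12-byte entries, and a 4-byte offset to the next IFD) and little-endian order only. The helpers writeEntry and readIfdEntries are hypothetical names, and the buffer holds no pixel data, so it is not a viewable image.

```javascript
// Hypothetical helper: write one 12-byte directory entry (little-endian).
function writeEntry(b, at, tag, type, count, value) {
  b.writeUInt16LE(tag, at);       // tag number
  b.writeUInt16LE(type, at + 2);  // field type (3 = SHORT)
  b.writeUInt32LE(count, at + 4); // number of values
  b.writeUInt32LE(value, at + 8); // the value itself, or an offset to it
}

// Hypothetical helper: read every entry of the first IFD (little-endian only).
function readIfdEntries(b) {
  const ifdOffset = b.readUInt32LE(4);
  const count = b.readUInt16LE(ifdOffset);
  const entries = [];
  for (let i = 0; i < count; i++) {
    const at = ifdOffset + 2 + i * 12;
    entries.push({
      tag: b.readUInt16LE(at),
      type: b.readUInt16LE(at + 2),
      count: b.readUInt32LE(at + 4),
      valueOrOffset: b.readUInt32LE(at + 8),
    });
  }
  const nextIfdOffset = b.readUInt32LE(ifdOffset + 2 + count * 12);
  return { entries, nextIfdOffset };
}

// 8-byte header + 2-byte count + two 12-byte entries + 4-byte next-IFD offset.
const buf = Buffer.alloc(8 + 2 + 2 * 12 + 4);
buf.write("II", 0, "ascii");      // byte order
buf.writeUInt16LE(42, 2);         // magic number
buf.writeUInt32LE(8, 4);          // offset of the first IFD
buf.writeUInt16LE(2, 8);          // the IFD has two entries
writeEntry(buf, 10, 256, 3, 1, 4); // ImageWidth = 4
writeEntry(buf, 22, 257, 3, 1, 3); // ImageLength = 3
buf.writeUInt32LE(0, 34);         // 0 means there is no further IFD

const ifd = readIfdEntries(buf);
console.log(ifd.entries.map((e) => [e.tag, e.valueOrOffset])); // [ [ 256, 4 ], [ 257, 3 ] ]
```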

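In practice, the characters below have to be written to the file as characters, rather than as the text of their hexadecimal codes.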

image

The result after a successful write:

image

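The Node.js code is shown below. It only gets the first two characters, II, right; everything after them is left unprocessed and is written as literal text, not as the bytes it denotes: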

const fs = require("fs")
const data = `II 2A  00 B6 96 37  02  13 00 00 01  03  00  01  00`
fs.writeFileSync("./test.tiff", data)
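
One possible way to write the whole line as real bytes is to convert each token before writing. This is a sketch under stated assumptions: dumpToBytes is a hypothetical helper that treats any two-digit hexadecimal token as a byte value and any other token (here only II) as ASCII characters, and the output goes to a temporary file. The 16 bytes are only a header fragment, not a complete TIFF image.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

const dump = "II 2A  00 B6 96 37  02  13 00 00 01  03  00  01  00";

// Hypothetical helper: turn the mixed text dump into byte values.
// Note: a token such as "AB" would be read as hex, so this only suits dumps like the one above.
function dumpToBytes(text) {
  const out = [];
  for (const token of text.trim().split(/\s+/)) {
    if (/^[0-9A-Fa-f]{2}$/.test(token)) {
      out.push(parseInt(token, 16)); // "2A" becomes the byte 0x2a
    } else {
      for (const ch of token) out.push(ch.charCodeAt(0)); // "I" becomes 0x49
    }
  }
  return Uint8Array.from(out);
}

const fileBytes = dumpToBytes(dump);
const target = path.join(os.tmpdir(), "test-header-fragment.tiff");
fs.writeFileSync(target, fileBytes); // writes 16 raw bytes, not 16 tokens of text
const readBack = fs.readFileSync(target);
console.log(readBack.length, readBack.toString("hex"));
```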

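In Node.js, the charCodeAt() method gives the decimal code of a character, which for ASCII characters is the ASCII code. It returns the UTF-16 code unit at the given index as an integer between 0 and 65535.

The first character of a string is at index 0, the second at index 1, and so on.

The ASCII letter I is 49 in hexadecimal and 73 in decimal. Hexadecimal can be converted to decimal with parseInt("49", 16).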
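
The conversions mentioned above, as a short Node.js sketch. The non-ASCII character is included only to show that charCodeAt() can return values that do not fit in a single byte.

```javascript
// Character to decimal code: "I" is 73.
const code = "II".charCodeAt(0);

// Decimal to hexadecimal text and back again.
const hexText = code.toString(16);      // "49"
const decimal = parseInt(hexText, 16);  // 73

// Decimal code back to the character.
const letter = String.fromCharCode(decimal);

// charCodeAt returns UTF-16 code units, so values above 255 are possible.
const wide = "中".charCodeAt(0);         // 20013 (0x4e2d), too large for one byte

console.log(code, hexText, decimal, letter, wide); // 73 49 73 I 20013
```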

Tag specifications

TIFF Baseline.

The baseline set of tags was documented in TIFF 5.0 and carried over on pages 11-47 of the 1992 TIFF 6.0 specification.

TIFF Extended.

The extended set includes some additional tags and added values for existing tags, as documented on pages 48-115 of the TIFF 6.0 specification.

TIFF Private

Originally, the term private meant just that. The TIFF 6.0 specification (page 8) states, "An organization might wish to store information meaningful to only that organization . . . . Tags numbered 32768 or higher, sometimes called private tags, are reserved for that purpose. Upon request, the TIFF administrator . . . will allocate and register one or more private tags for an organization . . . . You do not need to tell the TIFF administrator what you plan to use them for, but giving us this information may help other developers to avoid some duplication of effort." Over time, however, many private tags have become well established and well documented, e.g., tag 34675 for the ICC profile, dubbed InterColorProfile in the TIFF/EP standard. Thus, many members of the private tag class can be viewed as open extensions rather than as containers for secret information.
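
The numbering rule quoted above (tags numbered 32768 or higher are private) is simple to express in code. A minimal sketch, in which isPrivateTag is a hypothetical helper name:

```javascript
// Threshold taken from the TIFF 6.0 passage quoted above.
const PRIVATE_TAG_START = 32768;

// Hypothetical helper: true when a tag number falls in the private range.
function isPrivateTag(tag) {
  return tag >= PRIVATE_TAG_START;
}

const checks = [256, 32767, 32768, 34675].map((tag) => [tag, isPrivateTag(tag)]);
console.log(checks); // [ [ 256, false ], [ 32767, false ], [ 32768, true ], [ 34675, true ] ]
```

Tag 34675, the ICC profile tag mentioned above, falls in the private range even though it is widely documented.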

TIFF/EP, TIFF/IT, and DNG

A number of tags, some of which may once have been "private," have been defined in TIFF/EP (ISO 12234-2, 2001), TIFF/IT (ISO 12639, 2004), and DNG_1_1, an Adobe-sponsored extension of the TIFF 6.0 specification. As noted above, although this Web site's most recent description for DNG is version 1.1 (specification published 2005), this tag list includes new tags added in versions 1.2, 1.3, and 1.4.

TIFF Private IFD

The TIFF 6.0 specification (page 9) states, "If you need more than 10 tags, we suggest that you reserve a single private tag, define it as a LONG TIFF data type, and use its value as a pointer (offset) to a private IFD [image file directory] or other data structure of your choosing. Within that IFD, you can use whatever tags you want, since no one else will know that it is an IFD unless you tell them." As with private tags, we can understand private IFDs as an extension to TIFF, often very public and well documented.

The private IFD tags of greatest interest to the Library of Congress are those associated with the EXIF_2_2 specification, pertaining to image generation by digital still cameras. Exif is an abbreviation for EXchangeable Image File format, although Exif does not relate to TIFF as, say, JFIF relates to JPEG_DCT. The Exif IFD is pointed to by the Private Exif IFD tag 34665. This and other Exif tags are listed in the numerical table below.
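
The pointer mechanism described above can be sketched in Node.js: a LONG entry in the main IFD holds a file offset, and the private IFD is read from that offset with the same entry layout. The buffer below is synthetic and little-endian only, putEntry and readIfd are hypothetical helpers, and tag 34855 (ISOSpeedRatings in Exif 2.2) is used merely as sample content for the Exif IFD.

```javascript
const EXIF_IFD_POINTER = 34665;  // Exif IFD pointer tag named in the text
const ISO_SPEED_RATINGS = 34855; // sample Exif tag, chosen for illustration

// Hypothetical helper: write one 12-byte directory entry (little-endian).
function putEntry(b, at, tag, type, count, value) {
  b.writeUInt16LE(tag, at);
  b.writeUInt16LE(type, at + 2);
  b.writeUInt32LE(count, at + 4);
  b.writeUInt32LE(value, at + 8);
}

// Hypothetical helper: read all entries of the IFD that starts at `offset`.
function readIfd(b, offset) {
  const n = b.readUInt16LE(offset);
  const entries = [];
  for (let i = 0; i < n; i++) {
    const at = offset + 2 + i * 12;
    entries.push({
      tag: b.readUInt16LE(at),
      type: b.readUInt16LE(at + 2),
      count: b.readUInt32LE(at + 4),
      value: b.readUInt32LE(at + 8),
    });
  }
  return entries;
}

// Layout: header at 0..7, main IFD at 8..25, Exif IFD at 26..43.
const tiff = Buffer.alloc(44);
tiff.write("II", 0, "ascii");
tiff.writeUInt16LE(42, 2);
tiff.writeUInt32LE(8, 4);                       // main IFD starts at offset 8
tiff.writeUInt16LE(1, 8);                       // main IFD: one entry
putEntry(tiff, 10, EXIF_IFD_POINTER, 4, 1, 26); // LONG (4) whose value is an offset
tiff.writeUInt16LE(1, 26);                      // Exif IFD: one entry
putEntry(tiff, 28, ISO_SPEED_RATINGS, 3, 1, 100); // SHORT (3) with value 100

// Follow the pointer: find tag 34665, then read the IFD at the offset it holds.
const mainIfd = readIfd(tiff, tiff.readUInt32LE(4));
const pointer = mainIfd.find((e) => e.tag === EXIF_IFD_POINTER);
const exifEntries = readIfd(tiff, pointer.value);
console.log(exifEntries); // one entry: tag 34855, type 3, count 1, value 100
```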

For links to Exif specifications and other related information, see EXIF_2_2. The Exif standard established three private IFDs, and tags from the main Exif IFD are included in the list below. Tags for the other two Exif IFDs are listed at the Aware Web site: GPS IFD, for positioning information, and Interoperability IFD, used to encode compatibility information. With numerical sequences of their own, the GPS and Interoperability tags are not included in the table below. A third-party listing of all Exif tags is available from Exiv2.

HD Photo tags.

Although not a true TIFF implementation, Windows Media Photo/HD Photo (WMP_1_0; now standardized as JPEG XR, ISO/IEC 29199-2:2012) employs a container format that borrows from TIFF and adds a few new tags of interest, further extending the tag set in at least an informal sense.