Open bigcat88 opened 2 months ago
There are four properties in the file that might hold this information:
| | | Box: 4363e914-5b7d-4aab-97ae-bea6983b434 -----
| | | size: 28 (header size: 24)
| | |
| | | Box: 22cc4c7-d6d9-4e7-9d90-4eb6ecbaf3a3 -----
| | | size: 40 (header size: 24)
...
| | | Box: 4363e914-5b7d-4aab-97ae-bea6983b434 -----
| | | size: 32 (header size: 24)
| | |
| | | Box: de225085-36cb-4365-8743-2f875e7c78a -----
| | | size: 28 (header size: 24)
But we have no specification of those. They are proprietary. We would need a couple of images with known intrinsic parameters (covering several different values) to be able to reverse engineer their content.
There should be additional metadata for each picture. Using the Image I/O framework, they'd be defined like this:
let properties = [
    kCGImagePropertyGroups: [
        kCGImagePropertyGroupIndex: 0,
        kCGImagePropertyGroupType: kCGImagePropertyGroupTypeStereoPair,
        kCGImagePropertyGroupImageIndexLeft: 0,
        kCGImagePropertyGroupImageIndexRight: 1,
    ],
    kCGImagePropertyHEIFDictionary: [
        kIIOMetadata_CameraModelKey: [
            kIIOCameraModel_Intrinsics: cameraIntrinsics as CFArray
        ]
    ]
]
IMG_0050.zip Does this help? 3 images from the Apple Vision Pro.
Thanks. Could you please also send me the decoded metadata values that are stored in there? I don't have a Mac to read them out.
This good?
Looks like it's the same in each case. Do they ever vary?
They can. In this case, because they were all shots on the Vision Pro in the same area the values are the same.
I have panoramic photos that will have different intrinsics, but the format is the same.
I suspect there are additional tags that Apple is using to determine this is a stereo pair. That's the code that I posted before.
So to work out which values correspond to which bytes (or bits) in those UUID fields, we need to see the variations. Ideally, changing one parameter at a time would change only a small part of the values.
Hmm... I don't think I can provide that. The best I can do is more samples of photos that work on the Vision Pro by either taking photos with an iPhone or the AVP.
Hi, (I posted the report in the other repo)
Here is a set of 3 files focused on the "Camera model" field with the intrinsics matrix.
Example:
They are based on a code snippet similar to the one posted above, with variations only in the kIIOCameraModel_Intrinsics field.
It’s not possible to create the files with all zeros or just changing one value as the encoder tests that the matrix is valid.
Files and corresponding kIIOCameraModel_Intrinsics value:
[1, 0, 0, 0, 1, 0, 0, 0, 1]
[2, 0, 2, 0, 2, 2, 0, 0, 1]
[100, 0, 100, 0, 100, 100, 0, 0, 1]
Note: I only used integers but this is an array of floats.
And here is a set of files focusing on the camera extrinsics key.
Example:
Files:
CoordinateSystemID: 0, Position: [0, 0, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].
Position: [1, 0, 0]
Position: [0, 1, 0]
Position: [0, 0, 1]
CoordinateSystemID: 1, however it still says 0 in the infobox. I don’t know if it’s the encoder or the decoder that’s not picking it up. I don’t know what this value represents.
File 0 - two UUID properties, and each of the two items is associated with both.
Intrinsics matrix [1, 0, 0, 0, 1, 0, 0, 0, 1]
uuid: 22cc04c7d6d94e079d904eb6ecbaf3a3
value: 00001e00 0010624e 00000000 00000000
uuid: de22508536cb436587432f8705e7c78a
value: 00000000
File 1 - same two UUID properties, same association
Intrinsics matrix [2, 0, 2, 0, 2, 2, 0, 0, 1]
uuid: 22cc04c7d6d94e079d904eb6ecbaf3a3
value: 00001e00 0020c49c 0020c49c 0020c49c
uuid: de22508536cb436587432f8705e7c78a
value: 00000000
File 2 - same two UUID properties, same association
Intrinsics matrix [100, 0, 100, 0, 100, 100, 0, 0, 1]
uuid: 22cc04c7d6d94e079d904eb6ecbaf3a3
value: 00001e00 06666666 06666666 06666666
uuid: de22508536cb436587432f8705e7c78a
value: 00000000
Assume that 22cc04c7d6d94e079d904eb6ecbaf3a3 is the identifier for the intrinsics.
So we have
00001e00 0010624e 00000000 00000000 for [1, 0, 0, 0, 1, 0, 0, 0, 1]
00001e00 0020c49c 0020c49c 0020c49c for [2, 0, 2, 0, 2, 2, 0, 0, 1]
00001e00 06666666 06666666 06666666 for [100, 0, 100, 0, 100, 100, 0, 0, 1]
It's not clear to me how the 9 values could fit into 16 bytes unless there is some kind of encoding, possibly omitting some values that are defined as 0 (e.g. 4th, 7th and 8th values).
Possibly the 0x1e relates to a signature or encoded length (0x1e = 30, the number of bytes is 16).
A general intrinsic matrix usually looks like this:
f s x
0 f y
0 0 1
One can also assume that the skew s = 0.
That would leave us with just the three parameters f, x, y.
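To make that layout concrete, here is a minimal Python sketch of the simplified pinhole matrix, flattened row by row in the same order as the kIIOCameraModel_Intrinsics arrays listed in this thread. The function name `intrinsics_row_major` is made up for illustration, and the row-major ordering is an assumption inferred from the sample arrays, not something confirmed here.

```python
def intrinsics_row_major(f, x, y, s=0):
    """Flatten the matrix [[f, s, x], [0, f, y], [0, 0, 1]] row by row.

    f is the focal length, (x, y) the principal point, s the skew
    (assumed to be 0 for the simplified model discussed above).
    """
    return [f, s, x,
            0, f, y,
            0, 0, 1]


# The three test matrices from this thread, expressed as (f, x, y).
for f, x, y in [(1, 0, 0), (2, 2, 2), (100, 100, 100)]:
    print(intrinsics_row_major(f, x, y))
```

With this reading, all three test files differ only in f, x and y, which is why three stored values could be enough.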
It is also nice to see that the encoding of 2 = 0x20c49c is exactly two times that of 1 = 0x10624e.
And if we divide 0x06666666 / 0x10624e, we also get decimal 100 (almost). Seems to fit nicely.
Thus, first four bytes: unknown, maybe some flags (e.g. for "ModelType = Simplified Pinhole")
Second four bytes: f
Third / fourth four bytes: x / y, but we need more data to differentiate that.
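The arithmetic above can be checked with a short Python sketch. It splits each 16-byte payload into four big-endian 32-bit words and prints the ratios. The idea that 0x1e (30) in the first word is a binary shift is only a hypothesis tested here, not something established in this thread.

```python
import struct

# Payloads of uuid 22cc04c7... copied from the dumps in this thread,
# keyed by the focal value f that was written into the intrinsics matrix.
SAMPLES = {
    1: "00001e00 0010624e 00000000 00000000",
    2: "00001e00 0020c49c 0020c49c 0020c49c",
    100: "00001e00 06666666 06666666 06666666",
}


def words(hex_dump):
    """Split a 16-byte payload into four big-endian unsigned 32-bit words."""
    return struct.unpack(">4I", bytes.fromhex(hex_dump.replace(" ", "")))


for f, dump in SAMPLES.items():
    head, w1, w2, w3 = words(dump)
    # Hypothesis: the second-lowest byte of the first word (0x1e = 30)
    # is a shift, i.e. the stored integers are fractions of 2**30.
    shift = (head >> 8) & 0xFF
    print(f, w1, w1 / f, w1 / (1 << shift))
```

Under that hypothesis the second word divided by 2**30 comes out close to f / 1000 for all three files. What the factor 1000 stands for is not known from these samples.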
Each file has two uuid properties (boxes). Both images in each file are associated with both uuid properties.
One is uuid: de22508536cb436587432f8705e7c78a value: 00000000 as described above.
The other is more interesting. It is uuid: 4363e9145b7d4aab97aebea69803b434 and the property value changes byte values and length. See below
value: 00000010
CoordinateSystemID: 0, Position: [0, 0, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].
value: 00000011 000f4240
CoordinateSystemID: 0, Position: [1, 0, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].
value: 00000012 000f4240
CoordinateSystemID: 0, Position: [0, 1, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].
value: 00000014 000f4240
CoordinateSystemID: 0, Position: [0, 0, 1], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].
value: 00000010
CoordinateSystemID: 0, Position: [0, 0, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].
value: 00000010
CoordinateSystemID: 1, Position: [0, 0, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].
File 10 also has an extra uuid box. It's the same as the assumed intrinsics box above. value: 00001e00 0010624e 00000000 00000000
So it looks like coordinate system id may be by position (not coded).
byte [7] is clearly changing as we step through the position changes.
000f4240 is 3.03216553 as a little-endian float. Can't make that fit particularly well though.
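As a quick cross-check, the Python sketch below decodes 000f4240 in a few ways. Read as a big-endian unsigned integer, the four bytes are exactly 1000000, which would line up with a position component of 1 if the stored unit were one millionth (that unit is an assumption; nothing in the samples confirms it). The second half tests the guess that the low three bits of the first word mark which of x, y, z are stored; that bit meaning is likewise inferred only from the four samples above.

```python
import struct

raw = bytes.fromhex("000f4240")

as_be_uint = int.from_bytes(raw, "big")      # big-endian unsigned integer
as_le_float = struct.unpack("<f", raw)[0]    # little-endian IEEE 754 float
as_be_float = struct.unpack(">f", raw)[0]    # big-endian IEEE 754 float

print(as_be_uint, as_le_float, as_be_float)

# First word of each payload, keyed by the Position that was written.
FIRST_WORD = {
    (0, 0, 0): 0x10,
    (1, 0, 0): 0x11,
    (0, 1, 0): 0x12,
    (0, 0, 1): 0x14,
}

for position, word in FIRST_WORD.items():
    # Hypothesis: bit 0, 1, 2 flag the presence of x, y, z respectively.
    present = [bool(word & (1 << i)) for i in range(3)]
    print(position, hex(word), present)
```

This would also explain why the payload grows from 4 to 8 bytes as soon as one position component is non-zero.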
Also, if the photo is a spatial photo, there will be a 'grpl' box (GroupsListBox) under the 'meta' box, and that 'grpl' box will contain a 'ster' box, which marks the stereoscopic pair.
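For anyone who wants to check a file for that structure without a full HEIF library, here is a minimal Python sketch that walks the ISO BMFF box headers and looks for meta/grpl/ster. The helper names are made up for illustration, and it assumes that 'meta' is a FullBox (4 extra bytes of version and flags) while 'grpl' is a plain container box.

```python
import struct


def iter_boxes(data, start=0, end=None):
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:                       # 64-bit size follows the type
            if pos + 16 > end:
                break
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:                     # box runs to the end of its container
            size = end - pos
        if size < header or pos + size > end:
            break                           # malformed box, stop rather than guess
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size


def find_child(data, wanted, start, end):
    """Return the payload range of the first child box of the given type."""
    for box_type, p_start, p_end in iter_boxes(data, start, end):
        if box_type == wanted:
            return p_start, p_end
    return None


def has_stereo_group(data):
    """True if the file has a 'ster' box inside meta/grpl."""
    meta = find_child(data, "meta", 0, len(data))
    if meta is None:
        return False
    # Skip the 4 bytes of version and flags at the start of 'meta'.
    grpl = find_child(data, "grpl", meta[0] + 4, meta[1])
    if grpl is None:
        return False
    return find_child(data, "ster", grpl[0], grpl[1]) is not None
```

Usage would be along the lines of `has_stereo_group(open(path, "rb").read())`, where `path` points at one of the attached sample files.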
I have attached two identical images, but one of them, spatial.HEIC, is a Spatial Photo with the key metadata added to it. I think they can be easily compared. I want to fix this issue myself, but I'm not good enough at C++ to know where to start.
Can you show the associated key metadata (i.e. as apple displays it)?
There should be additional metadata for each picture. Using the Image I/O framework, they'd be defined like this:
let properties = [
    kCGImagePropertyGroups: [
        kCGImagePropertyGroupIndex: 0,
        kCGImagePropertyGroupType: kCGImagePropertyGroupTypeStereoPair,
        kCGImagePropertyGroupImageIndexLeft: 0,
        kCGImagePropertyGroupImageIndexRight: 1,
    ],
    kCGImagePropertyHEIFDictionary: [
        kIIOMetadata_CameraModelKey: [
            kIIOCameraModel_Intrinsics: cameraIntrinsics as CFArray
        ]
    ]
]
The only difference between these 2 images is that one was generated with the code above and the other without it.
So we can compare the file structure of these 2 images using heif-info.exe -d.
Or we can use some isobmff tool like pyisobmff.
I attached two files generated by pyisobmff: pyisobmff_decode.zip
Just use a text comparison tool to check the difference. This is as far as I can get for now. Also, if you know which boxes are specific to spatial images, you could use a hex editor to search for the box type and see what value it has.
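If no diff tool is at hand, the comparison can also be done with Python's standard `difflib`. The sketch below is only an illustration: the helper name `dump_diff` and the two tiny dumps are made up and stand in for the real text output of the tools mentioned above.

```python
import difflib


def dump_diff(spatial_text, plain_text):
    """Return unified-diff lines going from the non-spatial to the spatial dump."""
    return list(difflib.unified_diff(
        plain_text.splitlines(),
        spatial_text.splitlines(),
        fromfile="non_spatial",
        tofile="spatial",
        lineterm="",
    ))


# Made-up miniature dumps; real ones would be read from the two text files.
plain = "meta\n  iprp\n"
spatial = "meta\n  grpl\n    ster\n  iprp\n"

for line in dump_diff(spatial, plain):
    print(line)
```

Lines starting with `+` are the boxes that only the spatial file contains.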
Below are screenshots of what the spatial image contains beyond the non-spatial image:
Can you show the associated key metadata (i.e. as apple displays it)?
The difference, as with the images @jwheeler and @JoanCharmant posted, is that the Preview app in macOS adds a "HEIC" tag that the non-spatial image doesn't have.
Hi all, I found that the UUID value is also related to the image resolution. If I change the image resolution, the value changes, too.
Reading and writing of the camera intrinsic matrix should be working now in branch develop-v1.18.0. Extrinsic matrix will follow shortly.
Is there some test data for the extrinsic camera matrix? Especially with camera orientation once specified as a quaternion and once with rotation angles?
Assuming it's the same as cmin and cmex, there are test examples at
https://github.com/MPEGGroup/FileFormatConformance/pull/86
and
Thank you. That helped to confirm that intrinsics and quaternion-based rotation are read correctly. I have also chosen the rotation sequence order to match the output described in the rotation.txt file at that repository.
Original issue with file example is here:
https://github.com/bigcat88/pillow_heif/issues/234
Is there a way to get this information from an image and save the file so it contains it?
I tried to find where it is stored using heif-info -d, but got totally lost in the number of different boxes those files contain. I would be grateful for any help.