SK-Hardwired / xavc_rtmd2srt

Extract real time meta-data and GPS tracks from Sony XAVC video

How did you do this? #1

Closed ahshah closed 4 years ago

ahshah commented 4 years ago

Hi there, thank you for creating this. I've been trying very hard to reverse Catalyst to see how the program finds the ISO information in a given MP4 file from my A7S camera. It's been an incredibly hard journey. I was curious how you managed to find all the relevant capture information and its location in the file, just to further my knowledge. Thanks again!

SK-Hardwired commented 4 years ago

Hi! Sorry for the late reply. Well, that was not easy. I was inspired by Catalyst Browse as well, but didn't try to reverse it... First of all I checked what data mediainfo, ffmpeg, mp4box and exiftool display for the video files. Exiftool showed some additional tags when run with the -extractEmbedded parameter. These tags had a timecode, and the block numbers matched the frame numbers in the video. Then I noticed that some of them had the same values as parameters set on the camera (like ISO) and changed during the video accordingly.

Then I found that ffmpeg shows an unknown rtmd stream in XAVC files, and that mp4box can extract such streams into a raw binary data file. Opened in a hex editor, this mp4box extract started with a typical byte sequence, and that sequence repeated at equal steps exactly as many times as the file had frames. That gave me the size of a single metadata block, which was 1024 bytes.
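The block-size observation above can be reproduced with a few lines of Python. This is only a sketch, not the code from the repository: it assumes the raw rtmd dump is already in memory as bytes, that every block starts with the same leading bytes, and the names `guess_block_size`, `split_blocks` and the `HDR0` marker in the demo are made up for illustration.

```python
from typing import List, Optional


def guess_block_size(data: bytes, marker_len: int = 4) -> Optional[int]:
    """Return the smallest step at which the leading bytes repeat in every block."""
    marker = data[:marker_len]
    pos = data.find(marker, 1)
    while pos != -1:
        # A candidate size is accepted only if the file divides evenly
        # and every block at that step begins with the same marker.
        if len(data) % pos == 0 and all(
            data[i:i + marker_len] == marker for i in range(0, len(data), pos)
        ):
            return pos
        pos = data.find(marker, pos + 1)
    return None


def split_blocks(data: bytes, size: int) -> List[bytes]:
    """Cut the dump into per-frame blocks of the given size."""
    return [data[i:i + size] for i in range(0, len(data), size)]


if __name__ == "__main__":
    # Synthetic stand-in for a real dump: 5 "frames" of 1024 bytes each,
    # with a hypothetical 4-byte header and one changing byte per frame.
    demo = b"".join(b"HDR0" + bytes([i]) + bytes(1024 - 5) for i in range(5))
    size = guess_block_size(demo)
    print(size, len(split_blocks(demo, size)))
```

The number of blocks returned by `split_blocks` should match the frame count reported for the video; if it does not, the guessed size or the marker length is probably wrong.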

Then it was just the film-and-try method. I recorded some clips changing only one parameter (aperture, shutter, ISO, lens zoom) and looked in the hex editor at which bytes changed accordingly. That way I found the ticking timecodes, the date-time, these separate specific bytes, and their "containers" themselves.
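The comparison was done by eye in a hex editor, but the same idea can be scripted. The following is a minimal sketch, assuming two metadata blocks of equal size taken from clips that differ in one camera setting; `changed_offsets` and `group_runs` are hypothetical helper names, not part of the repository.

```python
from typing import List, Tuple


def changed_offsets(block_a: bytes, block_b: bytes) -> List[int]:
    """Offsets at which two equally sized blocks differ."""
    return [i for i, (a, b) in enumerate(zip(block_a, block_b)) if a != b]


def group_runs(offsets: List[int]) -> List[Tuple[int, int]]:
    """Merge consecutive offsets into (first, last) ranges for easier reading."""
    runs: List[List[int]] = []
    for off in offsets:
        if runs and off == runs[-1][1] + 1:
            runs[-1][1] = off  # extend the current run
        else:
            runs.append([off, off])  # start a new run
    return [(start, end) for start, end in runs]


if __name__ == "__main__":
    # Two synthetic 16-byte blocks standing in for real per-frame metadata.
    a = bytes(16)
    b = bytearray(a)
    b[3], b[4], b[10] = 1, 2, 9
    print(group_runs(changed_offsets(a, bytes(b))))
```

Bytes that change in every frame regardless of settings (such as a timecode) will show up in every diff, so it helps to compare several clip pairs and look for offsets that move only with the parameter under test.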

Then there was a lot of googling about different byte sequences, and I stumbled upon some Sony F65 / Cooke lens documentation and some pieces of metadata encoding standards, which helped me recognize and correctly decode part of the other tags. Most of them appeared to be MXF-like tags. Then I googled specifically for MXF tag whitepapers and found some similar tags in the MediaInfo MXF parser code...
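To show what "MXF-like tags" can mean in practice, here is a sketch of a tag-length-value walker. The layout it assumes (a 2-byte tag followed by a 2-byte big-endian length, as used in MXF local sets) is an assumption for illustration and is not confirmed by the text above; the tag numbers in the demo are invented, and `iter_local_tags` is a hypothetical name.

```python
import struct
from typing import Iterator, Tuple


def iter_local_tags(payload: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (tag, value) pairs, ASSUMING 2-byte tag + 2-byte big-endian length."""
    pos = 0
    while pos + 4 <= len(payload):
        tag, length = struct.unpack_from(">HH", payload, pos)
        pos += 4
        if pos + length > len(payload):
            break  # truncated data or wrong layout guess: stop instead of guessing
        yield tag, payload[pos:pos + length]
        pos += length


if __name__ == "__main__":
    # Invented example: tag 0x1234 with a 2-byte value, tag 0xABCD with a 1-byte value.
    demo = (struct.pack(">HH", 0x1234, 2) + b"\x01\x90"
            + struct.pack(">HH", 0xABCD, 1) + b"\x07")
    for tag, value in iter_local_tags(demo):
        print(f"{tag:#06x} {value.hex()}")
```

If the walker runs off the end of a block or yields implausible lengths, that is a hint that the assumed layout does not apply at that position.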

Then I noticed that videos from the action cam also contain some plain GPS-related text and an additional portion of encoded data. I googled again and found that it looks like the GPS portions of photo EXIF. I decoded them using the standard EXIF GPS definitions...
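In EXIF, a GPS latitude or longitude is stored as three unsigned rationals (degrees, minutes, seconds), each a 32-bit numerator and a 32-bit denominator, plus a reference letter (N/S or E/W). A minimal sketch of turning such a field into decimal degrees follows; the big-endian default and the function name `gps_rationals_to_degrees` are assumptions for illustration, not details taken from the rtmd stream.

```python
import struct
from fractions import Fraction


def gps_rationals_to_degrees(raw: bytes, ref: str, byteorder: str = ">") -> float:
    """Convert 24 bytes of EXIF-style deg/min/sec rationals to decimal degrees.

    byteorder is ">" (big-endian, assumed here) or "<" (little-endian).
    A zero denominator raises ZeroDivisionError rather than being guessed at.
    """
    nums = struct.unpack(byteorder + "6I", raw[:24])
    deg, minutes, seconds = (Fraction(nums[i], nums[i + 1]) for i in (0, 2, 4))
    value = deg + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value  # southern and western hemispheres are negative
    return float(value)


if __name__ == "__main__":
    # 48 degrees, 30 minutes, 0 seconds, packed as three rationals.
    raw = struct.pack(">6I", 48, 1, 30, 1, 0, 1)
    print(gps_rationals_to_degrees(raw, "N"))
```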

Step by step, trial by trial.

Current findings (in the experiments folder): the newest ILCE and RX models have not 1024-byte but 3072-byte blocks of per-frame metadata, and the RX0M2 and RX100M7 keep many new, large portions of data and new tags there. I recognized them as internal IMU (gyroscope, accelerometer) and Optical SteadyShot movement data tables, and now you can extract them into a table and draw very nice charts in Excel :). There are also some encoded tables which look very much like lens distortion parameters, a sensor state table (?) and some other unknown tags. So my conclusion is that this new metadata is used for accurate image stabilization (with geometry corrections) of footage in the Movie Edit add-on app (for RX0M2). It also helps to fix rolling shutter in this app; at least, footage stabilized in the app has almost no rolling shutter effect compared to the unprocessed original.

In the announcement of the new Sony professional FX9 camera there is a note that the camera has an IMU and records movement data for better stabilization in post-production. They promised that the Dec 2019 version of Catalyst Browse will be able to use this metadata. I guess this will be the same format :)