theori-io / nrsc5

NRSC-5 receiver for rtl-sdr

Expose stream & packet data through the API #308

Closed argilo closed 1 year ago

argilo commented 1 year ago

There are three AAS data types used by HD Radio stations: Stream, Packet, and Large Object Transfer (LOT). Of these, only LOT files are currently exposed through the nrsc5 API. There is a bit of code in output.c aimed at decoding Stream data, but it is not functional.

Since there are various Stream data types (HERE TPEG, HERE Images, TTN TPEG) and Packet data types (HD TMC, NavteqPacketData1, NavteqAdmin) in use, I think it would make sense to expose the raw Stream & Packet payloads through the API, and make the consuming application responsible for decoding the data type(s) it cares about.

Here I've added NRSC5_EVENT_STREAM and NRSC5_EVENT_PACKET event types to the API. Each of these reports the 16-bit port number, payload size, 32-bit MIME hash (taken from the corresponding entry in the SIG table), and payload data.

For the moment, the C & Python CLI apps simply report that data is present.

C:

14:50:27 Packet data: port=0403 mime=2D42AC3E size=8
14:50:27 Stream data: port=0404 mime=B7F03DFC size=646
14:50:27 Stream data: port=0402 mime=82F03DFC size=814
14:50:27 Packet data: port=0403 mime=2D42AC3E size=8
14:50:27 Packet data: port=0401 mime=2D42AC3E size=928
14:50:27 Stream data: port=0404 mime=B7F03DFC size=647

Python:

14:51:13 Packet data: port=0403 mime=MIMEType.NAVTEQ size=8
14:51:13 Stream data: port=0404 mime=MIMEType.HERE_IMAGE size=646
14:51:13 Stream data: port=0402 mime=MIMEType.HERE_TPEG size=814
14:51:13 Packet data: port=0403 mime=MIMEType.NAVTEQ size=8
14:51:13 Packet data: port=0401 mime=MIMEType.NAVTEQ size=928
14:51:13 Stream data: port=0404 mime=MIMEType.HERE_IMAGE size=647
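
A minimal sketch of how a consuming application might route these raw payloads to its own decoders. It does not use the nrsc5 bindings at all: `PayloadRouter` and its methods are hypothetical names, and the hash-to-name pairing is only inferred by matching port numbers between the C and Python logs above.

```python
# Hash values copied from the C log; names copied from the Python log.
# The pairing is inferred from matching port numbers, not taken from a spec.
MIME_NAMES = {
    0x2D42AC3E: "NAVTEQ",
    0xB7F03DFC: "HERE_IMAGE",
    0x82F03DFC: "HERE_TPEG",
}


class PayloadRouter:
    """Routes raw Stream/Packet payloads to per-type decoders (hypothetical helper)."""

    def __init__(self):
        self.handlers = {}   # MIME hash -> callable(port, data)
        self.unhandled = 0   # payloads seen with no registered decoder

    def register(self, mime_hash, handler):
        self.handlers[mime_hash] = handler

    def dispatch(self, port, mime_hash, data):
        """Call the decoder for this MIME hash; return False if none is registered."""
        handler = self.handlers.get(mime_hash)
        if handler is None:
            self.unhandled += 1
            return False
        handler(port, data)
        return True
```

An application would call `dispatch()` from its event callback with the port, MIME hash, and payload reported by the event, and register decoders only for the data types it cares about.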
argilo commented 1 year ago

Here's some sample code that demonstrates how to extract images from the "HERE Images" stream using the API:

diff --git a/support/cli.py b/support/cli.py
index ca1322f..17559e9 100755
--- a/support/cli.py
+++ b/support/cli.py
@@ -4,6 +4,7 @@ import argparse
 import logging
 import os
 import queue
+import struct
 import sys
 import threading
 import wave
@@ -23,6 +24,7 @@ class NRSC5CLI:
         self.iq_output = None
         self.wav_output = None
         self.hdc_output = None
+        self.here_image_data = bytes()

     def parse_args(self):
         parser = argparse.ArgumentParser(description="Receive NRSC-5 signals.")
@@ -177,6 +179,45 @@ class NRSC5CLI:
             0xfc
         ])

+    def parse_here_image(self, data):
+        header = bytes([0xff, 0xf7, 0xff, 0xf7])
+        self.here_image_data = self.here_image_data + data
+        while True:
+            index = self.here_image_data.find(header)
+            if index == -1:
+                break
+            self.here_image_data = self.here_image_data[index:]
+            if len(self.here_image_data) < 6:
+                break
+
+            offset = 4
+
+            payload_length = struct.unpack(">H", self.here_image_data[offset:offset + 2])[0]
+            offset += 2
+            if len(self.here_image_data) < 8 + payload_length:
+                break
+
+            # unknown data
+            offset += 27
+
+            filename_len = self.here_image_data[offset]
+            offset += 1
+
+            filename = self.here_image_data[offset:offset + filename_len].decode()
+            offset += filename_len
+
+            # unknown data
+            offset += 4
+
+            file_length = struct.unpack(">H", self.here_image_data[offset:offset + 2])[0]
+            offset += 2
+
+            with open(filename, "wb") as f:
+                f.write(self.here_image_data[offset:offset + file_length])
+
+            self.here_image_data = self.here_image_data[8 + payload_length:]
+
+
     def callback(self, evt_type, evt):
         if evt_type == nrsc5.EventType.LOST_DEVICE:
             logging.info("Lost device")
@@ -236,6 +277,8 @@ class NRSC5CLI:
         elif evt_type == nrsc5.EventType.STREAM:
             logging.info("Stream data: port=%04X mime=%s size=%s",
                          evt.port, evt.mime, len(evt.data))
+            if evt.mime == nrsc5.MIMEType.HERE_IMAGE:
+                self.parse_here_image(evt.data)
         elif evt_type == nrsc5.EventType.PACKET:
             logging.info("Packet data: port=%04X mime=%s size=%s",
                          evt.port, evt.mime, len(evt.data))

This particular stream seems to have the same sorts of traffic tiles & weather radar overlays that would be found in the "TTN STM" LOT files on other stations.
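
For experimenting away from the CLI, the same framing can be sketched as a standalone function that returns the files instead of writing them. The layout is taken from the diff above. The function name, the length guards, and the use of `os.path.basename` (so that a filename received over the air cannot point outside the current directory) are additions for illustration, not part of the original patch.

```python
import os
import struct

HEADER = bytes([0xFF, 0xF7, 0xFF, 0xF7])
# 27 unknown bytes + filename length + 4 unknown bytes + file length
MIN_PAYLOAD = 27 + 1 + 4 + 2


def extract_here_files(buffer):
    """Return (files, remaining_buffer); files is a list of (name, data) tuples."""
    files = []
    while True:
        index = buffer.find(HEADER)
        if index == -1:
            break
        buffer = buffer[index:]
        if len(buffer) < 6:
            break
        payload_length = struct.unpack(">H", buffer[4:6])[0]
        if payload_length < MIN_PAYLOAD:
            buffer = buffer[len(HEADER):]   # implausible length: resync on next header
            continue
        frame_length = 8 + payload_length
        if len(buffer) < frame_length:
            break                           # wait for more stream data
        offset = 6 + 27                     # skip header, length, 27 unknown bytes
        filename_len = buffer[offset]
        offset += 1
        raw_name = buffer[offset:offset + filename_len].decode(errors="replace")
        offset += filename_len + 4          # skip filename and 4 unknown bytes
        if offset + 2 > frame_length:
            buffer = buffer[len(HEADER):]   # malformed frame: resync
            continue
        file_length = struct.unpack(">H", buffer[offset:offset + 2])[0]
        offset += 2
        name = os.path.basename(raw_name)   # drop any directory components
        files.append((name, buffer[offset:offset + file_length]))
        buffer = buffer[frame_length:]
    return files, buffer
```

The caller keeps the returned remainder and prepends it to the next chunk of stream data, which mirrors what `self.here_image_data` does in the patch.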

markjfine commented 1 year ago

Almost exactly like the iHeart data. This is awesome.

argilo commented 1 year ago

Yep, it looks very similar, the data is just in a different place.

In my collection of recordings, I found seven stations that were doing it this way.

markjfine commented 1 year ago

I got this from WKYS in DC: [image]

These don't have the timestamps embedded in the filenames, just a 4-character code after the row/column (this one's 6hvu), unless that info is inserted from one of the control feeds. Otherwise there's no way to know if all 9 parts are from the same image.

Any consideration to have trafficMap.png and WeatherImage.png auto-dumped using a command switch like with --dump-aas-files?

argilo commented 1 year ago

> Otherwise there's no way to know if all 9 parts are from the same image.

Some of the "# unknown data" that my crude parser is skipping over appears to be sequencing information.

> Any consideration to have trafficMap.png and WeatherImage.png auto-dumped using a command switch like with --dump-aas-files?

Yes, I had that idea as well. I might do that once we understand the stream data a bit better.

markjfine commented 1 year ago

Thought about this a bit after looking at Navteq/HERE behaviour. If the common timestamp isn't there, one way to try to ensure the image pieces are correctly synced is to start accumulating parts from 0_0, and ignore anything started in the middle of the cycle.

Of course if you miss one piece out of the 9 due to a dropout or some other error, that'd have to be accounted for as well. The whole process takes about 5' to get all 9 pieces so it'd be a shame (and very inefficient) to throw away the whole thing just because one piece errored out.
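
A rough sketch of the "start at 0_0" idea. It assumes a 3 x 3 grid (nine parts, with tile 0_0 sent first in each cycle) and takes the row and column as already-parsed integers, since the exact filename format is not pinned down here. `TileAssembler` is a hypothetical name.

```python
class TileAssembler:
    """Collects one full cycle of tiles, starting at tile (0, 0)."""

    def __init__(self, rows=3, cols=3):
        self.rows = rows
        self.cols = cols
        self.tiles = None   # None until a cycle start (0, 0) has been seen

    def add(self, row, col, data):
        """Add a tile; return the dict of all tiles once the cycle is complete."""
        if (row, col) == (0, 0):
            self.tiles = {}     # new cycle begins; any partial cycle is dropped
        if self.tiles is None:
            return None         # joined mid-cycle: ignore until the next 0_0
        self.tiles[(row, col)] = data
        if len(self.tiles) == self.rows * self.cols:
            complete, self.tiles = self.tiles, None
            return complete
        return None
```

This simple version still throws away a cycle when one piece is lost, which is exactly the inefficiency described above. Handling that case would need some per-image identifier to tell old tiles from new ones.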

markjfine commented 1 year ago

Similar to traffic, the weather overlay is missing the applicable grid and scale info (03ggta is Wash DC for iHeart). This metadata may also be included within the 'unknown data'. Notwithstanding that, I was able to artificially construct the following by forcing the map_id in nrsc5-gui: [image]

It doesn't seem to correlate to anything in my weather app, which shows no precip or clouds in this area. Also, disregard the timestamp, which I think went 4h in the wrong direction. 😂

markjfine commented 1 year ago

After comparing the pattern of afternoon squall cells with The Weather Channel app, I'm going to go out on a limb and say the 600 x 600 weather overlay is meant to go on top of something akin to the traffic map: [image]

markjfine commented 1 year ago

Took an initial stab at trying to ID parts of the unknown data sections. The first section is 27 bytes long:

1 - Serial ID number. Seems to increase by 1 for each image instance.
4 - Tile number in a multi-tiled image (traffic only)
6 - # tiles in a multi-tiled image (traffic only)
10-13 - Appears to be a 32-bit timestamp. Difference between each full traffic or weather image is around 300 decimal seconds (\x012c), exactly 5 minutes.
16-27 - Appears to be the lower-left and upper-right coordinates of each tile (traffic) or image (weather). If this is truly the case, judging from the data I collected, the weather overlay does not exactly match the full traffic image, so a background image would need to be created. The only thing that bothers me is that each coordinate appears to be only 24 bits. Not exactly sure how that plays.

The second section is only 4 bytes long, but they are consistent. Traffic tiles are always \x00\x63\x00\x00, and weather overlays are always \x04\x21\x00\x00. Perhaps the first two bytes help identify what it is?

I've attached the raw data for each 27-byte and 4-byte section from three sets of traffic images and three weather images. Unknown data.txt
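
If the first two bytes of the 4-byte section really do identify the image type, a check could look like the sketch below. The two marker values are the ones observed above. The function name is hypothetical, and the meaning of the bytes is still a guess.

```python
# First two bytes of the 4-byte section, as observed in the captured data.
TRAFFIC_MARKER = bytes([0x00, 0x63])
WEATHER_MARKER = bytes([0x04, 0x21])


def classify_here_image(section):
    """Guess the image kind from the 4-byte section that precedes the file length."""
    if len(section) != 4:
        raise ValueError("expected exactly 4 bytes")
    prefix = section[:2]
    if prefix == TRAFFIC_MARKER:
        return "traffic"
    if prefix == WEATHER_MARKER:
        return "weather"
    return "unknown"
```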

argilo commented 1 year ago

I did some analysis as well. Comments below:

> 1 - Serial ID number. Seems to increase by 1 for each image instance.

It seems like the first nibble is always d for weather and 8 for traffic. The second nibble does appear to be a counter which goes from 1 to f (and skips 0?).

> 4 - Tile number in a multi-tiled image (traffic only)
> 6 - # tiles in a multi-tiled image (traffic only)

For weather, these seem to be counters that go up by one each time the image changes.

> 10-13 - Appears to be a 32-bit timestamp.

Yep, looks like a standard Unix timestamp (seconds since January 1, 1970 UTC).

> 16-27 - Appears to be the lower-left and upper-right coordinates of each tile (traffic) or image (weather).

Yeah, 15-27 appear to be coordinates. I think the bits are split out like so:

first bit - sign of latitude
next 25 bits - latitude * 100000
next bit - sign of longitude
next 25 bits - longitude * 100000

And then another set of 52 bits with the second set of coordinates.

For weather it does appear to be the upper left and lower right corners that are encoded. For traffic, the coordinates are weird and seem to extend beyond where the tiles actually are. (Luckily, knowing the coordinates of the traffic tiles isn't so important.)
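
Putting these observations together, the 27-byte section could be decoded roughly as follows. This is a sketch of the guessed layout, not a specification: the function names are hypothetical, byte positions are the 1-indexed ones used in this thread, and a set sign bit is assumed to mean a negative value (south or west).

```python
from datetime import datetime, timezone


def decode_coord_pairs(coord_bytes):
    """Decode bytes 15-27 (13 bytes) into two (lat, lon) pairs."""
    if len(coord_bytes) != 13:
        raise ValueError("expected 13 bytes (104 bits)")
    bits = int.from_bytes(coord_bytes, "big")
    values = []
    for i in range(4):
        shift = 104 - 26 * (i + 1)
        field = (bits >> shift) & 0x3FFFFFF        # 1 sign bit + 25 magnitude bits
        magnitude = (field & 0x1FFFFFF) / 100000   # degrees * 100000
        values.append(-magnitude if field & 0x2000000 else magnitude)
    return (values[0], values[1]), (values[2], values[3])


def decode_here_metadata(section):
    """section: the 27 bytes that the parser skips as 'unknown data'."""
    if len(section) != 27:
        raise ValueError("expected 27 bytes")
    seconds = int.from_bytes(section[9:13], "big")  # bytes 10-13, Unix time
    return {
        "kind_nibble": section[0] >> 4,    # observed: 0xd weather, 0x8 traffic
        "counter": section[0] & 0x0F,      # observed to run from 1 to f
        "timestamp": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "corners": decode_coord_pairs(section[14:27]),  # bytes 15-27
    }
```

Treating the 13 coordinate bytes as one 104-bit integer avoids the per-field byte offsets and shifting masks that a 32-bit-window approach needs.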

argilo commented 1 year ago

Strangely, in my old recordings (2016-2017), the coordinates seem to be 26 bits, latitude * 200000 and longitude * 200000, with no sign bit. Maybe it was decided later that there should be a sign bit so that the other three quarters of the earth can be covered. :-)
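
For those older recordings, a decoder under the same packing assumption (four fields back to back in 13 bytes) might look like this. The field order and the function name are assumptions. Only the 26-bit width, the 200000 scale factor, and the missing sign bit come from the observation above.

```python
def decode_coords_legacy(coord_bytes):
    """Decode four unsigned 26-bit fields, each scaled by 200000."""
    if len(coord_bytes) != 13:
        raise ValueError("expected 13 bytes (104 bits)")
    bits = int.from_bytes(coord_bytes, "big")
    # Fields are read most-significant first; each is an unsigned magnitude.
    return [((bits >> (104 - 26 * (i + 1))) & 0x3FFFFFF) / 200000
            for i in range(4)]
```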

argilo commented 1 year ago

Given what we now know, it should be possible to update nrsc5-gui to decode & display the HERE Images stream.

markjfine commented 1 year ago

Interesting. I wondered if the 15th byte played a part, because it would make sense that the tiles would go from 0x0ef7 to 0x0f0c. Guess I was looking for symmetry.

markjfine commented 1 year ago

> Given what we now know, it should be possible to update nrsc5-gui to decode & display the HERE Images stream.

That's what I've been working with. I modified my v2 of -gui to generate the sample images above.

argilo commented 1 year ago

A few more observations:

markjfine commented 1 year ago

Having just worked out the bits, it would appear that the corners in my overlay would be 39.41620N, 77.45359W and 38.59220N, 76.62960W. So yes... makes sense - upper left and lower right. We should have Navteq/HERE's specs reverse engineered in no time.😂

markjfine commented 1 year ago

Let -gui create a background map using the weather coordinates... Not nearly as broad a swath as iHeart's, but it's something. OpenMap creates a map that's 150x193 that a massive 600x600 overlay gets scaled onto: [image]

The scaled down timestamp in the corner is barely readable.😂

Worthwhile to note that although the weather seems to generate a new overlay/timestamp every 5', the actual overlay only changes on every other image... so really only 10' updates.

markjfine commented 1 year ago

Just in case anyone needs it, here is some Python code to pull the timestamp and map coordinates out of the received file:

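    def get_here_image_timestamp(self):
        # 4 header bytes + 2 length bytes + 9 bytes into the 27-byte section
        offset = 15
        timestamp = struct.unpack(">L", self.here_image_data[offset:offset + 4])[0]
        return timestamp

    def get_here_image_bounds(self):
        # Four 26-bit fields (1 sign bit + 25 magnitude bits) packed back to back,
        # starting at byte 15 of the 27-byte section (buffer offset 20).
        map_coords = [0.0, 0.0, 0.0, 0.0]
        offset = 20
        mask1 = 0x7fffffc0  # 25 magnitude bits within the 32-bit window
        mask2 = 0x80        # sign bit within the first byte of the window
        shift = 6
        for i in range(4):
            bound = struct.unpack(">L", self.here_image_data[offset:offset + 4])[0]
            map_coords[i] = ((bound & mask1) >> shift) / 100000
            if self.here_image_data[offset] & mask2:
                map_coords[i] = -map_coords[i]
            # Each field is 26 bits, so the next one starts 3 bytes + 2 bits later.
            offset += 3
            mask1 >>= 2
            mask2 >>= 2
            shift -= 2
        return map_coords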

My original intent was to call these individually within process_traffic_map and process_weather_overlay, but that's rather problematic: they work off the here_image_data buffer, which may already contain the next received traffic or weather image, so the results would be erratic. They really should be called within parse_here_image, with the results parked in instance attributes, like so:

            # unknown data
            offset += 27

            self.last_here_timestamp = self.get_here_image_timestamp()
            self.last_here_bounds = self.get_here_image_bounds()

            filename_len = self.here_image_data[offset]
            offset += 1

Then, last_here_timestamp and last_here_bounds can be used within the traffic and weather image processing sections referenced above.