Full local control of camera possible with cloud disabled.

74ls04 commented 2 years ago

While troubleshooting some RTSP streaming issues I got majorly sidetracked and ended up looking more into the camera in general. Here are a few things I discovered:

The camera uses the ThroughTek (TUTK) P2P protocol. This DEFCON talk should tell you all you need to know about it. I was able to parse the traffic on my local network and see the packet contents. I was also able to do a UDP probe on my local network and find all my cameras and their UIDs. The binary p2p_tnp handles the P2P protocol including streaming the video.
It is possible to do almost everything the app can do without the cloud features enabled fairly easily. I was able to find the partial sources for p2p_tnp, cloudAPI, and watch_process. These sources are for a later version of the camera but the code is compatible.

The following control functions are available (this list is incomplete) and they just send IPC commands to the dispatcher program:

set_voice_ctrl()
set_viewpoint_trace()
set_video_backup_state()
set_tnp_work_mode()
set_tnp_init_status()
set_tnp_connect_success()
set_tnp_check_login_success()
set_tnp_check_login_fail()
set_pwd_used_cnt()
set_power()
set_panorama_capture_state()
set_motion_record()
set_motion_detect()
set_mirror_flip()
set_mic_volume()
set_light()
set_ldc()
set_lapse_video()
set_immediate_bitrate()
set_high_resolution()
set_encode_mode()
set_debug_info()
set_day_night_mode()
set_baby_cry()
set_audio_mode()
set_alarm_sensitivity()
set_alarm_mode()
set_abnormal_sound_sensitivity()
set_abnormal_sound()

I created this function in ipc_cmd.c

typedef struct
{
    int dst_id;
    int src_id;
    short main_op;
    short sub_op;
    int msg_len;
} mqueue_msg_header_t;

int send_rmm_msg(mqd_t mqfd, int msg_type, char *payload, int payload_len)
{
    mqueue_msg_header_t msg;
    char send_buf[1024] = {0};
    int send_len = 0;

    // Validate before copying anything so send_buf cannot overflow
    if (mqfd == (mqd_t)-1 || payload_len < 0 || payload_len > 512 - (int)sizeof(msg))
    {
        fprintf(stderr, "Invalid mqfd or payload_len\n");
        return -1;
    }

    memset(&msg, 0, sizeof(msg));
    msg.dst_id = 1; // mid = 1, rmm = 2, cloud = 4
    msg.src_id = 8;
    msg.main_op = msg_type;
    msg.sub_op = 1;
    msg.msg_len = payload_len;

    memcpy(send_buf, &msg, sizeof(msg));

    if ((NULL != payload) && (payload_len > 0))
    {
        memcpy(send_buf + sizeof(msg), payload, payload_len);
    }

    send_len = sizeof(msg) + payload_len;

    // Propagate the mq_send() result: 0 on success, -1 on error
    return mq_send(mqfd, send_buf, send_len, 1);
}

and was able to successfully toggle the LED on and off

    if (led == LED_OFF)
    {
        send_rmm_msg(ipc_mq, 0x77, NULL, 0);
    }
    else if (led == LED_ON)
    {
        send_rmm_msg(ipc_mq, 0x76, NULL, 0);
    }
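
For anyone who wants to inspect these messages off the camera, here is a minimal Python sketch of the same layout. It assumes a little-endian target and no padding in mqueue_msg_header_t (the five fields add up to 16 bytes). build_ipc_msg is a hypothetical helper name, not something from the camera sources, and the sketch only builds the bytes; it does not open the message queue.

```python
import struct

# Mirrors mqueue_msg_header_t: int dst_id, int src_id, short main_op,
# short sub_op, int msg_len. Little-endian and unpadded are assumptions.
HEADER_FORMAT = "<iihhi"


def build_ipc_msg(dst_id, src_id, main_op, sub_op=1, payload=b""):
    """Return header + payload, i.e. the bytes that would be handed to mq_send()."""
    header = struct.pack(HEADER_FORMAT, dst_id, src_id, main_op, sub_op, len(payload))
    return header + payload


# Same field values send_rmm_msg() fills in for the two LED commands
led_on = build_ipc_msg(dst_id=1, src_id=8, main_op=0x76)
led_off = build_ipc_msg(dst_id=1, src_id=8, main_op=0x77)

print(led_on.hex(" "))
print(led_off.hex(" "))
```

Comparing the printed bytes with what the C function puts in send_buf is a quick way to check the struct layout assumption.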

The relevant get_() functions can be created by reading the values from /tmp/mmap.info. This is the single most important file on the camera: it stores all the camera settings and secrets, and the other processes read their configuration from it.

Here is a partial list of the values stored in mmap.info extracted from the p2p_tnp source. It's just a matter of comparing the location of the string to the decompiled binary to get the offsets. I've been able to do this for a few and can verify that they are indeed correct. Unfortunately I don't have a lot of extra time right now to do the full mapping.

abnormal_sound_enable
abnormal_sound_sensitivity
aec_key
alarm
alarm_day_ctx.day_count
alarm_day_ctx.time[i]
alarm_event_ctx.arr[i].duration
alarm_event_ctx.arr[i].start_time
alarm_event_ctx.arr[i].type
alarm_event_ctx.num
ap_enable
ap_tnp_did
api_server
auto_ota_enable
baby_cry_enable
check_net_disconnected
cloud_storage_enable
day_night_mode
debug_mode
did
dlproto
encode_mode
high_resolution
human_face_enable
human_motion_enable
hw_ver.hor
hw_ver.ptz_func
hw_ver.ver
hw_ver.white_led_close
in_packet_loss
init_finish
irlight_mode
is_sd_exist
is_xiaomirouter
key
language
lapse_video_enable
lapse_video_end_time
ldc_percent
light_mode
mac
mic_audio_enable
mic_volume
mirror
motion_detection_enable
motion_rect.bottom
motion_rect.left
motion_rect.mode
motion_rect.resolution
motion_rect.right
motion_rect.top
motion_sensitivity
out_packet_loss
p2p_viewing_cnt
p2pid
panorama_capture_count
panorama_capture_state
power_mode
pre_pwd
ptz_cruise_end_time
ptz_cruise_flag
ptz_cruise_mode
ptz_cruise_start_time
ptz_info[i].preset_enable
ptz_info[preset_id
ptz_motion_track_switch
ptz_panoramic_sleep
ptz_sleep
ptz_y_angle
pwd
record_mode
refresh_ping
region_id
sd_leftsize
sd_size
speak_mode
start_with_reset
systick
tf_status.stat
time_sync
tmp_pwd
tnp_info.tnp_did
tnp_info.tnp_init_string
tnp_info.tnp_license
ts
version
video_backup_info.backup_period
video_backup_info.enable
video_backup_info.extra_sd_cam_used_size
video_backup_info.extra_sd_free_size
video_backup_info.extra_sd_total_size
video_backup_info.resolution
video_backup_info.router_sd_cam_used_size
video_backup_info.router_sd_free_size
video_backup_info.router_sd_total_size
video_backup_info.user_path
video_occlusion
viewpoint_trace
voice_ctrl
water_mark_enable
white_led_mode
white_light_alarm_time
wifi_connected
wifi_mode

With all this info we should be able to have all the functionality available on the cloud app within the local web interface.

Oh yeah, the camera also definitely uploads logs to China. That, combined with the P2P vulnerabilities, is a good reason to disable the cloud features.

roleoroleo commented 2 years ago

This could be a new start, especially if we are able to intercept the stream. Thanks for sharing it.

roleoroleo commented 2 years ago

Some defines:

#define DISPATCH_SET_POWER_ON 0x74
#define DISPATCH_SET_POWER_OFF 0x75
#define DISPATCH_SET_LIGHT_ON 0x76
#define DISPATCH_SET_LIGHT_OFF 0x77
#define DISPATCH_SET_MOTION_RCD 0x78
#define DISPATCH_SET_ALWAYS_RCD 0x79
#define DISPATCH_SET_MIRROR_ON 0x7a
#define DISPATCH_SET_MIRROR_OFF 0x7b
#define DISPATCH_SET_TNP_INIT_STATUS 0x7f
#define DISPATCH_P2P_CONNECTTED 0xe1
#define DISPATCH_P2P_DISCONNECTTED 0xe2
#define DISPATCH_P2P_VIEWING 0xe3
#define DISPATCH_P2P_STOP_VIEWING 0xe4
#define DISPATCH_P2P_CLR_VIEWING 0xe5
#define DISPATCH_SET_TNP_WORK_MODE 0x80
#define DISPATCH_SYNC_INFO_FROM_SERVER 0x89
#define RMM_SET_MOTION_DETECT 0x1023
#define RMM_SET_DAY_NIGHT_MODE 0x1024
#define RMM_SET_MOTION_SENSITIVITY 0x1027
#define RMM_SET_LDC 0x1028
#define RMM_SET_BABY_CRY 0x1029
#define RMM_SET_MIC_VOLUME 0x102a
#define RMM_SET_ENCODE_MODE 0x1032
#define RMM_SET_HIGH_RESOLUTION 0x1033

roleoroleo commented 2 years ago

I tried to send RMM_SET_ENCODE_MODE and it works:

value 1 = avc
value 2 = hevc

74ls04 commented 2 years ago

That's great, thanks! I just started mapping out mmap.info -- I copied it over to my dev machine and created a simple Python script to manually check the offsets.

import mmap

# mmap.info is a binary file, so open it in binary mode
fd = open("./mmap.info", mode="rb")

mmap_pointer = mmap.mmap(fd.fileno(), length=0, access=mmap.ACCESS_READ)

def read_offset(offset, size, is_int=False):
    mmap_pointer.seek(offset)
    val = mmap_pointer.read(size)

    if is_int:
        return int.from_bytes(val, "little")
    else:
        # Replace undecodable bytes instead of raising on non-UTF-8 data
        return val.decode(errors="replace")

                            # offset, size in bytes, is_int
MMAP_UUID =                 (0x230, 20, False)
MMAP_PASSWORD =             (0x50C, 15, False)  # Add '0' to get 16 character AES encryption key
MMAP_PROTOCOL =             (0x1CC, 4, False)
MMAP_ENCODE_TYPE =          (0x4E4, 1, True)
MMAP_KEY =                  (0x290, 16, False) # Used for p2p comms I believe
MMAP_DEBUG_MODE =           (0x86C, 1, True)
MMAP_POWER =                (0x5e0, 1, True) # ??
MMAP_IN_PACKET_LOSS =       (0x570, 4, True)
MMAP_OUT_PACKET_LOSS =      (0x574, 4, True)
MMAP_ABNORMAL_SOUND_EN =    (0x4d0, 1, True)
# MMAP_ABNORMAL_SOUND_SENS =  (0x4d4, 4, True) # Bytes?
# MMAP_BABY_CRY_EN =          (0x4c8, 1, True)

# MMAP_TEST =                 (0x580, 4, True)
# print("Testing...: ", read_offset(*MMAP_TEST))
# print(" ")

# Each tuple unpacks directly into read_offset(offset, size, is_int)
print("UUID: ", read_offset(*MMAP_UUID))
print("Password: ", read_offset(*MMAP_PASSWORD))
print("Protocol: ", read_offset(*MMAP_PROTOCOL))
print("Encode type: ", read_offset(*MMAP_ENCODE_TYPE))
print("Key: ", read_offset(*MMAP_KEY))
print("Debug mode: ", read_offset(*MMAP_DEBUG_MODE))
print("Power off?: ", read_offset(*MMAP_POWER))
print("In packet loss: ", read_offset(*MMAP_IN_PACKET_LOSS))
print("Out packet loss: ", read_offset(*MMAP_OUT_PACKET_LOSS))
print("Abnormal sound enable: ", read_offset(*MMAP_ABNORMAL_SOUND_EN))
# print("Abnormal sound sensitivity: ", read_offset(*MMAP_ABNORMAL_SOUND_SENS))
# print("Baby cry enable: ", read_offset(*MMAP_BABY_CRY_EN))

So far so good! I'll keep updating this.

Also I found the source for an older version of the app, and between that and Ghidra I've figured out how to identify the video stream network traffic and the meanings of the packet headers (i-frame vs p-frame). It generally sends an i-frame at the very beginning and then p-frames for the rest of the stream. Only the i-frame is encrypted, but I know how to get the key from the camera (it changes whenever p2p_tnp starts, I believe). I think I was able to decrypt one, but I have no way of verifying it since I suck at h264 stuff and I would need to reassemble the packets. I'll post some code later but it's fairly low priority for me right now. Also I think I should be able to figure out how to get it to start streaming the video without a client from the app.

roleoroleo commented 2 years ago

Good news! If we could get the stream with this method, the quality would probably be better.

roleoroleo commented 2 years ago

There are several possible approaches to obtain the stream using the code you found.

1 - Create a client app that communicates with p2p_tnp like the Android app.
2 - Read fshare_frame_buf using the same method used by p2p_tnp.
3 - ...

I tried to follow the 2nd way but I stopped because the interesting functions (fshare_set_read_pos, fshare_must_read, etc...) are not contained in p2p_tnp.c. They are probably contained inside a library (see -lframeshare in the Makefile). This is different in our cam, where the library is statically linked.

EDIT

About the 1st way, I can't find the SDK (PPPP_api SDK). Only libraries are available: static, dynamic, macos, raspberry... But no sources.

74ls04 commented 2 years ago

I made good progress into figuring out those fshare_ functions by using the variable names from p2p_tnp.c and then following them through but it's hard without knowing all the data structures. Option 1 might be the best for a high quality stream.

I also couldn't find the PPPP_API SDK, I don't think it's anywhere on the internet!

So I think I've finally decrypted the stream... Can you try to parse this and tell me if it's a valid start to an i-frame? (edited to add some missing leading zeros at the beginning) 00000001674d0014965405017bcb37010101020000000168ee3c80000000016588804000

roleoroleo commented 2 years ago

SPS (67) and PPS (68) are ok. IDR (65) starts correctly but it's truncated. 640x360
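
The check above can be reproduced with a few lines of Python. This is only a sketch: it splits the posted hex on Annex B start codes (00 00 01 or 00 00 00 01) and reads the NAL unit type from the low five bits of the first byte. It does not decode the SPS, so it cannot confirm the 640x360 resolution. split_nal_units is a hypothetical helper name.

```python
import re

# H.264 NAL unit types relevant to the sample
NAL_NAMES = {5: "IDR slice", 7: "SPS", 8: "PPS"}


def split_nal_units(stream):
    """Split an Annex B byte stream on 3- or 4-byte start codes."""
    parts = re.split(rb"\x00\x00+\x01", stream)
    return [p for p in parts if p]


sample = bytes.fromhex(
    "00000001674d0014965405017bcb3701010102"
    "0000000168ee3c80"
    "000000016588804000"
)

nal_units = split_nal_units(sample)
nal_types = [nal[0] & 0x1F for nal in nal_units]

for nal, nal_type in zip(nal_units, nal_types):
    name = NAL_NAMES.get(nal_type, "other")
    print(f"type {nal_type} ({name}), {len(nal)} bytes")
```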

74ls04 commented 2 years ago

Great! Glad the decryption works. I think I'll hold off on posting the steps until we've figured out whether we can get the stream going without the app, or until we can't improve the frame_buf algorithm anymore, just so the decryption knowledge isn't needlessly in the wild if we don't end up using it. Let me know if there are any specifics that would help improve the frame_buf algorithm -- I've now become quite familiar with the disassembled code but I can only go so far since I don't know the technical aspects of the video stuff well.

Regarding the mmap.info work, I've figured out most of the useful offsets and also started porting the corresponding "p2p_set" functions. These functions send the commands then read mmap.info to verify that the change happened. This code should get us most of the way there but I haven't tested it because it's not yet done.

What needs to be done is replacing all the calls to "g_p2ptnp_info.mmap_info->" with a function call that includes the proper offset and number of bytes to read. For the strings I have the number of bytes in inline comments but for the rest it can be easily worked out from the offset values, especially if it's just an on/off flag. I've also included functions that load mmap.info and another that can read bytes from it (in theory, I haven't tested it).

Hopefully this is all helpful. I won't be able to work on this for a bit due to other commitments but I'll be keeping an eye on it.

cmd_srv.c.txt cmd_srv.h.txt

roleoroleo commented 2 years ago

> Great! Glad the decryption works. I think I'll hold off on posting the steps until we've figured out if we can get the stream going without the app or can't improve the frame_buf algorithm anymore just so the decryption knowledge isn't needlessly in the wild if we don't end up using it. Let me know if there are any specifics that would be helpful at improving the frame_buf algorithm -- I've now become quite familiar with the disassembled code but I can only go so far since I don't know the technical aspects of the video stuff well.

I have no ideas on how to improve decoding of the fshare_frame_buf file. I realize that the overall quality of the RTSP stream can be improved, but it is not clear to me where the problem is. If I extract the frames from the buffer and save them to a file, I get no errors.

> Regarding the mmap.info work, I've figured out most of the useful offsets and also started porting the corresponding "p2p_set" functions. These functions send the commands then read mmap.info to verify that the change happened. This code should get us most of the way there but I haven't tested it because it's not yet done.

I also tried to reverse the file and rewrote some code. We should check if the file is the same for all models.

> What needs to be done is replacing all the calls to "g_p2ptnp_info.mmap_info->" with a function call that includes the proper offset and number of bytes to read. For the strings I have the number of bytes in inline comments but for the rest it can be easily worked out from the offset values, especially if it's just an on/off flag. I've also included functions that load mmap.info and another that can read bytes from it (in theory, I haven't tested it).

> Hopefully this is all helpful. I won't be able to work on this for a bit due to other commitments but I'll be keeping an eye on it.

I hope to have some time to work on it.

roleoroleo commented 2 years ago

Do you know what's the meaning of set_motion_detect() and set_high_resolution()?

roleoroleo commented 2 years ago

My first results of this study.

1 - 02 00 00 00 08 00 00 00 07 10 01 00 0C 00 00 00 2F 74 6D 70 2F 63 61 70 2E 6A 70 67

It takes a snapshot in low res.

2 - 10 00 00 00 02 00 00 00 00 20 01 00 44 00 00 00 2F 74 6D 70 2F 63 61 70 2E 6D 70 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 AB 59 54 62

It starts a 6s video capture. I don't know the meaning of the timestamp at the end of the message. Changing 00 20 to 01 20 at offset 8 makes the video last 10s.
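
Both messages line up with the mqueue_msg_header_t layout posted earlier in the thread. The Python sketch below decodes them under the assumption that the header is little-endian and unpadded. parse_ipc_msg is a hypothetical helper, and the run of zero bytes in message 2 is written as a repeat count chosen so that the payload totals the 0x44 bytes stated in its header.

```python
import struct

HEADER_FORMAT = "<iihhi"  # dst_id, src_id, main_op, sub_op, msg_len (assumed layout)
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def parse_ipc_msg(raw):
    """Split a raw message into its header fields and payload bytes."""
    dst_id, src_id, main_op, sub_op, msg_len = struct.unpack_from(HEADER_FORMAT, raw)
    return {
        "dst_id": dst_id,
        "src_id": src_id,
        "main_op": main_op,
        "sub_op": sub_op,
        "msg_len": msg_len,
        "payload": raw[HEADER_SIZE:HEADER_SIZE + msg_len],
    }


snapshot = parse_ipc_msg(bytes.fromhex(
    "02 00 00 00 08 00 00 00 07 10 01 00 0C 00 00 00"
    "2F 74 6D 70 2F 63 61 70 2E 6A 70 67"
))
video = parse_ipc_msg(bytes.fromhex(
    "10 00 00 00 02 00 00 00 00 20 01 00 44 00 00 00"
    "2F 74 6D 70 2F 63 61 70 2E 6D 70 34"
    + "00" * 52          # zero padding after the path
    + "AB 59 54 62"      # trailing 4 bytes, meaning unconfirmed
))

for msg in (snapshot, video):
    path = msg["payload"].split(b"\x00")[0].decode()
    print(f"dst=0x{msg['dst_id']:x} src=0x{msg['src_id']:x} "
          f"main_op=0x{msg['main_op']:04x} len={msg['msg_len']} path={path}")
```

The main_op field sits at byte offset 8, which matches the observation that changing the bytes at that offset changes the clip length.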

74ls04 commented 2 years ago

> Do you know what's the meaning of set_motion_detect() and set_high_resolution()?

set_motion_detect() sets the rectangular motion detection mask. I attempted to recreate the struct. My current assumption based on variable names is:

left = top left x
top = top left y
right = bottom right x
bottom = bottom right y

I haven't had a chance to check with the app. I think set_high_resolution() is for changing the resolution of the streaming video.

74ls04 commented 2 years ago

> My first results of this study.
>
> 1 - 02 00 00 00 08 00 00 00 07 10 01 00 0C 00 00 00 2F 74 6D 70 2F 63 61 70 2E 6A 70 67
>
> It takes a snapshot in low res
>
> 2 - 10 00 00 00 02 00 00 00 00 20 01 00 44 00 00 00 2F 74 6D 70 2F 63 61 70 2E 6D 70 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 AB 59 54 62
>
> It starts a 6s video capture. I don't know what's the meaning of the timestamp at the end of the message. Changing 00 20 with 01 20 at offset 8 the video lasts 10s.

This is great! How did you capture the messages?

I was able to find the receiving function for that second message in mp4record.c (which is id 0x10). The name is RCD_START_SHORT_VIDEO and

0x2000 = 6 seconds
0x2001 = 10 seconds
0x2002 = 15 seconds

It didn't seem to use any other data from the message.

roleoroleo commented 2 years ago

> I haven't had a chance to check with the app. I think set_high_resolution() is for changing the resolution of the streaming video.

I tried to send this command but nothing changed.

> This is great! How did you capture the messages?

I didn't capture them; I reverse-engineered the cloud application. See the cloud_make_video() and cloud_cap_pic() functions. I'm not sure, but these are probably the constants:

#define MID_RMM 0x2
#define MID_DISPATCH 0x4
#define MID_RCD 0x10

#define RMM_START_CAPTURE 0x1007
#define RCD_START_SHORT_VIDEO 0x2000
#define RCD_START_SHORT_VIDEO_10S 0x2001
#define RCD_START_VOICECMD_VIDEO 0x2002
#define RCD_START_SHORT_FACE_VIDEO 0x2003
#define RCD_START_SHORT_HUMAN_VIDEO 0x2004

typedef enum
{
    E_NORMAL_TYPE = 0,
    E_FACE_TYPE,
    E_HUMAN_TYPE,
    E_BUTT
} e_short_video_type;

About the different RCD_START_* defines, I tested them:

RCD_START_SHORT_VIDEO 0x2000 - 6s low res stream with audio
RCD_START_SHORT_VIDEO_10S 0x2001 - 10s low res stream with audio
RCD_START_VOICECMD_VIDEO 0x2002 - 15s low res stream with audio
RCD_START_SHORT_FACE_VIDEO 0x2003 - 6s low res stream with audio
RCD_START_SHORT_HUMAN_VIDEO 0x2004 - 6s low res stream with audio

Unfortunately I was unable to create a file in high quality. Neither image nor video.

74ls04 commented 2 years ago

I believe the ID for p2p_tnp.c is #define MID_P2P 0x8.

> Unfortunately I was unable to create a file in high quality. Neither image nor video.

Okay, that makes sense: these commands are only used to produce clips for the app, so they use low resolution to save bandwidth.

roleoroleo commented 2 years ago

I can confirm set_motion_detect() struct:

typedef struct
{
    int mode;
    int resolution;
    short left;
    short top;
    short right;
    short bottom;
} motion_rect_t;

But it's based on the real resolution of the sensor: 1280x720. Example in hex if you select the full window: 0100 0000 0500 0000 0000 0000 0005 d002
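
As a cross-check, packing the struct in Python reproduces the hex above. This is a sketch that assumes little-endian byte order and no padding. The values 1 and 5 for mode and resolution are simply read back from the example bytes; their meaning is not confirmed here. pack_motion_rect is a hypothetical helper name.

```python
import struct

# motion_rect_t: int mode, int resolution, short left, top, right, bottom
MOTION_RECT_FORMAT = "<iihhhh"


def pack_motion_rect(mode, resolution, left, top, right, bottom):
    """Serialize a motion_rect_t the way it appears in the hex example."""
    return struct.pack(MOTION_RECT_FORMAT, mode, resolution, left, top, right, bottom)


# Full window on the 1280x720 sensor
full_window = pack_motion_rect(mode=1, resolution=5, left=0, top=0, right=1280, bottom=720)
print(full_window.hex(" ", 2))
```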

74ls04 commented 2 years ago

> I can confirm set_motion_detect() struct:

That's great - good to know.

I've figured out how to put it into factory mode. I copied /tmp/wpa_supplicant.conf to /tmp/sd/Factory/wpa_supplicant.conf. I then restarted p2p_tnp by killing it and letting the watchdog restart it. This disabled the outgoing connection and made the camera fully local - the app sees it as offline.

Judging from the code, this also disables the encryption on the video stream. When I queried the camera from Python using the P2P protocol, I also noticed that it returned a completely different UID. I'm sure there are other things it does but I haven't explored it much. I also have not rebooted the camera with that file in place, so I have no idea what happens then.

Edit: I took a chance and rebooted and the factory mode was still there and the P2P connection still down.

roleoroleo commented 2 years ago

Interesting.

github-actions[bot] commented 7 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

roleoroleo / yi-hack-Allwinner

Full local control of camera possible with cloud disabled. #362