pvvx / THB2

Custom firmware for Tuya devices on the PHY622x2 chipset

radio BLE issues with all THB1 #74

Open universam1 opened 3 weeks ago

universam1 commented 3 weeks ago

Amazing work with this project @pvvx - much appreciated, and exciting!

I bought 6x THB01 and all show the same connection issues. Right from the first BOOT flash it is almost impossible (maybe 1 in 30 attempts) to connect to the devices to try OTA, so I went ahead with UART flashing. However, the BLE connection is so extremely fragile that it disconnects after a few seconds. OTA is basically impossible: I managed it only once (out of 100 tries or so), when trying to downgrade to FW 1.7 to see if that solves the issue, but no.

Tried with 2 different Windows Laptops and 2 different Android devices, tablet and phone.

One day later all devices stopped announcing themselves via BLE! Even with nRF Connect I can only guess by the RSSI which devices they are; connecting is impossible. I tried unplugging the power, batteries, and of course pressing the button.

It sounds to me like the PLL for the radio might have an issue.

Any ideas, please? Is it possible that the BLE setup is broken?

universam1 commented 3 weeks ago

wonder if this could be related: https://github.com/pvvx/THB2/issues/38#issuecomment-2040327261

universam1 commented 3 weeks ago

Some more feedback, that makes a huge difference for the majority of the devices. Around 80% work now after clearing the whole flash once:

Try writing with erasing the entire flash (option -a):

python3 rdwr_phy62x2.py -p COMxx -a -r wh BOOT_xxx_vxx.hex

Areas protected from normal erase may contain some settings that are incorrect (RF, ADC).
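For anyone repeating the full erase on several units, the command quoted above can be scripted. This is only a minimal sketch: the flags are copied from the quoted command, while the function names, the port list and the `dry_run` switch are made up for illustration and are not part of the THB2 tooling.

```python
import subprocess

def build_full_erase_cmd(port, hex_file, python="python3"):
    """Return the argv for a full-erase write, mirroring the quoted command."""
    # -a erases the entire flash before writing; the other flags are copied as-is.
    return [python, "rdwr_phy62x2.py", "-p", port, "-a", "-r", "wh", hex_file]

def flash_all(ports, hex_file, dry_run=True):
    """Flash each port in turn; with dry_run=True only collect the command lines."""
    lines = []
    for port in ports:
        cmd = build_full_erase_cmd(port, hex_file)
        lines.append(" ".join(cmd))
        if not dry_run:
            # Stops at the first device for which the utility reports a failure.
            subprocess.run(cmd, check=True)
    return lines

if __name__ == "__main__":
    # Hypothetical ports and file name; replace with your own.
    for line in flash_all(["COM5", "COM6"], "BOOT_xxx_vxx.hex"):
        print(line)
```

With `dry_run=True` nothing is executed, so the generated command lines can be checked before any device is touched.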

Does that point to an underlying problem with the tuning values, one that could perhaps also be fixed for the remaining devices where the full erase does not help?

pvvx commented 2 weeks ago

THB01 - what is this variant?

On some PHY6222 series chips, the RC oscillator is unstable. The BLE connection stabilizes when the supply voltage drops to 2.5 V, or over time, as the RC oscillator drift correction (calculated relative to the crystal oscillator) accumulates.

universam1 commented 2 weeks ago

Not sure if this photo helps much IMG_20241028_171249

In this case the drift correction went from bad to completely broken, if you wish

pvvx commented 2 weeks ago

This is THB1. THB1 works fine until the battery voltage drops to 1.9 V.

image

Replaced the battery with a new one. image

Checked OTA:


19:46:18: Waiting for connection to THB1-4ACDD9
19:46:26: Model: THB1
19:46:26: Firmware: github.com/pvvx
19:46:26: Hardware: 0017
19:46:26: Software: V1.8
19:46:26: Device info # hw: 0017, sw: 0018, services: 000043B8, sd: 0000
19:46:26: Device connected.
19:46:30: Loading firmware file 'THB1_v18.bin'...
19:46:30: File: bin/THB1_v18.bin
19:46:30: File id:PHY6, Segments: 3, Start: 0x"1FFF1838, : 50096
19:46:30: File size: 50100 bytes
19:46:30: Count: 3132 blocks
19:46:32: Switching to...
19:46:32: Reconnecting
19:46:32: Device disconnected.
19:46:32: Waiting for connection to THB1-4ACDD9
19:46:35: Device disconnected.
19:46:35: Waiting for connection to THB1-4ACDD9
19:46:39: Model: THB1
19:46:39: Firmware: github.com/pvvx
19:46:39: Hardware: 0017
19:46:39: Software: B1.4
19:46:39: Device info # hw: 0017, sw: 0014, services: 000002A1, sd: 0000
19:46:40: OTA ver: 01
19:46:40: Device connected.
19:46:42: Programming started...
19:47:35: Programming completed in 52.184 seconds
19:47:39: Device disconnected.


Everything is fine.

What is the problem?

pvvx commented 2 weeks ago

Reprogramming BOOT v1.4 -> v1.8.


20:03:13: Waiting for connection to THB1-4ACDD9
20:03:19: Model: THB1
20:03:19: Firmware: github.com/pvvx
20:03:19: Hardware: 0017
20:03:19: Software: V1.8
20:03:20: Device info # hw: 0017, sw: 0018, services: 000043B8, sd: 0000
20:03:20: Device connected.
20:03:25: Loading firmware file 'BOOT_THB1_v18.bin'...
20:03:25: File: update_boot/BOOT_THB1_v18.bin
20:03:25: File id:PHY6, Segments: 1, Start: 0x"1FFF1838, : 49968
20:03:25: File size: 49972 bytes
20:03:25: Count: 3124 blocks
20:03:27: Switching to...
20:03:27: Reconnecting
20:03:27: Device disconnected.
20:03:27: Waiting for connection to THB1-4ACDD9
20:03:30: Device disconnected.
20:03:30: Waiting for connection to THB1-4ACDD9
20:03:37: Model: THB1
20:03:37: Firmware: github.com/pvvx
20:03:37: Hardware: 0017
20:03:37: Software: B1.4
20:03:38: Device info # hw: 0017, sw: 0014, services: 000002A1, sd: 0000
20:03:38: OTA ver: 01
20:03:38: Device connected.
20:03:40: Programming started...
20:04:33: Programming completed in 51.937 seconds
20:04:36: Device disconnected.


Connect BOOT v1.8 and re-OTA THB1_v18.


20:07:20: Waiting for connection to THB1-4ACDD9
20:07:25: Model: THB1
20:07:25: Firmware: github.com/pvvx
20:07:25: Hardware: 0017
20:07:25: Software: B1.8
20:07:25: Device info # hw: 0017, sw: 0018, services: 000002A1, sd: 0000
20:07:25: OTA ver: 01
20:07:25: Device connected.
20:07:37: Loading firmware file 'THB1_v18.bin'...
20:07:38: File: bin/THB1_v18.bin
20:07:38: File id:PHY6, Segments: 3, Start: 0x"1FFF1838, : 50096
20:07:38: File size: 50100 bytes
20:07:38: Count: 3132 blocks
20:07:44: Programming started...
20:08:38: Programming completed in 52.122 seconds
20:08:41: Device disconnected.


The tested thermometers on PHY6222 all work stably. Nothing disappears in HA BTHome.

image

universam1 commented 2 weeks ago

Thank you for checking with your devices! That proves that the code is working. And after clearing all flash with the -a flag, most of my devices can be connected to. However, some still cannot be connected to, or the connection is interrupted as soon as data is sent. I can reproduce that with the nRF Connect Android app. It might be a hardware problem; I don't know if anything can be done about that, though.

pvvx commented 2 weeks ago

Under the Tuya brand, devices are produced from various junk. Tuya devices do not have any guarantees or certificates.

The PHY documentation specifies capacitance on the RF section's power supply for RF stability. It is not known what was fitted in a device assembled from junk, nor which PHY chips were used. They may be from a defective factory batch...

I bought devices with PHY chips from different suppliers for testing. Only a couple, where the PHY6222 chip had a different marking from the others, were unstable. Because of them, I had to replace the SDK functions for correcting the RC clock with my own, with extended ranges and a different algorithm.

According to other users, there were only 2 chips that had problems with the RC generator. These users simply threw them away - there is no point in messing around with defective chips.

The PHY6222 chip itself has several unfixed problems. One is an unstable RC oscillator whose drift does not comply with the BLE standard; another is in the delays of the radio frequency section... But the largest number of errors is in the program ROM. That's why it is cheap.

Tuya partially checks only modules with PHY6222, which it sells as OEM. But these thermometers are assembled by small offices without any checks. You can register in Tuya yourself and sell any junk.

It may also turn out that you have chips with a different revision of the ROM code... But there is a high probability that you are doing something wrong.

universam1 commented 2 weeks ago

I see your point that wasting time on cheap hardware doesn't pay off. However, before I performed the -a full erase, none of those devices worked. Now at least 80% work as expected, so I don't see what I would be doing wrong here.

Here is an example of a problematic device. Those devices light up the bluetooth icon for about 1s then the connection is dropped. No stable connection possible.

...\src\THB2> python rdwr_phy62x2.py -p COM5 -a -r wh bin\BOOT_THB1_v18.hex
=========================================================
PHY62x2 Utility version 11.03.24
---------------------------------------------------------
Connecting...
PHY62x2 - Reset Ok
Revision: b'001364c8 6222M005'
FlashID: 1364c8, size: 512 kbytes
PHY62x2 - connected Ok
---- Segments Table -------------------------------------
Segment: 11003000 <- Flash addr: 00003000, Size: 00008f6c
Segment: 1fff0000 <- Flash addr: 0000bf6c, Size: 00000400
Segment: 1fff1838 <- Flash addr: 0000c36c, Size: 00002bd6
----------------------------------------------------------
Erase All Chip Flash... ok
Segment Table[03] <- Flash addr: 00002000, Size: 00000130
Write 0x00000130 bytes to Flash at 0x00002000... ok
Segment: 11003000 <- Flash addr: 00003000, Size: 00008f6c
Write 0x00002000 bytes to Flash at 0x00003000... ok
Write 0x00002000 bytes to Flash at 0x00005000... ok
Write 0x00002000 bytes to Flash at 0x00007000... ok
Write 0x00002000 bytes to Flash at 0x00009000... ok
Write 0x00000f6c bytes to Flash at 0x0000b000... ok
Segment: 1fff0000 <- Flash addr: 0000bf6c, Size: 00000400
Write 0x00000400 bytes to Flash at 0x0000bf6c... ok
Segment: 1fff1838 <- Flash addr: 0000c36c, Size: 00002bd6
Write 0x00002000 bytes to Flash at 0x0000c36c... ok
Write 0x00000bd6 bytes to Flash at 0x0000e36c... ok
----------------------------------------------------------
Write Flash from file: bin\BOOT_THB1_v18.hex - ok.
Send command 'reset' - ok

xyzzy42 commented 2 weeks ago

Are you using the same batteries in each device? Seems like it's more sensitive to supply voltage than it should be, so batteries that happen to have a different voltage might be the cause of a problem in one device rather than the device itself.

universam1 commented 2 weeks ago

This is powered at 3.3V via USB. Also tried lower voltages of 2.5V or even batteries. Same behaviour.

xyzzy42 commented 2 weeks ago

When I powered a BTH01 from the USB serial adapter it did not work reliably. The interval between BLE beacons was inconsistent. Attempting to connect would rarely succeed. And once connected, it would last at most a few seconds before being disconnected.

Just removing all soldered wires and changing to batteries (NiMH now at 2.65 V) has fixed it. Interval between received beacons is now a multiple of 5 seconds. Once connected it stays that way for hours. Connecting is not perfectly reliable, but I think this is caused by the long connection interval with a too short timeout on the host.
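To make the interval/timeout relationship concrete: the Bluetooth Core Specification requires the supervision timeout to be between 100 ms and 32 s and larger than (1 + peripheral latency) × connection interval × 2. A rough Python sketch of that check follows; the function names and the example numbers are illustrative assumptions, not values read from the THB2 firmware or from any host stack.

```python
def min_supervision_timeout_ms(conn_interval_ms, peripheral_latency=0):
    """Lower bound the supervision timeout must exceed, in milliseconds."""
    return (1 + peripheral_latency) * conn_interval_ms * 2

def timeout_is_valid(timeout_ms, conn_interval_ms, peripheral_latency=0):
    """True if the timeout is inside the allowed range and above the lower bound."""
    in_range = 100 <= timeout_ms <= 32000
    bound = min_supervision_timeout_ms(conn_interval_ms, peripheral_latency)
    return in_range and timeout_ms > bound

if __name__ == "__main__":
    # Hypothetical combinations of (timeout, interval, latency).
    for timeout, interval, latency in [(900, 30, 0), (900, 100, 3), (900, 100, 4)]:
        ok = timeout_is_valid(timeout, interval, latency)
        print(f"timeout={timeout} ms interval={interval} ms latency={latency}: {ok}")
```

The longer the effective interval (interval times latency plus one), the fewer missed connection events fit inside a fixed host timeout, which would match the observation that a shorter interval connects more reliably.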

universam1 commented 2 weeks ago

Interesting! So you are saying that reducing the default interval should improve the likelihood of a successful connection? Unfortunately we now have a chicken-and-egg problem: I cannot change the defaults without an initial connection... @pvvx would it be possible to compile a short interval and a longer timeout into the boot firmware as fixed values?

pvvx commented 2 weeks ago

README


To connect in Linux, you need to set the standard intervals (Bluetooth v4.0..5.4) in the BlueZ configuration files. By default, BlueZ uses intervals invented by the assemblers that do not correspond to the standard.

xyzzy42 commented 2 weeks ago

Yes, the button works for me. With the normal CI, I can connect with the Nordic Connect app on Android, bluetoothctl on Linux, and Chrome on Android. But Chrome on Linux will always fail, and the ones that work are not 100% reliable. Using the button to reduce the CI allows Chrome on Linux to connect too, and the success rate seems to be nearly 100% on the other devices.

This is with batteries. Same device on USB serial cable power was nearly useless.

There are lots of reasons, beyond voltage, that power from a USB-serial cable could cause problems. Batteries provide a very clean, low-noise supply. A cheap USB serial cable does not.

A good device will have well-designed power supply filter capacitors and work on less than perfect power from the USB serial adapter. But will a Tuya device have extra capacitors that shouldn't be absolutely necessary on battery power? Of course not.

pvvx commented 2 weeks ago

The battery is not a stable supply under pulse load. For example, a half-discharged CR2032 has an internal resistance of 100 Ohm. The current of the BLE device changes from 2 μA to 8 mA. Accordingly, the supply voltage drops by 0.8 V (3.0 V -> 2.2 V).
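The numbers above are plain Ohm's law (voltage drop = internal resistance × load current). A small Python sketch of the arithmetic, using only the figures quoted in this comment; the function name is made up for illustration.

```python
def loaded_voltage(open_circuit_v, internal_resistance_ohm, load_current_a):
    """Terminal voltage of a battery modelled as an ideal source plus series resistance."""
    return open_circuit_v - internal_resistance_ohm * load_current_a

if __name__ == "__main__":
    sleep_v = loaded_voltage(3.0, 100.0, 2e-6)   # sleeping at 2 uA: drop is negligible
    active_v = loaded_voltage(3.0, 100.0, 8e-3)  # radio active at 8 mA: 0.8 V drop
    print(f"sleep:  {sleep_v:.4f} V")
    print(f"active: {active_v:.2f} V")
    print(f"step:   {sleep_v - active_v:.2f} V")
```

This simple model ignores any decoupling capacitance on the board, which in practice softens short current pulses.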

universam1 commented 2 weeks ago

The connection is stable for ~1s before it drops, visible via the icon. Could it be that the 900ms(?) timeout is playing a role here?

pvvx commented 2 weeks ago

The RC oscillator correction handles a drift of up to 20%. Larger deviations are not included in the averaged deviation.
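One way to picture "large deviations are not taken into account" is an average that simply skips outliers. The sketch below is an interpretation, not the firmware's code: it assumes the 20% figure acts as a rejection threshold around the nominal rate, and the function and parameter names are hypothetical.

```python
def average_ratio(samples, nominal=1.0, max_deviation=0.20):
    """Average RC/crystal ratio samples, skipping those too far from nominal.

    Assumption: a sample is rejected when it deviates from `nominal`
    by more than `max_deviation` (as a fraction of nominal).
    """
    accepted = [s for s in samples if abs(s - nominal) <= max_deviation * nominal]
    if not accepted:
        # Nothing usable was measured; fall back to the nominal ratio.
        return nominal
    return sum(accepted) / len(accepted)

if __name__ == "__main__":
    # The 1.5 sample is 50% off nominal and is therefore ignored.
    print(average_ratio([1.0, 1.1, 0.9, 1.5]))
```

The trade-off of such a scheme is that a chip whose oscillator genuinely moves outside the window never gets corrected, which is one possible reason for widening the thresholds.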

The Bluetooth Core Specifications Version 5.0+ require the active clock to have a drift less than or equal to ±50 ppm. The Bluetooth Core Specifications Version 5.3 require the sleep clock (RC-oscillator) to have a drift less than or equal to ±500 ppm.

It is impossible to get such accuracy in PHY62xx chips from RC oscillator.
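To put those ppm limits into time units: a clock error of N ppm over an interval of T seconds amounts to at most T × N microseconds. A quick Python sketch, assuming a 5 second interval like the beacon spacing mentioned earlier in the thread; the 1% row is a hypothetical figure for an uncorrected RC oscillator, not a measured one.

```python
def max_drift_us(interval_s, ppm):
    """Worst-case timing error in microseconds after interval_s seconds."""
    # seconds * (ppm * 1e-6) gives seconds; multiplied by 1e6 gives microseconds.
    return interval_s * ppm

if __name__ == "__main__":
    interval_s = 5
    for label, ppm in [("active clock, 50 ppm", 50),
                       ("sleep clock, 500 ppm", 500),
                       ("hypothetical 1% RC error", 10_000)]:
        print(f"{label}: up to {max_drift_us(interval_s, ppm)} us over {interval_s} s")
```

The receiver has to keep its radio window open long enough to cover this uncertainty on both sides, so an error in the percent range quickly exceeds what a host expects.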

pvvx commented 2 weeks ago

> Could it be that the 900ms(?) timeout is playing a role here?

This is more related to the constants of different intervals in the SDK and the receiving adapter. Each version of the PHY SDK contains different timing (synchronization) constants. The reason is the inaccuracy of the clock, with a large dependence on the batch of chips. It is impossible to correct these deviations.

If you select coefficients for your chip, then such correction will not work on others. And on your chip, when changing the supply voltage, you will have to rebuild the SDK with new constants again.

https://github.com/pvvx/THB2/blob/master/bthome_phy6222/SDK/lib/rf/patch.c#L7716-L7909

These constants are applied as a patch to the program in the chip's ROM, since the values in ROM are also incorrect. They depend on everything: CPU frequency, clock frequencies of other counters, RF section delays, internal voltages, RC oscillator drift, supply voltage, and temperature. This chip is PHY6222.

Each time the chip wakes up from sleep, the RC oscillator's rate is measured against the crystal oscillator and the RC oscillator count coefficient is dynamically corrected. But while the chip is asleep the supply voltage is 3.0 V, and on waking it is 1.9..2.8 V depending on the battery state. In addition, the internal supply voltages of the blocks change between sleep and active modes to achieve minimum consumption during sleep. As a result, the RC oscillator's rate changes significantly.
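As a rough illustration of that calibration step: count how many crystal ticks elapse during a known number of RC cycles, derive the RC frequency from the ratio, then convert the desired sleep time into RC ticks. This Python sketch is not taken from the SDK; the 16 MHz crystal, the 64-cycle window and all names are assumptions made for the example.

```python
def measure_rc_hz(rc_cycles, xtal_ticks, xtal_hz=16_000_000):
    """Estimate the RC frequency from crystal ticks counted over rc_cycles RC periods."""
    return rc_cycles * xtal_hz / xtal_ticks

def sleep_ticks(duration_s, rc_hz):
    """Number of RC ticks to program for a sleep of duration_s seconds."""
    return round(duration_s * rc_hz)

if __name__ == "__main__":
    # Calibration while awake: 64 RC cycles took 31250 crystal ticks.
    rc_awake = measure_rc_hz(64, 31250)
    ticks = sleep_ticks(5, rc_awake)
    print(f"calibrated RC: {rc_awake} Hz, ticks for 5 s: {ticks}")

    # If the RC actually runs at a different rate while asleep (hypothetical
    # 33000 Hz), the same tick count elapses in a different real time.
    actual_s = ticks / 33000
    print(f"real sleep time: {actual_s:.4f} s")
```

The sketch shows why a calibration taken in the active state can be wrong for the sleep state: the tick count is fixed at calibration time, so any rate change after that turns directly into a wake-up timing error.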

The RC oscillator correction procedure from every PHY SDK failed to cope with its task. Half of the purchased Tuya devices on PHY6222 did not maintain communication at all at a supply voltage above 2.4 V. I had to increase the RC oscillator spread thresholds and replace the correction algorithms. Only then did all the PHY6222 devices I had maintain communication, at least to some degree, at supply voltages above 2.5 V.

I don't want to waste any more time digging around and fixing faulty chips. I already had to dig into comparing the chip's power supply on the DC-DC pins, and much more.

BTH01Y image

TH05-v1.3 image

Such instability of the main power supply of the chip (and DC-DC work) after sleep during the transition to activity... ...

You can do it yourself if you have a lot of extra time.