Closed g-rom-info closed 7 years ago
I've noticed that the LT8900 is a little flakey before it's sent a few packets, but I seem to be able to switch between RF configs without too much trouble afterwards.
Yesterday evening I've replaced my original Milight Bridge by my ESP8266 (same port, same protocol v5) I let it run all night. This morning, I was able to switch on a group of RGBW light, but after that it has not working anymore. After that, I've sent a command to my RGBCCT bulb and after that I can send again commands to my RGBW bulbs.
Havent seen this, try to always initialize with type == RGB_CCT in the initialize method by hardcoding the type instead of the config type. It uses captured settings from the remote.
Just tested it. It definitely seems spottier for RGBW bulbs with a LT8900. I'm not sure that there's any pattern other than a much higher failure rate. Seems to work fine with RGB+CCT. The same ESP works well with an NRF.
Perhaps this means some of the settings for non-RGB+CCT have suboptimal register values? That's set up here:
https://github.com/sidoh/esp8266_milight_hub/blob/master/lib/MiLight/LT8900MiLightRadio.cpp#L95
Maybe we could try setting it up with the same ones used for RGB+CCT, or sniff what the remote is using.
That was what I meant☺
Moreover, I also note that the ESP8266 sometimes loses its connection with the WiFi, as if the PL1167 module remained on and disturbed the WiFi signal. I have to disconnect the VCC power supply from the PL1167 to retrieve the IP link.
I have a CCT remote control and I have soldered wires in order to sniff the SPI communication. I use a Logic Analyzer based on Arduino Mega. I think it will be better to build directly an SPI sniffer with Arduino and send any captured packets by Serial Bus to a terminal but I havent sufficient dev skill...
Are you able to watch serial output on your ESP? Mine stays connected for weeks at a time and I use it frequently throughout the day. I have noticed that the LT8900 will occasionally prevent the ESP from booting cleanly. I think probably it's setting GPIO 15 high (which is one of the SPI pins), which would cause the ESP to try to boot from flash. It'll also occasionally cause a WDT reset.
I don't have a logic analyzer, so unfortunately wouldn't be able to sniff the values myself.
Yes I'm able to watch serial output: See below when I boot up.
*WM: AutoConnect
*WM: Connecting as wifi client...
*WM: Using last saved values, should be faster
*WM: Connection result:
*WM: 3
*WM: IP Address:
*WM: 192.168.18.12
On the same time, from my PC (french terminal):
Réponse de 192.168.16.1 : Impossible de joindre l’hôte de destination.
Réponse de 192.168.18.12 : octets=32 temps=2296 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=6 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=1 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=3 ms TTL=127
Délai d’attente de la demande dépassé.
Délai d’attente de la demande dépassé.
Délai d’attente de la demande dépassé.
Délai d’attente de la demande dépassé.
Ping comes up again when I unplug the PL1167. PKT is wired to D0 (GPIO16) and RST to D3 (GPIO0). 4 other pins to SPI BUS.
Regarding the non responding, perhaps it is related to timing issues. The interface may be too fast for the bulbs. Is it combined commands, or single commands that has problems?. If combining a power on with a level change then perhaps the level change is not seen by the bulb due to it being sent to fast. Until now i have used it with rgbcct bulbs and mainly with single commands. Here no problems have been seen.
I think your assumption is good. Indeed it seems that the lamps behave badly especially when the bridge is sending multiple commands in sequence.
However I confirm that I havent any issue at this time with my CCT Bulb.
I installed a dev environment on Debian 8 + Atom + PlatformIO I can test to change "on the fly" the sleep time between 2 commands. Which file I need to modify ?
The file would be milightclient, and the method
void MiLightClient::update(const JsonObject& request)
Try to find a combination of commands that fail, end then as a test insert delays between those calls in the above mentioned method.
Might make sense to put the delays in the LT8900 radio. I don't think the same thing happens with an NRF.
Thinking that'd be here? https://github.com/sidoh/esp8266_milight_hub/blob/master/lib/MiLight/LT8900MiLightRadio.cpp#L476
For implementation yes, just guided for a quick test.
I tried to force RGB_CCT initialization on the constructor, no effect.. I tried to increase the delay on "bool LT8900MiLightRadio::sendPacket(uint8_t *data, size_t packetSize, byte byChannel)" function, no effect...
I tested some modifications on the source code but I havent sufficient skill in order to find why my ESP8266 loses this wifi connection after LT8900 connection. The lose of connectivity is random and I recover connectivity when I shut down the PL1167 by unpluging the VCC wire.
I activated debug log and I see only that
Radio module running correctly...
Radio module running correctly...
Radio module running correctly...
Radio module running correctly...
Is it normal to see it 4 times ?
Moreover, as explained below, when I recover the connectivity (after about 5-10 minutes), it is impossible to send commands to my RGBW bulbs, I see commands on debug log
Sending packet: B0D24E00000200
Elapsed: 77
Sending packet: B0D24E00000101
Elapsed: 75
Sending packet: B0D24E00000202
Elapsed: 80
Sending packet: B0D24E00000203
Elapsed: 82
Sending packet: B0D24E00000204
Elapsed: 91
Sending packet: B0D24E00000105
Elapsed: 82
Sending packet: B0D24E00000106
Elapsed: 88
I need to switch between RGBW and an other mode, send a command, and switch back to RGBW.
Sending packet: 00DB8F6566CDBA65A9
Elapsed: 225
Sending packet: B0D24E00000107
Elapsed: 71
Sending packet: B0D24E00000208
Elapsed: 68
It is very strange.... I havent this issue with nRF24L01.
How can I help you in order to do more troubleshooting ?
Obviously you shouldn't be losing connectivity at all. This is probably happening because the firmware is stuck in some loop with a yield
in it. Said loop could be buried in an internal library (like SPI). Since the ESP has a single CPU core, the code to handle client requests aren't going to get handled, meaning it'll be unreachable.
It'd be helpful to know what part of the code it's stuck in. The most straightforward way to figure this out is to insert prints in various places and see which one is getting barfed last.
I suspect the RGBW issue is unrelated. I can confirm that the radio is much less reliable with RGBW (works perfectly well with RGB+CCT). I think WoodsterDK and I both believe this is because the register settings for non-RGB+CCT are wrong. Probably the best way to solve this is to sniff the register settings from an official remote with a logic analyzer.
I havent got my computer at the moment, Try these settings for non rgbcct bulbs:
regWrite16(0, 111, 224, 7); // Recommended value by PMmicro
regWrite16(1, 86, 129, 7); // Recommended value by PMmicro
regWrite16(2, 102, 23, 7); // Recommended value by PMmicro
regWrite16(4, 156, 201, 7); // Recommended value by PMmicro
regWrite16(5, 102, 55, 7); // Recommended value by PMmicro
regWrite16(7, 0, 76, 7); // PL1167's TX/RX Enable and Channel Register
regWrite16(8, 108, 144, 7); // Recommended value by PMmicro
regWrite16(9, 72, 0, 7); // PL1167's PA Control Register
regWrite16(10, 127, 253, 7); // Recommended value by PMmicro
regWrite16(11, 0, 8, 7); // PL1167's RSSI OFF Control Register -- ???
regWrite16(12, 0, 0, 7); // Recommended value by PMmicro
regWrite16(13, 72, 189, 7); // Recommended value by PMmicro
regWrite16(22, 0, 255, 7); // Recommended value by PMmicro
regWrite16(23, 128, 5, 7); // PL1167's VCO Calibration Enable Register
regWrite16(24, 0, 103, 7); // Recommended value by PMmicro
regWrite16(25, 22, 89, 7); // Recommended value by PMmicro
regWrite16(26, 25, 224, 7); // Recommended value by PMmicro
regWrite16(27, 19, 0, 7); // Recommended value by PMmicro
regWrite16(28, 24, 0, 7); // Recommended value by PMmicro
regWrite16(32, 72, 0, 7); // PL1167's Data Configure Register: LEN_PREAMBLE = 010 -> (0xAAAAAA) 3 bytes, LEN_SYNCWORD = 01 -> 32 bits, LEN_TRAILER = 000 -> (0x05) 4 bits, TYPE_PKT_DAT = 00 -> NRZ law data, TYPE_FEC = 00 -> No FEC
regWrite16(33, 63, 199, 7); // PL1167's Delay Time Control Register 0
regWrite16(34, 32, 0, 7); // PL1167's Delay Time Control Register 1
regWrite16(35, 3, 0, 7); // PL1167's Power Management and Miscellaneous Register
regWrite16(40, 68, 2, 7); // PL1167's FIFO and SYNCWORD Threshold Register
regWrite16(41, 176, 0, 7); // PL1167's Miscellaneous Register: CRC_ON = 1 -> ON, SCR_ON = 0 -> OFF, EN_PACK_LEN = 1 -> ON, FW_TERM_TX = 1 -> ON, AUTO_ACK = 0 -> OFF, PKT_LEVEL = 0 -> PKT active high, CRC_INIT_DAT = 0
regWrite16(42, 253, 176, 7); // PL1167's SCAN RSSI Register 0
regWrite16(43, 0, 15, 7); // PL1167's SCAN RSSI Register 1
delay(200);
regWrite16(128, 0, 0, 7);
regWrite16(129, 255, 255, 7);
regWrite16(130, 0, 0, 7);
regWrite16(132, 0, 0, 7);
regWrite16(133, 255, 255, 7);
regWrite16(135, 255, 255, 7);
regWrite16(136, 0, 0, 7);
regWrite16(137, 255, 255, 7);
regWrite16(138, 0, 0, 7);
regWrite16(139, 255, 255, 7);
regWrite16(140, 0, 0, 7);
regWrite16(141, 255, 255, 7);
regWrite16(150, 0, 0, 7);
regWrite16(151, 255, 255, 7);
regWrite16(152, 0, 0, 7);
regWrite16(153, 255, 255, 7);
regWrite16(154, 0, 0, 7);
regWrite16(155, 255, 255, 7);
regWrite16(156, 0, 0, 7);
regWrite16(160, 0, 0, 7);
regWrite16(161, 255, 255, 7);
regWrite16(162, 0, 0, 7);
regWrite16(163, 255, 255, 7);
regWrite16(168, 0, 0, 7);
regWrite16(169, 255, 255, 7);
regWrite16(170, 0, 0, 7);
regWrite16(171, 255, 255, 7);
regWrite16(7, 0, 0, 7); // Disable TX/RX and set radio channel to 0
Regarding being unresponsive, a timeout could be added here: while (digitalRead(_pin_pktflag) == 0) { //do nothing. } There is no yield or sleep here, so the result would be a lockup and then a wdt reset, and perhaps the lt module then prevents the esp for a proper boot depending on which pins are used.
I added serial output between this loop like this:
//Wait until the packet is sent.
#ifdef DEBUG_PRINTF
Serial.println(F("Begin of wait loop... "));
#endif
while (digitalRead(_pin_pktflag) == 0)
{
//do nothing.
}
#ifdef DEBUG_PRINTF
Serial.println(F("End of wait loop... "));
#endif
Result:
Sending packet: B0D24E00400E00
Begin of wait loop...
End of wait loop...
Begin of wait loop...
End of wait loop...
Begin of wait loop...
End of wait loop...
Begin of wait loop...
End of wait loop...
Begin of wait loop...
End of wait loop...
As a conclusion, the loop is not infinite
Two issues: 1) lockup of the WIFI 2) No control unless type of bulb is switched around.
But debug wise the output only makes sense, if the error occurred. Let's start dealing with number 2 - can you replicate a scenario that keeps failing, or is it periodic and without any pattern?
Just noticed the following: 'I tried to force RGB_CCT initialization on the constructor, no effect.. I tried to increase the delay on "bool LT8900MiLightRadio::sendPacket(uint8_t *data, size_t packetSize, byte byChannel)" function, no effect...'
Inserting a delay here won't make any big difference I think.... it makes the timing between the transmits longer, but not the timing two different commands.
When I was testing LT8900 with RGBW bulbs, it seemed like it was just unreliable in general. I'm not sure there was much of a pattern to it. Seemed like RGBW bulbs were just overall less responsive.
Also, @Lstt2005 mentioned in #70 that his LT8910 worked very well in an older revision (looked like 38f8fce169edb0e1f4923ccf6d7b24b30a42e910). From what I can tell there weren't any relevant changes between master and that revision, but might be worth trying.
Is there a list of commands that causes the problem.... multiple commands?
I just remembered this, and I'd want to test again to confirm, but it actually seemed like it was mostly turning the bulbs on. Other commands sent after the bulbs were on seemed reasonably reliable.
Could it be that the bulbs are in a low power sleep mode, when not turned on, and the current implementation sends a complete frame with retransmits too fast? Try to set DEFAULT_TIME_BETWEEN_RETRANSMISSIONS_uS to 350 as a test.
Yeah, this was basically what I was guessing. Thanks for pointing out the constant, I'll give that a try.
I tried RGBW bulbs with the LT8900 again last night, and had trouble reproducing. Turning on and off seemed to work just fine. :\
Okay, after a bit more testing, I've narrowed things down a bit:
Seems like this indicates the problem is with the register settings. I'll poke around and see if I can't find the differences.
Great that you found a pattern.
If you don't change bulb type the init values are not sent. So it sounds strange if it works after a couple of rgbcct frames. Regarding the rgbw ones, try change vInitRadioModule so that it both sends the rgbcct values first and then the rgbw ones in one pass.
Haha, I think I figured it out:
Basically the RGBW radio was set to the default radio before other radios were initialized. MiLightClient then thought RGBW had already been initialized, but its settings were overwritten with other radios' settings, so they didn't stick until one switched away and then back.
@g-rom-info can you verify that it's fixed and close if it looks good?
Yes it's OK the RGBW works correctly after a cold start :)
However, I still have instabilities on IP connectivity. After a cold start, the ping is high or the ESP no longer responds intermittently. The web page may not be displayed and receipt of UDP commands not processed.
Réponse de 192.168.18.12 : octets=32 temps=710 ms TTL=127
Délai d’attente de la demande dépassé.
Délai d’attente de la demande dépassé.
Délai d’attente de la demande dépassé.
Réponse de 192.168.18.12 : octets=32 temps=398 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=102 ms TTL=127
Délai d’attente de la demande dépassé.
Délai d’attente de la demande dépassé.
Délai d’attente de la demande dépassé.
Réponse de 192.168.18.12 : octets=32 temps=110 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=151 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=8 ms TTL=127
Délai d’attente de la demande dépassé.
Réponse de 192.168.18.12 : octets=32 temps=50 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=146 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=9 ms TTL=127
Délai d’attente de la demande dépassé.
Réponse de 192.168.18.12 : octets=32 temps=91 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=81 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=36 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=3 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=52 ms TTL=127
After sending the first radio packets, the ping appears to stabilize.
Réponse de 192.168.18.12 : octets=32 temps=9 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=9 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=6 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=8 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=8 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=8 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=6 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=1 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=7 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=1 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=3 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=2 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=1 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=8 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=2 ms TTL=127
Réponse de 192.168.18.12 : octets=32 temps=5 ms TTL=127
I bought a second ESP connected simultaneously to the same WiFi network but connected to my nRF24L01 and I have no problem.
Réponse de 192.168.18.13 : octets=32 temps<1ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=3 ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps<1ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=4 ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=1 ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps<1ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=1 ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=1 ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=13 ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=6 ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=8 ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=2 ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=1 ms TTL=127
Réponse de 192.168.18.13 : octets=32 temps=6 ms TTL=127
After a warm start (by pressing the reset button), the ESP seems to keeps his stability.
I suspect an initialization problem however I'm sorry but I havent time this weekend to investigate more (I go away with my girlfriend :-) ) so I will do more tests next monday evening.
Interesting that there aren't problems with the other ESP. Maybe one of the traces on your first one is bad?
Have a good weekend! :)
@g-rom-info can you confirm that this is fixed for you?
@sidoh I need to make further tests. Can you wait until next sunday ?
Sure thing. No rush :)
(Out of topic - in case of you have not been notified : #65 issue still here with 1.2.3 release :-( )
@g-rom-info were you able to verify that this is fixed?
@sidoh Sorry, I was on holidays last week and had hard work last 3 weeks, as a consequence I dit not have sufficient time for my tests :( I'm available this afternoon I will make my tests and send you a feedback this evening ;-)
@sidoh I compiled the last main branch from github and injected it to my ESP8266 with PL1167. I also injected the file system in order to have a fresh install. Finaly I exported and imported my production environment configuration from my ESP8266/nRF24, coupled with domoticz. First of all, the configuration is stable : no ping loss, latency seems to be OK. After some commands, it seems to be OK with my RGBW and RGBCCT bulbs. I switched my domoticz configuration in order to use this bridge. Next wednesday, I will give you my feedback after this live experimentation. ;-)
Sounds good! Thanks :)
My bridge with PL1167 works perfectly since 4 days with 1.3.0 I've proceeded to multiple cold starts : the ESP is stable after a few seconds. The ping is quite stable but less stable comparing with nRF24l01. However all UDP/HTTP commands are processed correctly. I suspect more CPU activity with PL1167/LT8900.
I don't understand why last month I have multiple connectivity loss with 1.2.2, however I think we can consider that the issue is solved.
Thanks for testing! Glad to hear it's working better. Feel free to re-open if you notice anything else.
Dear Chris,
I can reproduce an issue : after a cold start, or after several minutes, i cannot control my Milight RGBW bulbs. (as same as Domoticz and Bridge WebUI) However, when I send a command to RGB+CCT channel, I can send again commands to my RGBW bulbs.
Transceiver : PL1167
Do you have the same issue ?
Regards, Jerome