cnlohr / esp82xx

Useful ESP8266 C Environment

Random characters on serial with ESP not booting #53

Closed by con-f-use 7 years ago

con-f-use commented 7 years ago

With the latest version of esp82xx I'm getting something strange on fresh ESP-12E modules.

When I do the following on a fresh ESP-12E:

git clone --recursive https://github.com/cnlohr/esp82xx.git  # latest commit 93074f
cp esp82xx/Makefile.example Makefile
make project
make burn
make burnweb

I only get an endless sequence of seemingly random character sequences on the UART and the ESP doesn't connect to WiFi or respond to anything but putting it in flashing mode.

When I flash an old version of esp82xx, set WiFi credentials and then flash the latest version again, everything works. Did @cnlohr experience something similar with the latest commit?

I suspect if that happens to new users, they might get thoroughly confused.

P.S. I tried multiple ESPs with different programmers.

cnlohr commented 7 years ago

I don't believe so. Or, at least I JUST started using it on esp8266lighthouse, you may want to try checking that out and seeing if it endless-crashes for you. One thing I find dubious is the continuous flashing of the initdata. https://github.com/cnlohr/esp8266lighthouse

cnlohr commented 7 years ago

I have, however, noticed this with some other hardware. I thought the hardware was the issue. I may be wrong.

con-f-use commented 7 years ago

At first I thought it had something to do with loading the WiFi credentials: something read from a virgin flash with random data. But I don't think so. The only code we have changed is a) the SDK, b) the rf_cal routine, and c) strcat. Or can you think of more changes we made recently?

> One thing I find dubious is the continuous flashing of the initdata.

Not sure what you mean here.

cnlohr commented 7 years ago

dubious = I was just working with the project "esp8266lighthouse" last night and I believe using the newest esp82xx

con-f-use commented 7 years ago

Yes you were, I saw the livestream, but missed what you describe as "continuous flashing of the initdata". Talking of continuous flashing, the LED blinks like crazy (constantly resetting?) when this bug occurs.

cnlohr commented 7 years ago

I have definitely seen it. I will try to make time to burn with stock firmware tonight. Can you try burning the esp8266lighthouse code and see if you constantly reset?

con-f-use commented 7 years ago

I don't have any "fresh" modules with the factory-default AT firmware on them anymore. I've only seen the bug on them. So unfortunately no. To reproduce the bug I even tried to reflash the AT firmware, to no avail. Once the WiFi settings are changed, the modules are fine and stop resetting regardless of the firmware.

cnlohr commented 7 years ago

Ok, now I am reallllyy confused by what you're saying. You're saying you have busted modules that seem unable to work EVER or you do something and they work fine?

con-f-use commented 7 years ago

Sorry, I tend to confuse people. I'm saying:

  1. I have a brand new ESP-12E module fresh from the manufacturer. It is perfectly fine and has (by factory) the AT firmware on it
  2. I flash the latest version of esp82xx to the module
  3. After step 2), the module is caught in an endless loop of resets and the UART goes crazy. It is now what I call "faulty".
  4. I flash an older version of esp82xx to the "faulty" module and set it up to connect to my home's WiFi via the WebGUI.
  5. The formerly "faulty" module now works perfectly with either the old firmware version, i.e. the one flashed in step 4), or (if I flash it with the latest esp82xx) the new firmware version. The bug thusly forever disappears and is never to be seen with that module.

The above process was consistent with four modules from different sellers using different serial programmers. No module is unfixably fubar. However, this bug might throw off newbies, since NEW modules that never had an old firmware on them don't work. At least not "out of the box" using the latest esp82xx. I have no idea what causes the bug, but suspect it might have something to do with setting the WiFi in step 4).

I have no module to chase the bug with, since all my modules went through step 4. If you encounter the bug, and do testing, make sure to keep a "faulty" module around.

cnlohr commented 7 years ago

I just pushed to dev a fix that should help with this. I think something fishy may have been going on with the init data. I'm not sure. Make sure you can erase the chip and install the init data.

Also, should I not have pushed that to dev?

I am a little confused as to how main and dev interact.

cnlohr commented 7 years ago

Oooohh nooooo I've been using and flashing the master branch this whole time and never the dev branch!!! That could explain why I haven't seen it.

So here's what I want you to do. Can you dump the FULL flash off a fresh-from-factory one?

esptool.py -b 1000000 read_flash 0 4194304 factory.bin

(I want to see the /full/ rom not just the first 1M)

It'll take a bit, even at 1,000,000 baud.
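For a rough sense of how long "a bit" is, here is a minimal back-of-the-envelope sketch. It assumes plain 8N1 serial framing (10 bits on the wire per byte) and ignores esptool's protocol overhead, so the real dump will take somewhat longer. `dump_seconds` is a hypothetical helper name, not part of esptool.

```shell
# Lower bound on the time to read a flash image over UART.
# Assumption: 8N1 framing, i.e. 10 bits on the wire per payload byte.
# Usage: dump_seconds <bytes> <baud>
dump_seconds() {
    # Integer ceiling of (bytes * 10) / baud
    echo $(( ($1 * 10 + $2 - 1) / $2 ))
}

dump_seconds 4194304 1000000   # prints 42
dump_seconds 4194304 115200    # prints 365
```

So the full 4 MiB dump needs at least about 42 seconds at 1,000,000 baud, and roughly six minutes at 115200 baud.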

cnlohr commented 7 years ago

Everything's broken again. In my project where I have esp82xx, I can't get the newest dev, and if I try git pull origin dev, I get this

From https://github.com/cnlohr/esp82xx
 * branch            dev        -> FETCH_HEAD
Auto-merging user.cfg.example
CONFLICT (content): Merge conflict in user.cfg.example
Auto-merging main.mf
CONFLICT (content): Merge conflict in main.mf
Auto-merging fwsrc/user/user_main.c
CONFLICT (add/add): Merge conflict in fwsrc/user/user_main.c
Auto-merging README.md
CONFLICT (content): Merge conflict in README.md
Automatic merge failed; fix conflicts and then commit the result.

I promise I am trying. Man, git is hard. And with multiple mainline branches it's even harder :(.

cnlohr commented 7 years ago

It's all busted, no matter what I revert, git checkout origin dev just hoses the tree.

cnlohr commented 7 years ago

I neeeeddd to sleeeeepppp but it's sooooo broken

con-f-use commented 7 years ago

Okay so I cleaned up. Just reverted all your changes save the last one and leveled the master and dev branches.

You will need a fresh clone of the dev branch, if you want to make further changes:

git clone --recursive https://github.com/cnlohr/esp82xx.git -b dev

I think dev is the place to experiment and fix. Once dev is working, we squash-merge into master.

cnlohr commented 7 years ago

Are you okay with those changes I made? I.e. the moving around?

Did you get a chance to dump a fresh build?

Have you tried erase and the init flash?

Charles

con-f-use commented 7 years ago

The moving around does wonders. user_main.c should be as clean as possible because I'm easily confused. :-D

Purely erasing the flash and then flashing the firmware does nothing to fix the bug. Will try the init-data some time this week and post a flash dump but can't right now.

cnlohr commented 7 years ago

So, the tip-off to me was the following:

If you erase the chip and try running it, no matter what you do, it will always get stuck in a reboot loop. At least until you flash init data.

That leads me to believe that the issues may be related. Perhaps the init data somehow changed between SDK versions?

con-f-use commented 7 years ago

Okay, init-data seems to be a component. Still having some reboot problems after flashing the init-data, but at least I see something on the UART and resetting is less frequent. It might also be that my board is wonky (on top of things). Will need further testing before closing the issue. Just wanted to mention it.

Btw. what's the difference between initdefault and init3v3?

cnlohr commented 7 years ago

There's so many things that can go wrong. I am glad that overall, these systems are slowly paying down their technical debt :)

Glad you asked!

One lets you use the external ADC as a regular ADC. The other lets you monitor the voltage of the 3.3v rail. It's really useful for things like con badges, etc.
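One way to see exactly where the two init images differ is to compare them byte by byte. The sketch below is only an illustration: `diff_bytes` is a made-up helper, and the file names in the usage line are hypothetical placeholders for whatever files the `initdefault` and `init3v3` steps actually flash. Espressif's SDK documentation describes byte 107 of the init data (`vdd33_const`) as the setting that selects between reading the ADC pin and measuring the supply voltage, so that is a plausible place for a difference to show up, but this has not been checked against these two images.

```shell
# Print every differing byte between two binary files.
# cmp -l reports a 1-based offset followed by both byte values in octal;
# this converts them to a 0-based offset and hex values.
diff_bytes() {
    cmp -l "$1" "$2" | while read -r pos a b; do
        # A leading 0 makes printf parse the value as octal.
        printf 'offset %d: 0x%02x vs 0x%02x\n' "$((pos - 1))" "0$a" "0$b"
    done
}

# Hypothetical usage (file names are assumptions):
# diff_bytes init_default.bin init_3v3.bin
```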

cnlohr commented 7 years ago

I see you closed it but without comment? Did you determine if this was indeed the fix? Erase-and-initdata?

con-f-use commented 7 years ago

Oops, no, I must have hit the button by accident. Still waiting for my new boards and modules to come in.

cnlohr commented 7 years ago

gotcha

cnlohr commented 7 years ago

So, just to point it out. I got a chip that was stuck in constant reboot land. Did the erase and initdefault and it fixed it.
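For readers who want to try the same erase-then-init-data recovery with esptool.py directly, here is a hedged dry-run sketch. It only prints the commands rather than running them. It assumes the init data belongs 0x4000 bytes below the end of flash, which matches Espressif's documented layout for the non-OS SDK (0x3FC000 on a 4 MiB part); the address the project's own `initdefault` step uses is not shown in this thread. The function names, serial port and file name are placeholders.

```shell
# Dry run: prints the esptool.py commands instead of executing them.
# Assumption: init data sits 0x4000 bytes below the end of flash.
# Usage: init_data_addr <flash size in bytes>
init_data_addr() {
    printf '0x%x\n' $(( $1 - 0x4000 ))
}

# Usage: recover_cmds <port> <flash size in bytes> <init data file>
recover_cmds() {
    echo "esptool.py --port $1 erase_flash"
    echo "esptool.py --port $1 write_flash $(init_data_addr "$2") $3"
}

# Placeholder port and file name; adjust for your setup.
recover_cmds /dev/ttyUSB0 4194304 esp_init_data_default.bin
```

After reviewing the printed commands, run them by hand, then flash the firmware as usual.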

con-f-use commented 7 years ago

I got my new boards, chips and modules. Everything werks ze fine after flashing initdefault! 'tis fixed.