me-no-dev / ESPAsyncWebServer

Async Web Server for ESP8266 and ESP32

async server shows crash with heap4c #7

Closed andig closed 8 years ago

andig commented 8 years ago

While trying to track down some crashes, I added heap4c and wrote a simple sketch:

#include <ESP8266WiFi.h>
#include <ESPAsyncWebServer.h>

AsyncWebServer g_server(80);

void setup() {
  Serial.begin(115200);
  WiFi.begin();
  while (WiFi.status() != WL_CONNECTED)
    delay(100);
  g_server.begin();
}

void loop() {
  delay(100);
}

Under a little load, this crashes almost immediately when heap4c is present. As long as ESPAsyncWebServer is not started, it runs fine for ages (as does a more complex sketch). As you can see, there are no handlers attached; starting ESPAsyncWebServer alone is enough to cause the crash.

Use this for load:

while (true); do curl -m 1 http://192.168.0.30; done

Crash (update: deleted everything before WiFi is up):

ip:192.168.0.30,mask:255.255.255.0,gw:192.168.0.1f 112 153 <
m 5 64 154 <

f 64 154 <
m 16 64 155 <
f 64 149 <
m 120 168 156 <
m 16 64 157 <
f 96 152 <
f 152 151 <
f 168 156 <
f 64 157 <
m 16 64 158 <
f 64 158 <
m 16 64 159 <
f 64 159 <
m 16 64 160 <
f 64 160 <
m 16 64 161 <
f 64 155 <
m 16 64 162 <
f 64 161 <
m 16 64 163 <
m 20 72 164 <
f 72 164 <
f 64 163 <
m 16 64 165 <
f 64 165 <
m 16 64 166 <
f 64 166 <
m 16 64 167 <
f 64 167 <
m 16 64 168 <
f 64 150 <
m 16 64 169 <
f 64 162 <
m 16 64 170 <
f 64 168 <
m 16 64 171 <
m 20 72 172 <
f 72 172 <
m 20 72 173 <
f 72 173 <
f 64 171 <
m 16 64 174 <
f 64 174 <
m 16 64 175 <
f 64 175 <
m 16 64 176 <
f 64 176 <
m 16 64 177 <
m 20 72 178 <
f 72 178 <
f 64 170 <
m 16 64 179 <
f 64 177 <
m 16 64 180 <
f 64 180 <
m 16 64 181 <
f 64 181 <
m 16 64 182 <
f 64 182 <
m 16 64 183 <
f 64 183 <
m 16 64 184 <
f 64 1 <
m 16 64 185 <
f 64 169 <
m 16 64 186 <
f 64 179 <
m 16 64 187 <
f 64 184 <
m 16 64 188 <
m 20 72 189 <
m 176 224 190 <
m 16 64 191 <
m 116 168 192 <
m 20 72 193 <
m 100 152 194 <
m 116 168 195 <
m 8 64 196 <
f 72 189 <
f 152 194 <
f 64 188 <
m 16 64 197 <
m 20 72 198 <
f 64 196 <
f 72 198 <
f 168 195 <
f 64 191 <
m 16 64 199 <
m 20 72 200 <
m 176 224 201 <
m 220 272 202 <
m 4 64 203 <
m 16 64 204 <
m 16 64 205 <
m 16 64 206 <
m 16 64 207 <
m 16 64 208 <
m 16 64 209 <
m 16 64 210 <
m 16 64 211 <
m 16 64 212 <
m 16 64 213 <
m 4 56 214 <
m 4 56 215 <
f 56 214 <
m 4 56 216 <
m 4 56 217 <
f 56 216 <
m 4 56 218 <
m 4 56 219 <
f 56 218 <
m 4 56 220 <
m 4 56 221 <
f 56 220 <
m 4 56 222 <
m 4 56 223 <
f 56 222 <
m 4 56 224 <
m 4 56 225 <
f 56 224 <
f 168 192 <
f 72 193 <
f 72 200 <
m 20 72 226 <
m 32 80 227 <
f 64 204 <
m 16 88 228 <
m 16 72 229 <
m 32 80 230 <
f 72 229 <
f 80 230 <
m 16 72 231 <
m 16 64 232 <
m 16 64 233 <
f 64 233 <
f 64 232 <
f 72 231 <
f 88 228 <
m 32 88 234 <
m 28 80 235 <
m 16 72 236 <
m 16 64 237 <
m 16 64 238 <
f 64 238 <
m 16 64 239 <
f 64 239 <
f 88 234 <
m 16 88 240 <
f 88 240 <
m 16 88 241 <
f 88 241 <
f 64 237 <
f 72 236 <
f 80 235 <
m 16 88 242 <
f 88 242 <
m 48 96 243 <
f 80 227 <
m 64 112 244 <
f 96 243 <
m 80 128 245 <
f 112 244 <
m 96 168 246 <
f 128 245 <
m 96 144 247 <
m 28 80 248 <
m 16 72 249 <
m 16 64 250 <
m 16 64 251 <
f 64 251 <
m 16 64 252 <
m 80 128 253 <
f 64 252 <
f 64 250 <
f 144 247 <
m 16 64 254 <
f 64 254 <
m 16 64 255 <
f 64 255 <
m 16 64 256 <
f 64 256 <
m 16 64 257 <
f 64 257 <
m 16 64 258 <
f 64 258 <
m 16 64 259 <
m 16 64 260 <
f 64 260 <
f 64 259 <
f 128 253 <
f 72 249 <
f 80 248 <
m 16 72 261 <
f 72 261 <
m 16 72 262 <
m 28 80 263 <
m 16 64 264 <
m 16 64 265 <
m 16 64 266 <
f 64 266 <
m 16 64 267 <
f 64 267 <
f 72 262 <
m 16 72 268 <
f 72 268 <
m 16 72 269 <
f 72 269 <
m 16 72 270 <
f 72 270 <
m 16 72 271 <
f 72 271 <
m 16 72 272 <
f 72 272 <
m 16 72 273 <
m 16 64 274 <
f 64 274 <
f 72 273 <
f 64 265 <
f 64 264 <
f 80 263 <
m 16 72 275 <
f 72 275 <
m 48 96 276 <
m 28 80 277 <
m 16 72 278 <
m 16 64 279 <
m 16 64 280 <
f 64 280 <
m 16 64 281 <
m 32 80 282 <
f 64 281 <
f 64 279 <
f 96 276 <
m 16 64 283 <
f 64 283 <
m 16 64 284 <
f 64 284 <
m 16 64 285 <
f 64 285 <
m 16 64 286 <
f 64 286 <
m 16 64 287 <
f 64 287 <
m 16 64 288 <
m 16 96 289 <
f 96 289 <
f 64 288 <
f 80 282 <
f 72 278 <
f 80 277 <
m 16 72 290 <
f 72 290 <
m 32 80 291 <
m 28 80 292 <
m 16 72 293 <
m 16 64 294 <
m 16 64 295 <
f 64 295 <
m 16 64 296 <
f 64 296 <
f 80 291 <
m 16 80 297 <
f 80 297 <
m 16 80 298 <
f 80 298 <
m 16 80 299 <
f 80 299 <
m 16 80 300 <
f 80 300 <
m 16 80 301 <
f 80 301 <
m 16 80 302 <
m 16 64 303 <
f 64 303 <
f 80 302 <
f 64 294 <
f 72 293 <
f 80 292 <
m 16 72 304 <
f 72 304 <
m 48 96 305 <
m 28 80 306 <
m 16 72 307 <
m 16 64 308 <
m 16 64 309 <
f 64 309 <
m 16 64 310 <
m 48 96 311 <
f 64 310 <
f 64 308 <
f 96 305 <
m 16 64 312 <
f 64 312 <
m 16 64 313 <
f 64 313 <
m 16 64 314 <
f 64 314 <
m 16 64 315 <
f 64 315 <
m 16 64 316 <
f 64 316 <
m 16 64 317 <
m 16 96 318 <
f 96 318 <
f 64 317 <
f 96 311 <
f 72 307 <
f 80 306 <
m 16 72 319 <
f 72 319 <
m 32 80 320 <
m 28 80 321 <
m 16 72 322 <
m 16 64 323 <
m 16 64 324 <
f 64 324 <
m 16 64 325 <
m 32 80 326 <
f 64 325 <
f 64 323 <
f 80 320 <
m 16 64 327 <
f 64 327 <
m 16 64 328 <
f 64 328 <
m 16 64 329 <
f 64 329 <
m 16 64 330 <
f 64 330 <
m 16 64 331 <
f 64 331 <
m 16 64 332 <
m 16 80 333 <
f 80 333 <
f 64 332 <
f 80 326 <
f 72 322 <
f 80 321 <
m 16 72 334 <
f 72 334 <
m 32 80 335 <
m 28 80 336 <
m 16 72 337 <
m 16 64 338 <
m 16 64 339 <
f 64 339 <
m 16 64 340 <
f 64 340 <
f 80 335 <
m 16 80 341 <
f 80 341 <
m 16 80 342 <
f 80 342 <
m 16 80 343 <
f 80 343 <
m 16 80 344 <
f 80 344 <
m 16 80 345 <
f 80 345 <
m 16 80 346 <
m 16 64 347 <
f 64 347 <
f 80 346 <
f 64 338 <
f 72 337 <
f 80 336 <
m 16 72 348 <
f 72 348 <
m 16 72 349 <
m 16 64 350 <
m 16 64 351 <
m 16 64 352 <
m 56 104 353 <
m 16 64 354 <
m 16 64 355 <
m 16 64 356 <
m 16 64 357 <
m 16 64 358 <
m 28 80 359 <
m 16 64 360 <
m 16 64 361 <
f 64 358 <
f 64 357 <
f 64 356 <
f 64 355 <
m 32 80 362 <
m 16 64 363 <
m 32 112 364 <
m 16 64 365 <
m 28 80 366 <
m 32 80 367 <
m 16 64 368 <
f 64 365 <
f 112 364 <
f 64 363 <
f 80 362 <
m 16 64 369 <
m 16 64 370 <
m 16 64 371 <
m 48 96 372 <
f 64 370 <
m 48 96 373 <
f 64 371 <
f 96 372 <
m 32 80 374 <
m 16 64 375 <
m 64 112 376 <
f 96 373 <
f 64 375 <
f 80 374 <
m 16 64 377 <
m 32 80 378 <
f 64 377 <
m 32 80 379 <
f 80 378 <
m 80 128 380 <
f 112 376 <
f 80 379 <
f 64 361 <
f 64 360 <
f 80 359 <
m 32 80 381 <
m 48 96 382 <
f 80 381 <
m 48 96 383 <
f 96 382 <
m 112 176 384 <
f 128 380 <
f 96 383 <
f 64 368 <
f 80 367 <
f 80 366 <
m 1572 1624 385 <
m 20 72 386 <
f 176 384 <
f 64 352 <
f 64 351 <
f 64 350 <
f 72 349 <
f 72 226 <
m 20 72 387 <
f 64 197 <
m 16 64 388 <
f 64 199 <
m 16 64 389 <
f 64 388 <
m 16 64 390 <
f 64 389 <
m 16 64 391 <
f 64 390 <
m 16 64 392 <
f 64 187 <
m 16 64 393 <
f 64 391 <
m 16 64 394 <
f 64 392 <
m 16 64 395 <
f 72 387 <
m 20 72 396 <
f 1624 385 <
f 72 386 <
m 112 160 397 <
m 20 72 398 <
f 160 397 <
f 72 398 <
m 112 160 399 <
f 224 190 <
f 56 225 <
f 56 221 <
f 56 223 <
f 56 215 <
f 56 217 <
f 56 219 <
f 224 201 <
f 64 203 <
f 64 369 <
f 64 354 <
f 104 353 <
f 64 213 <
f 64 212 <
f 64 211 <
f 64 210 <
f 64 209 <
f 64 208 <
f 64 207 <
f 64 206 <
f 64 205 <
f 168 246 <
f 272 202 <
m 21 72 400 <
Fatal exception 28f 72 400 <
m 25 80 401 <
(LoadProhibitedCause):
f 80 401 <
m 69 120 402 <
epc1=0x4021dec7, epc2=0x00000000, epc3=0x00000000, excvaddr=0x007e8000, depc=0x00000000
f 120 402 <

Exception (28):
epc1=0x4021dec7 epc2=0x00000000 epc3=0x00000000 excvaddr=0x007e8000 depc=0x00000000

ctx: sys 
sp: 3ffffd70 end: 3fffffb0 offset: 01a0

>>>stack>>>
3fffff10:  4021bee2 3fff0d20 3fff1358 3ffee820  
3fffff20:  3ffee844 4021baf1 3fff0d20 00000000  
3fffff30:  3ffee844 3ffee820 0000cccc 4021bae0  
3fffff40:  69b13f15 000019dc 00000001 00000011  
3fffff50:  00000000 00000000 4021a8f6 3fff0cd8  
3fffff60:  3fff0b98 3ffe9be6 3fff0b98 4021968b  
3fffff70:  3fff0b98 00000014 40219c36 3fff0cd8  
3fffff80:  3fff0b98 3fffdc80 3fff0c38 00000001  
3fffff90:  402255ef 3fff0cd8 00000000 40205bdb  
3fffffa0:  40000f49 3fffdab0 3fffdab0 40000f49  
<<<stack<<<

 ets Jan  8 2013,rst cause:1, boot mode:(1,7)

 ets Jan  8 2013,rst cause:4, boot mode:(1,7)

wdt reset
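
A trace like the one above can be checked mechanically instead of by eye. The following is a minimal C++ sketch that replays such a log on a desktop machine and reports blocks that were allocated but never freed. The line format (`m <requested> <block size> <id> <` for an allocation, `f <block size> <id> <` for a free) is an assumption inferred from the log itself, not taken from heap4c, and `TraceSummary` and `replayTrace` are hypothetical names.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Result of replaying a heap trace.
struct TraceSummary {
    std::map<int, int> live;  // block id -> block size, allocated but never freed
    int unmatchedFrees = 0;   // frees whose id was not allocated within the trace
};

// ASSUMED line format, inferred from the log above (not from heap4c itself):
//   m <requested> <block size> <id> <
//   f <block size> <id> <
TraceSummary replayTrace(std::istream& in) {
    TraceSummary summary;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string op;
        fields >> op;
        int requested = 0, size = 0, id = 0;
        if (op == "m" && (fields >> requested >> size >> id)) {
            summary.live[id] = size;
        } else if (op == "f" && (fields >> size >> id)) {
            // A free for an id we never saw allocated is counted, not treated as an error.
            if (summary.live.erase(id) == 0) ++summary.unmatchedFrees;
        }
        // Any other line (boot messages, interleaved exception text) is ignored.
    }
    return summary;
}
```

Lines where other output is interleaved with a record (such as the `Fatal exception 28f 72 400 <` line above) are skipped rather than repaired, so a truncated trace will show a few unmatched entries that are not real leaks.
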

me-no-dev commented 8 years ago

So you are testing how the server performs when there are no handlers attached? Do you wait for the WiFi to start? Also, can you please run addr2line when such an exception occurs to find where it happened? The interesting addresses above are: 0x4021dec7 4021bee2 4021baf1

andig commented 8 years ago

No success (new dump, new addresses due to added wifi connect wait):

C:\andi\arduino\hardware\eps8266com\esp8266\tools\xtensa-lx106-elf\bin>xtensa-lx106-elf-addr2line -e C:\Users\xx\AppData\Local\Temp\build662bf6597004a59e628e0f8ceba459f9.tmp\async.ino.elf 0x4021befa 0x4021bb09 0x4021dedf 0x4021baf8
??:?
??:?
??:?
??:?

addr2line needs to be run on the .elf, right?

andig commented 8 years ago

> so you are testing how the server performs when there are no handlers attached?

I'm trying to build a minimal test case for the crashes I've seen. First step was adding heap4c which made crashes almost immediate. I assumed heap4c itself is not the reason so I started to dig in further.

After removing all my code and then all handlers, it still crashed whenever HTTP requests came in. I suspect the async server itself (no better idea with the minimal sketch above).

Crash only happens if server is busy. No requests, no crash.

me-no-dev commented 8 years ago

just do

xtensa-lx106-elf-addr2line C:\Users\xx\AppData\Local\Temp\build662bf6597004a59e628e0f8ceba459f9.tmp\async.ino.elf

Then enter the addresses (they change with the code, that's fine). Just look for the addresses starting with 402 in the dump (some 401 might show up as well).
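
Picking those addresses out of a dump by hand is error-prone, so here is a minimal C++ sketch of the filtering step, meant to run on a desktop machine. It collects every 8-digit hex value beginning with 401 or 402, with or without a `0x` prefix, in order of first appearance. The function name `codeAddresses` is hypothetical, and the 401/402 rule is only the heuristic suggested above, not a definition of the ESP8266 code address range.

```cpp
#include <algorithm>
#include <regex>
#include <string>
#include <vector>

// Returns the candidate code addresses found in an exception dump,
// normalised to a "0x" prefix, without duplicates, in order of appearance.
std::vector<std::string> codeAddresses(const std::string& dump) {
    // 8 hex digits starting with 401 or 402, optionally prefixed with 0x.
    static const std::regex pattern(R"(\b(?:0x)?(40[12][0-9a-fA-F]{5})\b)");
    std::vector<std::string> found;
    for (std::sregex_iterator it(dump.begin(), dump.end(), pattern), end;
         it != end; ++it) {
        std::string address = "0x" + (*it)[1].str();
        if (std::find(found.begin(), found.end(), address) == found.end()) {
            found.push_back(address);
        }
    }
    return found;
}
```

The resulting list can then be passed to xtensa-lx106-elf-addr2line together with `-e` and the path of the .elf file. Stack values that merely happen to look like code addresses will also be picked up, so not every hit is a real return address.
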

me-no-dev commented 8 years ago

the server should return 500 on request. How often does your curl command send requests?

me-no-dev commented 8 years ago

does it wait for them to finish before starting a new one?

andig commented 8 years ago

> just do

error: xtensa-lx106-elf-addr2line: 'a.out': No such file. I need the -e

> then enter the addresses (they change with the code that's fine) just look for the addresses starting with 402 in the dump (some 401 might show as well)

Same result. Just ??.

> the server should return 500 on request. How often does your curl command send requests? Does it wait for them to finish before starting a new one?

Even a single request is enough to crash if heap4c is compiled in.

me-no-dev commented 8 years ago

and what if not using heap4c? and you were correct. I actually execute xtensa-lx106-elf-addr2line -aipfC -e [elf]

andig commented 8 years ago

> and what if not using heap4c?

No crash, runs for ages. But heap4c does heap poisoning, as I understand it.

Unfortunately I've not yet managed to get gdb running on Windows: there is no bundled version in the SDK :(

andig commented 8 years ago

I've gone back to an old version of my sketch with sync server + heap4c. No crashes.

andig commented 8 years ago

For what it's worth: naked AsyncServer with heap4c doesn't crash. AsyncWebServer does.

me-no-dev commented 8 years ago

Maybe stop printing from heap4? And you need to understand that I cannot support heap4 or alter its code, as we are not supposed to use it at all.

me-no-dev commented 8 years ago

You are printing from an interrupt using HardwareSerial, which itself uses interrupts. Many things can go wrong; the other one you print from the loop. And why test a naked server? Who will ever use a web server without a handler? There is a TCP server for that.
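
One common way around printing from an interrupt is to record events in a fixed-size buffer and print them later from loop(). The following is a minimal, hardware-independent C++ sketch of that idea. `HeapEvent` and `EventRing` are hypothetical names, nothing here is part of heap4c or the ESP8266 core, and it assumes a single producer (the hook) and a single consumer (the loop).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// One recorded allocator event: 'm' for malloc, 'f' for free.
struct HeapEvent {
    char op;
    uint16_t size;
    uint16_t id;
};

// Fixed-size ring buffer; holds up to Capacity - 1 events.
template <std::size_t Capacity>
class EventRing {
    static_assert(Capacity >= 2, "ring needs room for at least one event");

public:
    // Intended for the allocation hook: no printing, no allocation.
    bool push(const HeapEvent& event) {
        std::size_t next = (head_ + 1) % Capacity;
        if (next == tail_) {  // buffer full: count the loss instead of blocking
            ++dropped_;
            return false;
        }
        events_[head_] = event;
        head_ = next;
        return true;
    }

    // Intended for loop(): fetch the oldest event, if any, and print it there.
    bool pop(HeapEvent& out) {
        if (tail_ == head_) return false;  // buffer empty
        out = events_[tail_];
        tail_ = (tail_ + 1) % Capacity;
        return true;
    }

    std::size_t dropped() const { return dropped_; }

private:
    std::array<HeapEvent, Capacity> events_{};
    volatile std::size_t head_ = 0;  // next slot to write
    volatile std::size_t tail_ = 0;  // next slot to read
    std::size_t dropped_ = 0;
};
```

In a sketch, the hook would call `push()` and loop() would drain the buffer with `pop()` and print each event over Serial. Dropped events are counted rather than printed, so a burst of allocations costs some trace detail instead of blocking inside the hook.
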

andig commented 8 years ago

> And why test naked server? Who will ever use a Web server without a handler? There is TCP server for that

Nobody. Trying to limit the number of variables. Don't have better ideas without gdb :(

> you are printing from interrupt using HardwareSerial that uses interrupts. Many shit can go wrong the other one you print from the loop

OK. I could try getting rid of the printing and leaving the lib in.

andig commented 8 years ago

btw, this would need an update for the latest git: https://github.com/me-no-dev/ESPAsyncWebServer/blob/8fed4c42f34a4693b6529f051fbb6840d886a1b4/src/WebResponses.cpp#L362

andig commented 8 years ago

> ok. I could test get rid of printing and leave the lib.

Still crashes :(

me-no-dev commented 8 years ago

I'm trying to write a tool to decode the dumps for you (through the IDE tools). Maybe we'll get a better idea of what and why it happens. But since it does not crash with the regular memory management, I doubt it's the server's fault really... I checked the code and it all goes as planned even if no handlers are attached. As I said, will return 500 :) I updated the git for the changes in the core ;)

me-no-dev commented 8 years ago

so I tested your sketch with umm_malloc and have no issues either :) if you want to give it a shot, my git fork of the ESP repo is running it

andig commented 8 years ago

> so I tested your sketch with umm_malloc and have no issues either :)

So it's working with heap4 and working with umm_malloc, both with traffic, right?

> if you want to give it a shot, my git fork of the ESP repo is running it

the 31b or the arduino? main branch?

I'm wondering what other differences there could be. Maybe the toolchain on Windows has problems?

me-no-dev commented 8 years ago

my ESP8266 Arduino branch is running on umm_malloc. I did not try heap4 to be honest. Had already deleted the contents of the file :)

andig commented 8 years ago

Ok. Crash with heap4 is this:

0x4021dfe7: tcp_output at ??:?
0x4021dfe7: tcp_output at ??:?
0x4021c002: tcp_input at ??:?
0x4021bc11: tcp_input at ??:?
0x4021bc00: tcp_input at ??:?
0x4021aa16: ip_input at ??:?
0x402197ab: etharp_find_addr at ??:?
0x40219d56: ethernet_input at ??:?
0x4022570f: ets_snprintf at ??:?
0x40205cf3: loop_task at C:\andi\arduino\hardware\eps8266com\esp8266\cores\esp8266/core_esp8266_main.cpp:43
0x40000f49: ?? ??:0
0x40000f49: ?? ??:0

As it does not crash with your umm* branch, I'm closing this as invalid. Thanks to the tool I can go back to the original sketch now and see how that behaves...