Doodle3D / doodle3d-client

The Doodle3D web interface that people can access on computers or tablets to draw doodles and control the Doodle3D WiFi box.
www.doodle3d.com
GNU General Public License v2.0
3 stars 8 forks source link

When printing the Doodle3D WiFi-Box sometimes crashes #280

Open casperlamboo opened 9 years ago

casperlamboo commented 9 years ago

When printing the Doodle3D WiFi-Box sometimes crashes. I'm printing with a custom app to send gcode to the WiFi-box. The code is build so there is no more than a maximum of 4096 lines of code buffered at the WiFi-box at once. And there are no simultaneous API calls; the state doesn't update while sending gcode. When printing I get these symptoms

There seems to no correlation between the size of the print and the frequency of the crashes. The Doodle3D box only seems to crash while printing, not while updating state.

peteruithoven commented 8 years ago

Probably partially related to: https://github.com/Doodle3D/doodle3d-firmware/issues/11

companje commented 8 years ago

It seems print3d is crashing. The last word in the logfile (print3d-ttyACM0.log) is Killed.

30-10 16:43:18  [MLD] append(): added 990 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:18  [CDH] received append gcode command with argument length 299
30-10 16:43:18  [MLD] append(): added 299 bytes of gcode (1 chunks) in 1 ms
30-10 16:43:19  [CDH] received get state command
30-10 16:43:19  [CDH] received get progress command
30-10 16:43:19  [CDH] received append gcode command with argument length 990
30-10 16:43:19  [MLD] append(): added 990 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 990
30-10 16:43:19  [MLD] append(): added 990 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 1009
30-10 16:43:19  [MLD] append(): added 1009 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 983
30-10 16:43:19  [MLD] append(): added 983 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 1007
30-10 16:43:19  [MLD] append(): added 1007 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 985
30-10 16:43:19  [MLD] append(): added 985 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 985
30-10 16:43:19  [MLD] append(): added 985 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 994
30-10 16:43:19  [MLD] append(): added 994 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 998
30-10 16:43:19  [MLD] append(): added 998 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 1006
30-10 16:43:19  [MLD] append(): added 1006 bytes of gcode (1 chunks) in 3 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 996
30-10 16:43:19  [MLD] append(): added 996 bytes of gcode (1 chunks) in 4 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 996
30-10 16:43:19  [MLD] append(): added 996 bytes of gcode (1 chunks) in 4 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 1009
30-10 16:43:19  [MLD] append(): added 1009 bytes of gcode (1 chunks) in 16 ms
30-10 16:43:19  [CDH] received append gcode command with argument length 994
30-10 16:43:19  [MLD] append(): added 994 bytes of gcode (1 chunks) in 4 ms
Killed
companje commented 8 years ago

Logread gives some useful information:

Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.220000] print3d invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.220000] CPU: 0 PID: 1298 Comm: print3d Not tainted 3.10.49 #4
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.230000] Stack : 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.230000]     00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.230000]     00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.230000]     00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.230000]     00000000 00000000 00000000 00000000 00000000 00000000 00000000 80ebdb20
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.230000]     ...
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.260000] Call Trace:[<80203010>] 0x80203010
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.270000] [<80203010>] 0x80203010
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.270000] [<80255454>] 0x80255454
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.270000] [<802056e8>] 0x802056e8
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.280000] [<801a2d70>] 0x801a2d70
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.280000] [<801a3e70>] 0x801a3e70
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.280000] [<80081b44>] 0x80081b44
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.290000] [<80120140>] 0x80120140
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.290000] [<80211030>] 0x80211030
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.300000] [<8008a100>] 0x8008a100
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.300000] [<80138ff0>] 0x80138ff0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.300000] [<800dcd10>] 0x800dcd10
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.310000] [<802434e8>] 0x802434e8
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.310000] [<80138dd0>] 0x80138dd0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.310000] [<801041b8>] 0x801041b8
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.320000] [<8008ce84>] 0x8008ce84
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.320000] [<802434e8>] 0x802434e8
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.320000] [<801bd710>] 0x801bd710
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.330000] [<801d1df8>] 0x801d1df8
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.330000] [<80067ed0>] 0x80067ed0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.330000] [<80077df8>] 0x80077df8
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.340000] [<80073168>] 0x80073168
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.340000] [<80060820>] 0x80060820
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.340000]
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.350000] Mem-Info:
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.350000] Normal per-cpu:
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.350000] CPU    0: hi:    0, btch:   1 usd:   0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.360000] active_anon:4157 inactive_anon:323 isolated_anon:0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.360000]  active_file:31 inactive_file:58 isolated_file:0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.360000]  unevictable:0 dirty:0 writeback:0 unstable:0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.360000]  free:170 slab_reclaimable:252 slab_unreclaimable:861
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.360000]  mapped:1 shmem:462 pagetables:74 bounce:0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.360000]  free_cma:0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.390000] Normal free:680kB min:680kB low:848kB high:1020kB active_anon:16628kB inactive_anon:1292kB active_file:124kB inactive_file:232kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:32768kB managed:29172kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:1848kB slab_reclaimable:1008kB slab_unreclaimable:3444kB kernel_stack:320kB pagetables:296kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:548 all_unreclaimable? Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.430000] lowmem_reserve[]: 0 0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.430000] Normal: 12*4kB (R) 3*8kB (MR) 8*16kB (MR) 1*32kB (R) 1*64kB (R) 1*128kB (R) 1*256kB (R) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 680kB
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.440000] 551 total pagecache pages
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.450000] 0 pages in swap cache
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.450000] Swap cache stats: add 0, delete 0, find 0/0
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.450000] Free swap  = 0kB
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.460000] Total swap = 0kB
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.460000] 8192 pages RAM
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.460000] 837 pages reserved
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.470000] 263903 pages shared
Fri Oct 30 16:43:37 2015 kern.warn kernel: [  488.470000] 6807 pages non-shared
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.470000] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.480000] [  372]     0   372      179       15       4        0             0 ubusd
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.490000] [  373]     0   373      152       12       3        0             0 askfirst
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.500000] [  629]     0   629      230       48       4        0             0 logd
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.500000] [  662]     0   662      345       53       4        0             0 netifd
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.510000] [  716]     0   716      247       17       5        0             0 dropbear
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.520000] [  747]     0   747      579      139       4        0             0 uhttpd
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.530000] [  900]     0   900      327       27       3        0             0 wpa_supplicant
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.540000] [  967]     0   967      287       16       4        0             0 udhcpc
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.550000] [ 1187]     0  1187      286       14       4        0             0 signin.sh
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.550000] [ 1189]     0  1189      285       12       3        0             0 sleep
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.560000] [ 1209]     0  1209      288       17       3        0             0 ntpd
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.570000] [ 1246] 65534  1246      189       23       5        0             0 dnsmasq
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.580000] [ 1252]     0  1252      286       14       3        0             0 print3d-manager
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.590000] [ 1258]     0  1258      285       13       3        0             0 inotifyd
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.590000] [ 1282]     0  1282      287       15       4        0             0 print3d-runner.
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.600000] [ 1298]     0  1298     3171     2875       6        0             0 print3d
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.610000] [ 1827]     0  1827     1022      565       4        0             0 uhttpd
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.620000] [ 1836]     0  1836      581      142       3        0             0 uhttpd
Fri Oct 30 16:43:37 2015 kern.err kernel: [  488.630000] Out of memory: Kill process 1298 (print3d) score 380 or sacrifice child
Fri Oct 30 16:43:37 2015 kern.err kernel: [  488.640000] Killed process 1298 (print3d) total-vm:12684kB, anon-rss:11500kB, file-rss:0kB
companje commented 8 years ago

print3d uses 3171 (MB?) of total_vm and uhttpd uses 1022.

Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.600000] [ 1298]     0  1298     3171     2875       6        0             0 print3d
Fri Oct 30 16:43:37 2015 kern.info kernel: [  488.610000] [ 1827]     0  1827     1022      565       4        0             0 uhttpd

We might want to check if lowering the GCODE_BUFFER_MAX_SIZE_KB in print3d helps: https://github.com/Doodle3D/print3d/blob/master/src/drivers/GCodeBuffer.cpp#L21

Hmm... there's an interesting comment in the code: const uint32_t GCodeBuffer::MAX_BUFFER_SIZE = 1024 * GCODE_BUFFER_MAX_SIZE_KB; //set to 0 to disable (TODO: actually, this isn used at all currently)

We need to check the source of print3d to see if this has been implemented.

companje commented 8 years ago

@casperlamboo Did print3d also crash when you compiled for Mac OSX?

peteruithoven commented 8 years ago

I would like to figure out what the available RAM was when Print3d was killed. Whether print3d specifically was using more memory than allowed per process or the device ran completely out of memory.

woutgg commented 8 years ago

As for the maximum buffer size, that has been implemented in the print3d branch mb-gcodebuffer. This implements several consistency checks and max buffer size etc in GCodeBuffer. It also uses the buffer class for makerbot gcode handling, solving makerbot-specific issues. See branch commits for details - I could make a detailed summary if that is helpful.

I thought this had been integrated back then, but apparently not. There should also be a client branch which is able to deal with the extended api (handling buffer-full messages etc), probably unify-logging+chunk-checking.

peteruithoven commented 8 years ago

Also see: https://github.com/Doodle3D/print3d/issues/21

peteruithoven commented 8 years ago

To limit the logging we've made doodle3d-firmware and print3d log far less by default. https://github.com/Doodle3D/doodle3d-firmware/commit/f51e63e06a536000f6566d170cac794970708618 https://github.com/Doodle3D/print3d/commit/0280dd5c36eec7bf5a7cf3872c1b09df388a20fc This should at least prevent the type of crashed mentioned above.