ttrftech / NanoVNA

Very Tiny Palmtop Vector Network Analyzer

Fix crash on trace command, reduce flash usage by 8k, up to 2x faster screen render #121

Closed DiSlord closed 4 years ago

DiSlord commented 4 years ago

Fix: when the firmware is used as a shell, some commands hit the stack limit (for example, the "trace 0 x" command). Decrease interrupt stack size.

Use the __ROR instruction in flash.c for the checksum rotate (also remove tabs)

Fix background erase for the frequency string in plot.c

main.c: Implement a getStringIndex function to parse string arguments. Usage messages now show correct information about the accepted arguments, and string definitions are easier to use.

Example: to check whether the string "on" is in the list of available arguments "load|open|short|thru|isoln|done|on|off|reset|data|in", the call

  getStringIndex("on", "load|open|short|thru|isoln|done|on|off|reset|data|in")

returns 6. If the string is not found, it returns -1. There is no need for chains of if (strcmp() == ...) else .... This saves some code size and is less error-prone.
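The lookup described above can be sketched as follows. This is only an illustration of the idea, not the code from the PR: the name `get_str_index_sketch` and its signature are assumptions, and the real getStringIndex in main.c may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: return the zero-based position of `needle` in a
 * '|'-separated `list`, or -1 if it is not present. */
static int get_str_index_sketch(const char *needle, const char *list) {
  size_t needle_len = strlen(needle);
  const char *item = list;
  int index = 0;
  for (;;) {
    /* Find the end of the current item: next '|' or end of string. */
    const char *end = strchr(item, '|');
    size_t item_len = end ? (size_t)(end - item) : strlen(item);
    /* Require an exact match, so "o" does not match "on" or "open". */
    if (item_len == needle_len && memcmp(item, needle, needle_len) == 0)
      return index;
    if (end == NULL)
      return -1;          /* last item checked, nothing matched */
    item = end + 1;       /* step past the separator */
    index++;
  }
}
```

A command handler can then switch on the returned index instead of chaining strcmp() calls, and reuse the same list string in its usage message.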

DiSlord commented 4 years ago

Profiled the screen render and found some interesting things. Using memset(spi_buffer, DEFAULT_BG_COLOR, (h*CELLWIDTH)*sizeof(uint16_t)); dramatically decreases render speed; possibly it fills the buffer 8 bits at a time, so it is slow. Using

  // Each pass writes four 32-bit words (eight 16-bit pixels),
  // so count must be the number of pixels to fill divided by 8.
  uint32_t *p = (uint32_t *)spi_buffer;
  while (count--) {
    p[0] = DEFAULT_BG_COLOR|(DEFAULT_BG_COLOR<<16);
    p[1] = DEFAULT_BG_COLOR|(DEFAULT_BG_COLOR<<16);
    p[2] = DEFAULT_BG_COLOR|(DEFAULT_BG_COLOR<<16);
    p[3] = DEFAULT_BG_COLOR|(DEFAULT_BG_COLOR<<16);
    p+=4;
  }

gives about a 10x speedup.
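A self-contained sketch of the two fill strategies, for comparison. The buffer size, the union type and the function names are made up for this example and are not taken from plot.c. One detail is standard C: memset converts its value argument to unsigned char, so it only reproduces a 16-bit color whose two bytes are equal. The speed difference itself depends on the C library and the target and is not measured here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_PIXELS 64   /* hypothetical buffer size for this sketch */

/* A union gives a well-defined 32-bit view of the 16-bit pixel buffer. */
typedef union {
  uint16_t pixels[BUF_PIXELS];
  uint32_t words[BUF_PIXELS / 2];
} pixel_buffer_t;

/* Byte-wise fill: only the low byte of `color` ends up in the buffer. */
static void fill_bytes(pixel_buffer_t *buf, size_t pixels, uint16_t color) {
  memset(buf->pixels, color, pixels * sizeof(uint16_t));
}

/* Word-wise fill: two pixels per 32-bit store, four stores per pass.
 * `pixels` must not exceed BUF_PIXELS. */
static void fill_words(pixel_buffer_t *buf, size_t pixels, uint16_t color) {
  uint32_t pair = (uint32_t)color | ((uint32_t)color << 16);
  uint32_t *p = buf->words;
  size_t count = pixels / 8;          /* eight pixels per pass */
  while (count--) {
    p[0] = pair;
    p[1] = pair;
    p[2] = pair;
    p[3] = pair;
    p += 4;
  }
  /* Fill whatever is left when `pixels` is not a multiple of 8. */
  for (size_t i = pixels - pixels % 8; i < pixels; i++)
    buf->pixels[i] = color;
}
```

With a color such as 0x1234, fill_words stores 0x1234 in every pixel, while fill_bytes stores 0x3434.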

Drawing the polar and Smith grids is very slow (I don't know how to speed it up other than by using bitmaps, which would need about 5-8k of flash and file preparation).

In most cases rendering the marked cells needs 260-400 system ticks (500-700 before) on a rectangular plot; a full screen render needs 760 (1100-1300 before).

A sweep needs 8400 system ticks.

DiSlord commented 4 years ago

Profiled the sweep function (in system ticks):

  for (i = 0; i < sweep_points; i++) {         // whole loop: 8365
    int delay = set_frequency(frequencies[i]); // 1560
    tlv320aic3204_select(0);                   // 60 CH0:REFLECT

    wait_dsp(delay);                           // 3270
    // calculate reflection coefficient
    (*sample_func)(measured[0][i]);            // 60

    tlv320aic3204_select(1);                   // 60 CH1:TRANSMISSION
    wait_dsp(DELAY_CHANNEL_CHANGE);            // 2700
    // calculate transmission coefficient
    (*sample_func)(measured[1][i]);            // 60
                                               // ======== 170 ===========
    if (cal_status & CALSTAT_APPLY)
      apply_error_term_at(i);
    if (electrical_delay != 0)
      apply_edelay_at(i);

    // back to toplevel to handle ui operation
    if (operation_requested && break_on_operation)
      return false;
  }

  // Summary (system ticks):
  //   set_frequency(frequencies[i])   1560
  //   wait_dsp()                      both calls together use about 6000 of the 8400 total
  //   screen render                   200-1500

As a result: about 15% of the time goes to setting the frequency, 60% of the time the processor just waits for data, 10% is rendering the data, and 10% is other calculations.

Calculating the coefficients (apply_error_term_at(i); apply_edelay_at(i);) needs just 170 ticks (very fast).

PS: It is possible to skip data collection for a channel that is not used in any trace (it would still need to be enabled for commands and calibration); this gives a 30% speedup on the sweep. In most cases it is better to write more compact code (it does not give a huge speedup). A visibly faster screen update also looks better.
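The 30% figure is consistent with the profile numbers. A rough check, assuming the only steps skipped are the CH1 select, the channel-change wait and the CH1 sample (the constant and function names are made up for this calculation):

```c
/* Tick counts copied from the sweep profile in this thread. */
enum {
  TICKS_SWEEP_LOOP = 8365,          /* whole sweep loop */
  TICKS_SELECT_CH1 = 60,            /* tlv320aic3204_select(1) */
  TICKS_WAIT_CHANNEL_CHANGE = 2700, /* wait_dsp(DELAY_CHANNEL_CHANGE) */
  TICKS_SAMPLE_CH1 = 60             /* (*sample_func)(measured[1][i]) */
};

/* Estimated saving, in whole percent of the sweep loop, when the CH1
 * steps are skipped. Integer division rounds down. */
static int ch1_skip_saving_percent(void) {
  int saved = TICKS_SELECT_CH1 + TICKS_WAIT_CHANNEL_CHANGE + TICKS_SAMPLE_CH1;
  return saved * 100 / TICKS_SWEEP_LOOP;
}
```

That is 2820 of 8365 ticks, or about a third of the sweep loop.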

A faster processor would not give much of a speedup :)

DiSlord commented 4 years ago

Implemented a stack usage check in the "threads" command; free stack space is now shown in the table as "stk free", in hex. Checked stack usage of the sweep and main threads (all seems OK, but added 64 bytes to sweep).

Stack usage is now easier to check.
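One common way to get a "free stack" figure is to pre-fill the stack area with a known byte and later count how many bytes still hold it. The sketch below shows that technique on a plain array. It is an assumption that the "threads" command works this way, and the fill value 0x55 and the function names are chosen for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0x55  /* assumed fill pattern for this sketch */

/* Paint the whole stack area before the thread starts running. */
static void stack_paint(uint8_t *base, size_t size) {
  memset(base, STACK_FILL, size);
}

/* Count bytes, starting at the lowest address, that still hold the
 * pattern. On a descending stack (as on Cortex-M) the untouched part
 * is at the low end, so this count is the free space. */
static size_t stack_free_bytes(const uint8_t *base, size_t size) {
  size_t n = 0;
  while (n < size && base[n] == STACK_FILL)
    n++;
  return n;
}
```

The result is a high-water mark: it reports the least free space seen so far, not the current stack depth, and it can overestimate if the thread happens to write the fill value itself.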

DiSlord commented 4 years ago

Tried USE_LTO = yes. With it the build comes out at:

   text    data     bss     dec     hex filename
  79164    5084   11488   95736   175f8 build/ch.elf

With this option the firmware is not stable and uses more stack. A possible cause is that non-volatile global variables get optimized into registers and stop working (for example, the operation_requested variable is changed in an interrupt, but the main sweep loop keeps it in a register and so does not receive commands).
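If that diagnosis is right, the usual remedy is to declare any flag shared with an interrupt handler as volatile, so the compiler has to reload it on every check. The sketch below imitates the situation on a hosted system, with a signal handler standing in for the interrupt. The handler, the loop function and the local operation_requested variable are illustrative stand-ins, not the firmware's declarations.

```c
#include <signal.h>
#include <stdbool.h>

/* Stand-in for the firmware flag. `volatile` forces a fresh read on
 * every access; sig_atomic_t makes it safe to write from a handler. */
static volatile sig_atomic_t operation_requested = 0;

/* Stand-in for the interrupt that requests a UI operation. */
static void on_request(int signum) {
  (void)signum;
  operation_requested = 1;
}

/* Simplified sweep loop: returns false when interrupted, true when all
 * points were processed. Pass interrupt_at = -1 for no interruption. */
static bool sweep_demo(int points, int interrupt_at) {
  for (int i = 0; i < points; i++) {
    if (i == interrupt_at)
      raise(SIGINT);            /* simulate the asynchronous event */
    if (operation_requested)    /* re-read from memory each pass */
      return false;
  }
  return true;
}
```

Install the handler with signal(SIGINT, on_request) before calling sweep_demo. Without volatile, an optimizer that can see the whole loop is allowed to read the flag once and never notice the change.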

DiSlord commented 4 years ago

Found a good way to reduce flash usage: the default cal_data and _frequencies are not needed, but they are stored in the data section (and use a huge amount of flash space).

Now, when caldata_recall fails, the default settings are loaded.