Embench tflm dse updates

This PR makes three main changes: (1) fixes a legacy typo, INLCUDE to INCLUDE, across the repo's files for the MLPerf Tiny workloads; (2) updates how the Embench workloads are profiled so that profiling is correct and consistent with the TFLM workloads; (3) adds the ability to change the workload in the DSE framework, and to verify that the Embench and TFLM benchmarks still pass during design space exploration.

All changes were tested as follows: (1) a clean build to confirm that the INCLUDE typo fixes are picked up, followed by the golden tests; (2) profiling of the Embench workloads to confirm it still works; (3) running python dse_framework.py with the workload (line 195 of the new dse_framework.py) set to micro_speech and to primecount. The output below is from the micro_speech run. It shows that the DSE framework can take a different workload and run to completion (previously only pdti8 was supported, as the default):

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
Timeout
Executing booted program at 0x40000000

--============= Liftoff! ===============--
Hello, World!

CFU Playground
==============
 1: TfLM Models menu
 2: Functional CFU Tests
 3: Project menu
 4: Performance Counter Tests
 5: TFLite Unit Tests
 6: Benchmarks
 7: Util Tests
 8: Embench IoT
 t: trace (only works in simulation)
 Q: Exit (only works in simulation)
main> 1

Running TfLM Models menu

TfLM Models
===========
 1: Micro Speech
 x: eXit to previous menu
models> 1

Running Micro Speech
Error_reporter OK!
Input: 1960 bytes, 2 dims: 1 1960

Tests for micro_speech model
============================
 1: Run with zeros input
 2: Run with "no" input
 3: Run with "yes" input
 g: Run golden tests (check for expected outputs)
 x: eXit to previous menu
micro_speech> g

Running Run golden tests (check for expected outputs)
Zeroed 1960 bytes at 0x400a08f0
Running micro_speech
....
"Event","Tag","Ticks"
0,RESHAPE,19
1,DEPTHWISE_CONV_2D,13504
2,FULLY_CONNECTED,194
3,SOFTMAX,10
Perf counters not enabled.
    14M (     14061082 )  cycles total
Copied 1960 bytes at 0x400a08f0
Running micro_speech
....
"Event","Tag","Ticks"
0,RESHAPE,20
1,DEPTHWISE_CONV_2D,13505
2,FULLY_CONNECTED,195
3,SOFTMAX,9
Perf counters not enabled.
    14M (     14061744 )  cycles total
Copied 1960 bytes at 0x400a08f0
Running micro_speech
....
"Event","Tag","Ticks"
0,RESHAPE,19
1,DEPTHWISE_CONV_2D,13505
2,FULLY_CONNECTED,194
3,SOFTMAX,9
Perf counters not enabled.
    14M (     14061416 )  cycles total
OK   Golden tests passed
---

Tests for micro_speech model
============================
 1: Run with zeros input
 2: Run with "no" input
 3: Run with "yes" input
 g: Run golden tests (check for expected outputs)
 x: eXit to previous menu
micro_speech> x
---

TfLM Models
===========
 1: Micro Speech
 x: eXit to previous menu
models> x
---

CFU Playground
==============
 1: TfLM Models menu
 2: Functional CFU Tests
 3: Project menu
 4: Performance Counter Tests
 5: TFLite Unit Tests
 6: Benchmarks
 7: Util Tests
 8: Embench IoT
 t: trace (only works in simulation)
 Q: Exit (only works in simulation)
main> Q

Running Exit (only works in simulation)
Goodbye!
NUMBER OF CYCLES: 14061744.0
NUMBER OF CELLS:  22146

EXITING DSE...
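
As an illustration of how a log like the one above could be checked programmatically, here is a minimal sketch. It assumes the console output has been captured as a string; the helper names `parse_cycle_totals` and `golden_tests_passed` are hypothetical and are not taken from dse_framework.py. The pass marker is the one printed in the micro_speech run above; other workloads may print something different.

```python
import re
from typing import List

# Matches lines such as "    14M (     14061082 )  cycles total" from the log above.
CYCLES_RE = re.compile(r"\(\s*(\d+)\s*\)\s+cycles total")


def parse_cycle_totals(log_text: str) -> List[int]:
    """Return every per-run cycle total found in the captured console log."""
    return [int(m.group(1)) for m in CYCLES_RE.finditer(log_text)]


def golden_tests_passed(log_text: str, marker: str = "Golden tests passed") -> bool:
    """Assumption: a passing run prints the marker seen in the micro_speech log."""
    return marker in log_text


# Excerpt of the micro_speech output shown above.
sample = """\
    14M (     14061082 )  cycles total
    14M (     14061744 )  cycles total
    14M (     14061416 )  cycles total
OK   Golden tests passed
"""

print(parse_cycle_totals(sample))   # [14061082, 14061744, 14061416]
print(golden_tests_passed(sample))  # True
```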

To test the workload verification, I changed the primecount workload to return the wrong value. The output below shows that the DSE framework detects that the benchmark no longer passes:

```
Simulation completed but program test failed! Modifications need to be made to CFU HW or SW.
NUMBER OF CYCLES: inf
NUMBER OF CELLS: inf

EXITING DSE...
```


We return infinity so that this CPU + CFU configuration is treated as invalid (maximally bad) without stopping the DSE search.
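
The idea of scoring a failed configuration as infinity can be sketched as follows. This is only an illustration under stated assumptions: `score_config` and `best_config` are hypothetical helpers, not functions from dse_framework.py, and the real search may consume the returned values differently.

```python
import math
from typing import Dict, Optional, Tuple


def score_config(passed: bool, cycles: float, cells: float) -> Tuple[float, float]:
    """Hypothetical helper: map a failed verification to (inf, inf)."""
    if not passed:
        # The config is kept in the results, but can never look better
        # than a config that passed its benchmark checks.
        return (math.inf, math.inf)
    return (float(cycles), float(cells))


def best_config(results: Dict[str, Tuple[float, float]]) -> Optional[str]:
    """Pick the config with the fewest cycles, skipping ones scored as inf."""
    valid = {name: r for name, r in results.items() if math.isfinite(r[0])}
    if not valid:
        return None
    return min(valid, key=lambda name: valid[name][0])


results = {
    # Cycle and cell counts from the micro_speech run shown earlier.
    "config_a": score_config(True, 14061744, 22146),
    # A config whose benchmark returned the wrong value.
    "config_b": score_config(False, 0, 0),
}

print(results["config_b"])   # (inf, inf)
print(best_config(results))  # config_a
```

Because infinity compares greater than any finite number, a minimizing search can keep evaluating other configurations without special-casing the failure.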

You can run these tests yourself if you would like to confirm!

google / CFU-Playground

Embench tflm dse updates #802