tl;dr / abstract. A system for locating a cat in a building using Bluetooth Low Energy (BLE) beacons attached to the animal, a set of cheap ESP32 detectors, and Machine Learning models.
This is an overview of a pipeline for creating an in-house cat locator. In fact, it can be applied to any animal (including humans) or object, and to any building. The system works as follows:
:bulb: The challenge here is to use significantly fewer detectors than there are rooms and let ML do the rest.
:bulb: This is not a ready-to-use set of programs. As each situation is different (e.g., hardware setup, house size, number of rooms etc.), please treat this repo as a framework to build and customize your own system.
Optional:
Nothing to configure or code here :-)
I provide simple C/Arduino code for the detector. Of course it can do much more (my ESPs also measure temperature, humidity and other parameters):
Many things can be tuned and optimized (e.g., an array of MAC addresses instead of single variables, reading MAC addresses from the server etc.).
(this solution was not thoroughly tested!)
A Raspberry Pi with BLE capability can be included in the set of detectors. The commands here are:
btmon &
hcitool lescan --duplicates | grep MAC
followed by sending the readings to the database.
We need a server with a database. I use InfluxDB, but any will do (MySQL or even text files!)
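As a rough sketch of what a single detector reading could look like on its way into InfluxDB, the helper below formats one RSSI value in InfluxDB line protocol (`measurement,tags fields timestamp`). The measurement name `ble_rssi`, the tag names `detector` and `beacon`, and the sample MAC address are assumptions for illustration, not the schema used in this repo.

```python
def rssi_to_line(detector: str, beacon_mac: str, rssi: float, ts_ns: int) -> str:
    """Format one reading as InfluxDB line protocol.

    Hypothetical schema: measurement "ble_rssi", tags "detector" and "beacon",
    one float field "rssi", timestamp in nanoseconds since the Unix epoch.
    """
    return f"ble_rssi,detector={detector},beacon={beacon_mac} rssi={float(rssi)} {ts_ns}"


# Example reading: made-up beacon address, timestamp 2021-01-03 15:32 UTC.
line = rssi_to_line("esp32-office", "AA:BB:CC:DD:EE:FF", -71.5, 1609687920000000000)
print(line)
```

Tagging each point with the detector name keeps the later step of building one column per detector straightforward, whichever database is used.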
I used Python 3.8.
A sample conda environment can be imported by issuing (this is a one-time procedure):
conda env create -f cat-localizer.yml
and activated by:
conda activate cat-localizer
Training of ML models can be done on any computer (e.g., a laptop), and the models can then be transferred to the main server. Remember to keep the same module versions on both machines (use the conda environment). See the next section on model training.
When the models are trained and ready (see the next section), we use a Python program to fetch the latest BLE signal strength data and, based on this, predict the location of the cat. Please see the section Detecting for more info and code samples.
Here we collect data for training the ML algorithms. This means we have to collect signal strength data from all sensors WITH the associated location of the beacon. For example, assuming we have four ESP32 sensors, we need to create a table like this one:
measurement id | time | target | esp32-attic | esp32-room_k | esp32-office | esp32-doors |
---|---|---|---|---|---|---|
0 | 2021-01-03 15:32 | office | -71.0 | -110.0 | -71.5 | -83.0 |
1 | 2021-01-03 15:33 | office | -73.0 | -110.0 | -82.5 | -82.5 |
2 | 2021-01-03 15:34 | office | -81.0 | -110.0 | -80.0 | -87.5 |
3 | 2021-01-03 15:35 | office | -75.0 | -110.0 | -86.5 | -86.5 |
4 | 2021-01-03 15:36 | office | -72.5 | -110.0 | -78.5 | -81.0 |
... | ... | ... | ... | ... | ... | ... |
236 | 2021-01-03 21:56 | room_k | -100.5 | -67.5 | -110.0 | -91.5 |
237 | 2021-01-03 21:57 | room_k | -86.0 | -60.0 | -110.0 | -91.5 |
238 | 2021-01-03 21:58 | room_k | -88.5 | -93.5 | -110.0 | -91.5 |
239 | 2021-01-03 21:59 | room_k | -86.0 | -75.0 | -110.0 | -91.5 |
240 | 2021-01-03 22:00 | room_k | -87.0 | -72.0 | -110.0 | -84.5 |
where each row (measurement) represents the RSSI values read from the beacon at a given location at the same time. Please see the jupyter notebook for the data preprocessing.
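A table like the one above can be assembled from raw per-detector readings. The following is a minimal sketch, not the notebook's actual preprocessing: it assumes readings arrive as `(minute, detector, rssi)` tuples, aggregates them with a per-minute median, and treats `-110.0` as a placeholder for "beacon not heard by this detector". Both the aggregation and the meaning of `-110.0` are assumptions; the raw readings are made up.

```python
from collections import defaultdict
from statistics import median

DETECTORS = ["esp32-attic", "esp32-room_k", "esp32-office", "esp32-doors"]
NO_SIGNAL = -110.0  # assumed placeholder for "beacon not heard"

# Hypothetical raw readings: (minute, detector, rssi)
readings = [
    ("2021-01-03 15:32", "esp32-attic", -70.0),
    ("2021-01-03 15:32", "esp32-attic", -72.0),
    ("2021-01-03 15:32", "esp32-office", -71.5),
    ("2021-01-03 15:32", "esp32-doors", -83.0),
    ("2021-01-03 15:33", "esp32-attic", -73.0),
]


def to_wide(readings, target):
    """Return one row per minute: median RSSI per detector plus the room label."""
    by_minute = defaultdict(lambda: defaultdict(list))
    for minute, detector, rssi in readings:
        by_minute[minute][detector].append(rssi)

    rows = []
    for minute in sorted(by_minute):
        row = {"time": minute, "target": target}
        for det in DETECTORS:
            values = by_minute[minute].get(det)
            # Detectors with no reading in this minute get the placeholder.
            row[det] = median(values) if values else NO_SIGNAL
        rows.append(row)
    return rows


rows = to_wide(readings, target="office")
for row in rows:
    print(row)
```

The `target` label is supplied by hand here, since during data collection you know which room the beacon is in.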
When this table is ready, we can go further and train ML models.
The number of data points (i.e., different locations of the BLE beacon) for each room varies and depends on the room size. In my case the counts were:
room | no. of measurements |
---|---|
bathroom_d | 11 |
bathroom_g | 30 |
living_room | 101 |
office | 38 |
room_k | 32 |
room_m | 28 |
Remember to sample all the space available to your cats: not only floors, but also tables, wardrobes, chandeliers etc. ;-)
Now we will try several preprocessing algorithms together with several Machine Learning methods. We don't know in advance which combination will work best in our case, so we will compare them all in a cross-validation experiment.
Please see jupyter notebook for the exact pipeline.
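To give an idea of what such an experiment looks like, here is a small sketch using scikit-learn. It is not the notebook's pipeline: the data are synthetic, and kNN and `StandardScaler` are stand-ins chosen for illustration next to the MLP Classifier and Min-Max Scaler; the hyperparameters are assumptions as well.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in for the RSSI table: 3 rooms, 4 detectors, 40 points per room.
rng = np.random.default_rng(0)
centers = np.array([[-70, -110, -75, -85], [-90, -70, -110, -90], [-100, -95, -80, -70]])
X = np.vstack([c + rng.normal(0, 6, size=(40, 4)) for c in centers])
y = np.repeat(["office", "room_k", "living_room"], 40)

scalers = {"MinMax": MinMaxScaler, "Standard": StandardScaler}
models = {
    "kNN": lambda: KNeighborsClassifier(n_neighbors=5),
    "MLP": lambda: MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
}

# Stratified folds keep the room proportions similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for s_name, make_scaler in scalers.items():
    for m_name, make_model in models.items():
        pipe = make_pipeline(make_scaler(), make_model())
        scores = cross_val_score(pipe, X, y, cv=cv, scoring="balanced_accuracy")
        results[(s_name, m_name)] = scores.mean()

# Print combinations from best to worst mean balanced accuracy.
for combo, score in sorted(results.items(), key=lambda kv: -kv[1]):
    print(combo, round(score, 3))
```

Putting the scaler inside the pipeline means it is refitted on the training folds only, so no information from the validation fold leaks into preprocessing.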
The final models will be built with 75% of the dataset (training), while the remaining 25% will serve as a testing dataset.
The cross-validation results for the ML algorithms are shown here (as balanced accuracy values):
we can see that in our case, the MLP Classifier with Min-Max Scaler gives the best results.
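Balanced accuracy is used because the rooms have very different numbers of measurements. It is the mean of the per-class recall, so a small room counts as much as a large one. A small worked example with made-up labels (not the author's data):

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every room counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical labels: 8 living_room points, 2 bathroom_d points.
y_true = ["living_room"] * 8 + ["bathroom_d"] * 2
y_pred = ["living_room"] * 10  # a model that always answers "living_room"

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain)                              # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

Plain accuracy rewards the model for ignoring the small room, while balanced accuracy does not.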
Let's see if we can go any further with some feature engineering:
The answer is no. The best accuracy here is the same as in the initial setup (0.83), so we will stay with the MLP Classifier with Min-Max Scaler.
The final model, built with the MLP Classifier and Min-Max Scaler, reaches a balanced accuracy of 0.88. Very good! If we look at the confusion matrix:
we will see that (believe it or not) all mislabelled predictions are for locations which are close to each other (e.g., room_m is physically close to the office etc.). So this makes perfect sense.
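The 75%/25% split, the final fit and the confusion matrix can be sketched as follows. The data here are synthetic and the MLP hyperparameters are assumptions, so the numbers will not match the results reported above; see the jupyter notebook for the real pipeline.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (not the author's measurements).
rooms = ["office", "room_k", "room_m"]
rng = np.random.default_rng(1)
centers = np.array([[-72, -110, -78, -84], [-88, -72, -110, -90], [-76, -105, -84, -86]])
X = np.vstack([c + rng.normal(0, 5, size=(40, 4)) for c in centers])
y = np.repeat(rooms, 40)

# 75% training, 25% testing, keeping room proportions in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = make_pipeline(
    MinMaxScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
score = balanced_accuracy_score(y_test, y_pred)
print("balanced accuracy:", round(score, 3))

# Rows are true rooms, columns are predicted rooms, both in the order of `rooms`.
cm = confusion_matrix(y_test, y_pred, labels=rooms)
print(cm)

# List the off-diagonal cells, i.e. which room was mistaken for which.
for i, true_room in enumerate(rooms):
    for j, pred_room in enumerate(rooms):
        if i != j and cm[i, j] > 0:
            print(f"{true_room} predicted as {pred_room}: {cm[i, j]}")
```

Listing the off-diagonal cells by name makes it easy to check whether the confused rooms are neighbours in the building.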
For detection you can use the code snippet in section 6 of the jupyter notebook or the code of the simple program. The program uses a numpy array as its data source, but the data can also be read from a file or fetched from the database.
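The detection step boils down to loading a saved model and calling `predict` on the latest RSSI vector. The sketch below is self-contained, so it first fits and saves a tiny stand-in model; the file name `cat_model.joblib`, the use of `joblib` for persistence, the 1-nearest-neighbour classifier and the training rows are all assumptions for illustration, not the repo's program.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Stand-in for the trained model: a tiny pipeline fitted on made-up RSSI rows.
X = np.array(
    [[-71, -110, -72, -83], [-73, -110, -82, -82],
     [-100, -67, -110, -91], [-86, -60, -110, -91]],
    dtype=float,
)
y = np.array(["office", "office", "room_k", "room_k"])
stand_in = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=1)).fit(X, y)

model_path = os.path.join(tempfile.mkdtemp(), "cat_model.joblib")  # hypothetical file name
joblib.dump(stand_in, model_path)

# Detection step: load the model and classify the latest reading.
model = joblib.load(model_path)
# Column order must match training: attic, room_k, office, doors.
latest = np.array([[-87.0, -72.0, -110.0, -84.5]])
room = model.predict(latest)[0]
print(room)
```

Saving the whole pipeline, rather than the classifier alone, ensures that the same scaling is applied at detection time as during training.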