filipsPL / cat-localizer

Localize your cat at home with BLE beacon, ESP32s, and Machine Learning
GNU General Public License v3.0
593 stars 30 forks source link

Localize your cat at home with BLE beacon, ESP32s, and Machine Learning

tl;dr / abstract. System which can be used for localization the position of a cat in a building using bluetooth low energy (BLE) beacons attached to the object, a set of cheap ESP32 detectors, and Machine Learning models.

How it works

This is an overwiev of a pipeline for creating an inhouse cat locator. Actually, it can be applied to any animal (including humans) or object, and any building. The system works as follows:

:bulb: The challenge here is to use a number of detectors which is significantly lower than the number of rooms and make ML do the rest.

:bulb: This is not a ready-to-use set of programs. As each situation is different (i.e, hardware setup, house size, number of rooms etc), please treat this repo as a framework to build and customize your own system.

Hardware setup

Optional:

Software setup

BLE beacon

Nothing to configure nor code here :-)

Detectors

ESP32

I provide a simple C/Arduino code for the detector. Of course it can do much more (my ESPs also measure temperature, humidity and other parameters):

Many things can be tuned and optimized (eg array of MAC addreses instead of single variables, reading MAC addresses from the server etc).

Raspberry Pi

(this solution was not thoroughly tested!)

Raspberry Pi with a BLE capability can be included in the set of detectors. The commands here are:

btmon &
hcitool lescan --duplicates | grep MAC

and next sending to the database.

On the server

0. Data source

We need a server with a database. I use influx, but any will be good (MySQL or even text files!)

1. Python environment

I used Python 3.8.

Sample conda environment can be impotred by issuing (this is a single time procedure):

conda env create -f cat-localizer.yml

and activated by:

conda activate cat-localizer

Training of ML models can be done on any computer (eg laptop) and next models transfered to the main server. Remember to keep the same modules version (use conda environment). See the next section on model training.

2. Program for cat localization prediction

When models are trained and ready (see the next section), we use python program to fetch last BLE signal strength data and - basing on this - predict localization of the cat. Please see the section Detecting for more info and code samples.

Training and testing of ML models

Data collection

Here we collect data for training ML algorithms. This means we have to collect signal strength data from all sensors WITH associated localization of the beacon. I.e, lets assume we have four ESP32 sensors, we need to create table like this one:

measurement id time target esp32-attic esp32-room_k esp32-office esp32-doors
0 2021-01-03 15:32 office -71.0 -110.0 -71.5 -83.0
1 2021-01-03 15:33 office -73.0 -110.0 -82.5 -82.5
2 2021-01-03 15:34 office -81.0 -110.0 -80.0 -87.5
3 2021-01-03 15:35 office -75.0 -110.0 -86.5 -86.5
4 2021-01-03 15:36 office -72.5 -110.0 -78.5 -81.0
... ... ... ... ... ... ...
236 2021-01-03 21:56 room_k -100.5 -67.5 -110.0 -91.5
237 2021-01-03 21:57 room_k -86.0 -60.0 -110.0 -91.5
238 2021-01-03 21:58 room_k -88.5 -93.5 -110.0 -91.5
239 2021-01-03 21:59 room_k -86.0 -75.0 -110.0 -91.5
240 2021-01-03 22:00 room_k -87.0 -72.0 -110.0 -84.5

where each row (measurement) represent RSSI values read from beacon at a given location at the same time. Please see jupyter notebook for the data preprocessing.

When this table is ready, we can go further and train ML models.

The number of data points (i.e, different locations of BLE beacon) for each room is variable and depends on the room size. In my case it was like here:

room           no of measurements
bathroom_d      11
bathroom_g      30
living_room    101
office          38
room_k          32
room_m          28

Remember to sample space available for your cats, so not only floors, but also tables, wardrobes, or chandelier etc ;-)

Algorithms evaluation and models training

Now we will probe several preprocessing algorithms together with several Machine Learning methods. We don't know which one will be the best in our case, so we will probe them all in a cross-validation experiment.

Please see jupyter notebook for the exact pipeline.

The final models will be build with at 75% of the dataset (training), while 25% will seve as a testing dataset.

Conclusions

The ML algos cv results is shown here (balanced accuracy values are shown):

we can see that in our case, the MLP Classifier with Min-Max Scaler gives the best results.

Let's see if we can go any further with some features engeneering:

The nswer is no. The best accuracy here is the same as an initial setup (0.83), so we will stay with the MLP Classifier with Min-Max Scaler.

The final models build with MLP Classifier with Min-Max Scaler gives the balanced accuracy equal 0.88. Very good! If we look at the confusion matrix:

we will see that (believe me or not) all miss-labelled predictions are for locations which are close to each other (eg. room_m is close in space to the office etc.). So this makes a perfect sense.

Detecting

For the detection you can use the code snippet in the section 6 of jupyter notebook or the code of the simple program. The program uses numpy array as a data source, but it can be read from the file or fetched from the database.

Tips and tricks

Improvements of the accuracy

Further directions

Acknowledgements