brainflow-dev / brainflow

BrainFlow is a library intended to obtain, parse and analyze EEG, EMG, ECG and other kinds of data from biosensors
https://brainflow.org/
MIT License

Feature Request: Making ML models CUDA compatible #114

Closed MatthewAwesome closed 3 years ago

MatthewAwesome commented 3 years ago

Hi there,

Great project. I was looking to do some stuff with BrainFlow and machine learning. Specifically, I'd like to accelerate inference of the ML models with CUDA on an NVIDIA Jetson.

I'm willing to work with you guys on this. Assuming the models are pre-trained and exist in a common format, it should be straightforward.

Any thoughts or information on the models as they stand?

Thanks for being awesome!

Andrey1994 commented 3 years ago

Hi!

As of right now we have only two models, Logistic Regression and KNN. Logistic Regression is very simple and fast, so there is nothing to parallelize. For KNN we use single-threaded code and run it on the CPU, but we use a fast KD-tree algorithm instead of brute force. So in terms of performance I think it's fine for real-time apps on an Nvidia Jetson. To speed it up, I would rather parallelize the CPU code first, since that will work on all platforms.
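To illustrate the KD-tree versus brute-force point for KNN, here is a minimal pure-Python sketch of a nearest-neighbour lookup. It is not BrainFlow's implementation; `build_kdtree`, `nearest`, and `nearest_brute_force` are hypothetical names and the sample points are random stand-ins for feature vectors.

```python
import math
import random


def build_kdtree(points, depth=0):
    """Recursively build a KD-tree as nested tuples (point, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])  # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2  # the median splits the set roughly in half
    return (
        points[mid],
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def nearest(node, target, depth=0, best=None):
    """Return (distance, point) for the stored point closest to target."""
    if node is None:
        return best
    point, left, right = node
    dist = math.dist(point, target)
    if best is None or dist < best[0]:
        best = (dist, point)
    axis = depth % len(target)
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best


def nearest_brute_force(points, target):
    """Reference implementation: check every point."""
    return min((math.dist(p, target), p) for p in points)


random.seed(0)
# Hypothetical 3-dimensional feature vectors
samples = [(random.random(), random.random(), random.random()) for _ in range(500)]
tree = build_kdtree(samples)
query = (0.5, 0.5, 0.5)
kd_result = nearest(tree, query)
bf_result = nearest_brute_force(samples, query)
```

Both lookups should agree on the nearest distance; the tree version just skips branches that cannot contain a closer point, which is where the speedup over brute force comes from.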

We are going to add SVM and LDA classifiers next, CPU versions first. I'm not sure a GPU will be required for them either.

But we are going to add the ability to run user-defined nets using ONNX (#102), and these can be complicated neural networks; for such tasks CUDA makes sense. I'm not sure that ONNX Runtime works on Nvidia Jetson, but we will check.

Also, I have an Nvidia Jetson and can run and test on it.

Andrey1994 commented 3 years ago

And in our tests KNN and SVM work better than LR on existing data, but only slightly (a few percent in F1 score), so if performance is crucial you can always switch to the LR classifier.

Andrey1994 commented 3 years ago

By the way, BrainFlow returns plain 2D arrays (NumPy arrays in Python), so you can use BrainFlow's data acquisition API and signal processing, and use your own code for the ML part.
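As a rough sketch of what "your own code" could look like, the snippet below turns a 2D NumPy array into per-channel band-power features. Everything here is an assumption for illustration: the array is synthetic rather than acquired from a board, it is laid out as (channels, samples), the 250 Hz sampling rate and band edges are made up, and `band_power` is a hypothetical helper, not a BrainFlow function.

```python
import numpy as np

sampling_rate = 250  # Hz, assumed for this example
n_channels, n_samples = 4, 1000
t = np.arange(n_samples) / sampling_rate
rng = np.random.default_rng(0)

# Stand-in for acquired data: a 10 Hz sine plus noise on every channel,
# shaped (channels, samples).
data = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal((n_channels, n_samples))


def band_power(signal, fs, low, high):
    """Sum the periodogram of a 1D signal over the band [low, high) in Hz."""
    freqs = np.fft.rfftfreq(signal.size, d=1 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / signal.size
    mask = (freqs >= low) & (freqs < high)
    return psd[mask].sum()


# Hypothetical band definitions in Hz
bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

# One row per channel, one column per band: ready to feed into any classifier.
features = np.array(
    [[band_power(ch, sampling_rate, lo, hi) for lo, hi in bands.values()] for ch in data]
)
```

The resulting `features` matrix can be passed to whatever model you like, on CPU or GPU, independently of how the data was acquired.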

MatthewAwesome commented 3 years ago

Andrey, thank you for the quick response.

A big fan of BrainFlow.

Off the top, what do your training sets consist of? (e.g., what device was the data recorded on?)

I've been using the BrainBit headband to record some data, and I'm looking to feed it into a model.

Thanks!

And yes, as you hinted, I'm looking to do some real-time processing. Ideally I'd like to run multiple models in parallel (e.g., one for focus, another for blink detection, etc.).

Stay Awesome!

Andrey1994 commented 3 years ago

The idea here is to train on band powers using different devices; this way the resulting classifier should be more or less device-agnostic, which is exactly what I want to achieve in BrainFlow. BrainBit was used in data collection as well. With data augmentation tricks, the training dataset has ~30k samples for each class (60k total). You can also tune these classifiers: choose the channels you want to use and set different thresholds, since all of them return a probability instead of a binary value. Or you can analyze whether concentration increased or decreased.

But BrainBit is not supported on Nvidia Jetson. For BrainBit I use libraries provided by the BrainBit developers, and they have not provided Linux x86 or Linux ARM libs.
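The threshold tuning and increase/decrease analysis mentioned above could be sketched roughly like this. It assumes you already have a list of probabilities from a classifier; `classify`, `trend`, the thresholds, and the sample scores are all hypothetical and only meant to show the idea.

```python
def classify(probability, threshold=0.5):
    """Turn a probability into a binary decision with a custom threshold."""
    return probability >= threshold


def trend(probabilities, window=3, min_change=0.05):
    """Compare the mean of the last `window` values with the window before it."""
    if len(probabilities) < 2 * window:
        return "unknown"  # not enough history yet
    recent = sum(probabilities[-window:]) / window
    previous = sum(probabilities[-2 * window:-window]) / window
    change = recent - previous
    if change > min_change:
        return "increased"
    if change < -min_change:
        return "decreased"
    return "stable"


# Made-up classifier outputs over time
scores = [0.42, 0.45, 0.40, 0.61, 0.66, 0.70]

strict = [classify(p, threshold=0.65) for p in scores]   # fewer positives
lenient = [classify(p, threshold=0.40) for p in scores]  # more positives
direction = trend(scores)
```

A stricter threshold trades missed detections for fewer false positives, while the trend check ignores the absolute level and only looks at whether the probability is moving up or down.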

Andrey1994 commented 3 years ago

To get data from BrainBit on an Nvidia Jetson you can use the Streaming Board, but it requires a process running on a PC and a network connection between the PC and the Jetson.

MatthewAwesome commented 3 years ago

Dang, I didn't anticipate issues using the BrainBit on the Jetson. I have an OpenBCI Ganglion too, which may have to do for now.

Would you mind sharing the training set? I couldn't seem to locate it in the repo.

Thanks for being awesome!

Andrey1994 commented 3 years ago

I am still not sure whether I will make this dataset available to everyone for free. As of right now it's not publicly available.

Here is a link to check supported boards and OSes https://brainflow.org/get_started/?manufactorer=neuromd&board=brainbit&

MatthewAwesome commented 3 years ago

Understandable; there is real value in training data for ML algorithms.

That link is super helpful. Thank you!