dask / dask-ml

Scalable Machine Learning with Dask
http://ml.dask.org
BSD 3-Clause "New" or "Revised" License

The latest version doesn't support perceptron model #978

Open linjing-lab opened 1 year ago

linjing-lab commented 1 year ago

As a developer focused on mathematical optimization and machine learning, I released a package named perming, built on PyTorch, that handles supervised learning problems with perceptron-based models.

As far as I am concerned, perceptron-based models learn a differentiable hidden latent space in which high-dimensional data becomes linearly separable. I therefore think it is important to upgrade the dask_ml project with a perceptron algorithm that is integrated with parallel computing and compiled operators, such as the relu and tanh activation functions.
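To illustrate the point about the hidden space, here is a minimal sketch in plain Python. It uses hand-picked weights rather than trained ones, and the names `relu`, `hidden`, and `predict` are illustrative only; they are not part of perming or dask_ml. XOR is not linearly separable in its input space, but after one relu hidden layer a single linear threshold separates it.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def hidden(x1, x2):
    """Map a 2-D input to a 2-D hidden space with fixed, hand-picked weights."""
    h1 = relu(x1 + x2)        # weights (1, 1), bias 0
    h2 = relu(x1 + x2 - 1.0)  # weights (1, 1), bias -1
    return h1, h2


def predict(x1, x2):
    """Apply a linear threshold in the hidden space."""
    h1, h2 = hidden(x1, x2)
    score = h1 - 2.0 * h2     # linear read-out on the hidden features
    return 1 if score > 0.5 else 0


xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_predictions = [predict(a, b) for a, b in xor_inputs]
print(xor_predictions)  # [0, 1, 1, 0]
```

In a real perceptron model these weights would be learned by a solver such as sgd instead of being set by hand.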

For example, I use the operators provided by PyTorch so that perming can handle any supervised learning task on tabular data with targets, and I wrap numpy.ndarray datasets as torch.Tensor for efficient CUDA computation. Moreover, an early_stop stage is available in the training and validation of every algorithm in perming. The following code is a simple configuration of perming:

import perming  # pip install perming

# X and y are numpy.ndarray features and targets prepared beforehand (not shown here).
main = perming.Box(10, 3, (30,), batch_size=8, activation='relu', inplace_on=True, solver='sgd', criterion='MultiLabelSoftMarginLoss', learning_rate_init=0.01)
main.data_loader(X, y, random_seed=0)
# Train and validate with early stopping enabled.
main.train_val(num_epochs=60, interval=25, tolerance=1e-4, patience=10, early_stop=True)
main.test()
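The tolerance, patience, and early_stop arguments suggest a patience-based stopping rule. The sketch below shows one common form of that rule in plain Python. It is an assumption about the general technique, not perming's actual implementation, and `early_stop_epoch` is a hypothetical helper name.

```python
def early_stop_epoch(val_losses, tolerance=1e-4, patience=10):
    """Return the epoch index at which training would stop, or None.

    Training stops once the validation loss has failed to improve on the
    best value seen so far by more than `tolerance` for `patience`
    consecutive epochs.
    """
    best = float("inf")
    stale = 0  # consecutive epochs without sufficient improvement
    for epoch, loss in enumerate(val_losses):
        if best - loss > tolerance:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


# Made-up validation losses that plateau after the third epoch.
losses = [1.0, 0.5, 0.4, 0.4, 0.4, 0.4]
stop_at = early_stop_epoch(losses, tolerance=1e-4, patience=3)
print(stop_at)  # 5
```

A smaller patience stops sooner on a plateau, while a larger tolerance treats small improvements as no improvement at all.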

main.model can be deployed in any pipeline for prediction tasks, so I recommend that dask_ml release a perceptron model to better support linearly inseparable datasets.