Ruby-Protocol / private_ml

Apache License 2.0

Neural Network lacks actual learning #2

Open musicarroll opened 2 years ago

musicarroll commented 2 years ago

The implementation and its subsequent integration with Substrate of the machine learning application fail to demonstrate actual machine learning. The function included in the implementation of the neural network, https://github.com/Ruby-Protocol/private_ml/blob/main/src/ml/neural_network.rs, is a single layer (hardly a network at all) that implements only the inferencing functionality of a NN. It does not include the learning aspect, which, IMHO, is crucial to demonstrating the viability of the ML use case. The thousands of training samples needed to train a network (even one as simple as this toy NN) also carry privacy concerns of their own and thus may well require encryption. Moreover, training would probably need to happen off-chain, since it typically requires high-end parallel processing with GPUs. There is precedent for training on encrypted data (see Numerai: https://numer.ai/). The use case presented is therefore far too simplistic to be credible. This is, of course, a higher-level issue than can be fixed in this code unit, but I wanted to at least point out the deficiency.
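To make the inference-vs-learning distinction concrete, here is a minimal sketch (hypothetical, not the repo's actual code) of a single-layer model: `predict` is the inference-only capability the issue says is implemented, while `train_step` is the kind of gradient-descent update the repo is missing. Repeating `train_step` over thousands of samples is exactly the compute-heavy, privacy-sensitive process argued above to belong off-chain.

```rust
// Hypothetical sketch of a single "layer": just weights and a bias.
struct SingleLayer {
    weights: Vec<f64>,
    bias: f64,
}

impl SingleLayer {
    fn new(n_inputs: usize) -> Self {
        SingleLayer { weights: vec![0.0; n_inputs], bias: 0.0 }
    }

    // Inference: a dot product plus bias. On its own this is the
    // only capability the issue says the repo demonstrates.
    fn predict(&self, x: &[f64]) -> f64 {
        self.weights.iter().zip(x).map(|(w, xi)| w * xi).sum::<f64>() + self.bias
    }

    // The missing learning aspect: one stochastic-gradient-descent
    // step on squared error. Real training loops over many samples,
    // which is why it is computationally (and privacy-wise) costly.
    fn train_step(&mut self, x: &[f64], target: f64, lr: f64) {
        let err = self.predict(x) - target;
        for (w, xi) in self.weights.iter_mut().zip(x) {
            *w -= lr * err * xi;
        }
        self.bias -= lr * err;
    }
}

fn main() {
    // Toy data for y = 2*x + 1; the model learns it via train_step.
    let data = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0), ([3.0], 7.0)];
    let mut model = SingleLayer::new(1);
    for _ in 0..2000 {
        for (x, y) in &data {
            model.train_step(x, *y, 0.05);
        }
    }
    println!("prediction for x=4: {:.2}", model.predict(&[4.0]));
}
```

Even this toy shows the asymmetry: inference is one pass, while learning requires many passes over data the model owner may not be allowed to see in the clear.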

lornacrevelingwgo23 commented 2 years ago

> The implementation and subsequent integration with Substrate of the machine learning application fails to demonstrate actual machine learning. The function included in the implementation of the neural network: https://github.com/Ruby-Protocol/private_ml/blob/main/src/ml/neural_network.rs is a single layer (hardly a network at all) that only implements the inferencing functionality of a NN. It does not include the learning aspect which, IMHO, is crucial to the demonstration of the viability of the ML use case. The thousands of training samples needed to train a network (even one as simple as this toy NN) also have privacy issues associated with them and thus may well require encryption. Moreover, this would probably need to be accomplished off-chain since high end parallel processing with GPUs is typically necessary to support this. There is precedent for training with encrypted data (see numerai: https://numer.ai/). Thus the use case presented is far too simplistic to be credible. This is, of course, a higher level issue than can be fixed in this code unit, but I wanted to at least point out the deficiency.

Thanks for your very informative comments. I totally agree that private learning is a computationally intensive process that requires the support of off-chain, high-end parallel computing, which I believe merits an independent project. In Ruby, we focus on developing a personal data monetization framework for private inference, which is a vital step of machine learning. However, integrating the private learning step would definitely serve as a great follow-up project.