Open serkosi opened 2 years ago
Well noted.
I will prepare a document describing the RBM structure for our application. I will try to make it as clear as possible, covering every step and every function that will be implemented for our RBM to work.
Step by step implementation of our RBM model.
Which dataset are you going to use? The MovieLens 25M Dataset? The MovieLens Tag Genome Dataset 2021? Or one of the education and development datasets, or one of the older datasets?
The visible node sampling function carries the business logic, am I right? This is the function we will modify when we use the model for the warehouse domain rather than the movie domain.
The datasets I have downloaded so far are MovieLens 100k and MovieLens 1M. I will do the initial testing with these datasets and move up to larger ones if everything goes fine.
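Visible and hidden node sampling activates those nodes based on a probability, given the other layer. The probability of a visible node being active is the sigmoid of its input from the hidden layer: we multiply the hidden node states by the weights, add the visible bias, and apply the sigmoid. Hidden nodes are sampled the same way from the visible layer, using the hidden bias.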
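To make the sampling step concrete, here is a minimal NumPy sketch. It is not the project's actual code: the function names (`sample_hidden`, `sample_visible`), the weight shape `(n_visible, n_hidden)`, and the toy sizes are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_hidden(v, W, b_hidden, rng):
    """Sample binary hidden states given visible states.

    Assumed shapes: v is (batch, n_visible), W is (n_visible, n_hidden).
    """
    p_h = sigmoid(v @ W + b_hidden)                       # P(h_j = 1 | v)
    h = (rng.random(p_h.shape) < p_h).astype(np.float64)  # Bernoulli draw
    return p_h, h

def sample_visible(h, W, b_visible, rng):
    """Sample binary visible states given hidden states."""
    p_v = sigmoid(h @ W.T + b_visible)                    # P(v_i = 1 | h)
    v = (rng.random(p_v.shape) < p_v).astype(np.float64)  # Bernoulli draw
    return p_v, v

# Toy example: 6 visible units (e.g. movies), 3 hidden units (hypothetical sizes).
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.01, size=(6, 3))
b_visible = np.zeros(6)
b_hidden = np.zeros(3)

v0 = np.array([[1, 0, 1, 1, 0, 0]], dtype=np.float64)
p_h, h = sample_hidden(v0, W, b_hidden, rng)
p_v, v1 = sample_visible(h, W, b_visible, rng)
```

If the visible sampling function is the part that changes between the movie and warehouse domains, keeping it as a separate function like this would make it easy to swap.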
Well noted. After you finalise the algorithm pipeline in the DNN project folder and are ready for a first test, I will check it in more detail and we can discuss further.
Our model uses a Bernoulli Restricted Boltzmann Machine, an RBM with binary visible units and binary hidden units. For training, I am using K-step contrastive divergence, which approximates the gradient of the log-likelihood defined by the RBM's energy function.
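As a rough illustration of K-step contrastive divergence for a Bernoulli RBM, one parameter update could be sketched as follows. The function name `cd_k_update`, the learning rate, and the toy data are assumptions for this example, not the project's implementation; the sketch requires `k >= 1`.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd_k_update(v0, W, b_visible, b_hidden, rng, k=1, lr=0.1):
    """One CD-k update on a batch v0 of shape (batch, n_visible). Requires k >= 1."""
    # Positive phase: hidden probabilities driven by the data.
    p_h0 = sigmoid(v0 @ W + b_hidden)
    h = (rng.random(p_h0.shape) < p_h0).astype(np.float64)

    # Negative phase: k steps of alternating Gibbs sampling.
    for _ in range(k):
        p_v = sigmoid(h @ W.T + b_visible)
        v = (rng.random(p_v.shape) < p_v).astype(np.float64)
        p_h = sigmoid(v @ W + b_hidden)
        h = (rng.random(p_h.shape) < p_h).astype(np.float64)

    # Difference between data statistics and model (reconstruction) statistics.
    batch = v0.shape[0]
    W_new = W + lr * (v0.T @ p_h0 - v.T @ p_h) / batch
    b_visible_new = b_visible + lr * (v0 - v).mean(axis=0)
    b_hidden_new = b_hidden + lr * (p_h0 - p_h).mean(axis=0)
    return W_new, b_visible_new, b_hidden_new

# Toy data: 4 users, 6 items, binary "liked" flags (made-up values).
rng = np.random.default_rng(0)
data = np.array([[1, 1, 0, 0, 1, 0],
                 [1, 0, 0, 0, 1, 0],
                 [0, 0, 1, 1, 0, 1],
                 [0, 1, 1, 1, 0, 1]], dtype=np.float64)
W = rng.normal(0.0, 0.01, size=(6, 3))
b_visible = np.zeros(6)
b_hidden = np.zeros(3)

for _ in range(50):
    W, b_visible, b_hidden = cd_k_update(data, W, b_visible, b_hidden, rng, k=2)
```

Using the hidden probabilities rather than the sampled states in the update is a common variance-reduction choice; whether it suits our model is something to confirm during testing.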
I created an issue from my last comment for us to discuss if it makes sense or not.
Perhaps it would be a good idea to decide on a framework that we can follow while you work on the ML pipeline; then we can identify the differences between the two pipelines (ML for movies and ML for warehouses) through the steps of that framework.
1- Understand the application domain and the goal of the process.
2- Create the target dataset as a subset of all the data that is available.
3- Data cleaning and preprocessing to remove noise and handle missing data and outliers.
4- Data reduction and projection in order to focus on the features that are relevant to the problem.
5- Match the goals of the process to the RBM method.
6- Decide the purpose of the model, such as summarization or classification.
7- Machine learning, i.e. run algorithms on the data.
8- Interpretation of learned patterns to make them understandable by the user, such as summarization and visualization.
9- Acting on the discovered knowledge, such as reporting or making decisions.
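For the movie pipeline, steps 2 to 4 could be sketched as turning raw rating triples into the binary matrix a Bernoulli RBM expects. Everything here is an assumption for illustration: the helper name `build_binary_matrix`, the "liked" threshold of 3, the use of -1 for unrated items, and the 1-based ids. A warehouse pipeline would replace this encoding with its own.

```python
import numpy as np

def build_binary_matrix(ratings, n_users, n_items, like_threshold=3):
    """Convert (user_id, item_id, rating) triples into a user-by-item matrix.

    Encoding (an assumption for this sketch):
      1.0  -> rated at or above like_threshold ("liked")
      0.0  -> rated below like_threshold ("not liked")
     -1.0  -> no rating available (missing data)
    Ids are assumed to be 1-based.
    """
    matrix = np.full((n_users, n_items), -1.0)
    for user_id, item_id, rating in ratings:
        matrix[user_id - 1, item_id - 1] = 1.0 if rating >= like_threshold else 0.0
    return matrix

# Made-up sample in place of a real ratings file.
sample_ratings = [(1, 1, 5), (1, 3, 2), (2, 2, 4), (3, 1, 1), (3, 3, 3)]
matrix = build_binary_matrix(sample_ratings, n_users=3, n_items=3)
print(matrix)
```

Keeping the missing entries marked as -1 lets the training code decide later whether to mask them or treat them as zeros.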