Try to save a pre-trained Q-learning model, and update the model only when necessary.
Profile the Q-learning process. Which parts took the longest time to run?
If feature generation took most of the time, there is strong motivation to use Q-learning to estimate the result of a potential tract flip rather than actually performing the flip.
If the NB model training used most of the time, then what is the conclusion?
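One possible way to approach the profiling and model-saving items is sketched below using only the Python standard library (cProfile, pstats, pickle). The stage functions generate_features, train_nb_model and q_learning_step are hypothetical stand-ins, not the project's real code, and the Q-learning model is assumed to be a plain dict; swap in the actual functions and model object.

```python
import cProfile
import io
import pickle
import pstats
import time
from pathlib import Path


# Hypothetical stand-ins for the real pipeline stages (assumptions, not project code).
def generate_features(n):
    time.sleep(0.02)  # pretend feature generation is the slow part
    return [i * 0.5 for i in range(n)]


def train_nb_model(features):
    time.sleep(0.01)  # pretend NB training takes a little less time
    return sum(features) / len(features)


def q_learning_step(q_table, features, model):
    # Toy update: nudge each entry toward (feature - model).
    for i, f in enumerate(features):
        q_table[i] = q_table.get(i, 0.0) + 0.1 * (f - model)
    return q_table


def run_episode(q_table, n=100):
    features = generate_features(n)
    model = train_nb_model(features)
    return q_learning_step(q_table, features, model)


def profile_episode(q_table):
    """Run one episode under cProfile and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    run_episode(q_table)
    profiler.disable()
    buf = io.StringIO()
    # Sort by cumulative time so the most expensive stages appear first.
    pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats()
    return buf.getvalue()


def save_q_table(q_table, path):
    """Persist the (assumed dict-based) Q-table to disk."""
    with open(path, "wb") as fh:
        pickle.dump(q_table, fh)


def load_q_table(path):
    """Load a saved Q-table, or start from an empty one if none exists."""
    path = Path(path)
    if not path.exists():
        return {}
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Calling print(profile_episode(load_q_table("q_table.pkl"))) would show which stage dominates the cumulative time; the file name here is only an example. Only load pickle files that you created yourself, since unpickling untrusted data can execute arbitrary code.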