The code is well written, but it lacks comments or markdown cells explaining what you are doing.
The result you obtained (with your trained player always moving first) suggests that the player has not fully learned the game: when always starting in a game like Tic-tac-toe, he should win most games and draw the rest.
Using a clever player to train your player is a smart way to update your value dictionary at the most critical points.
Some advice:
You could add a class to collect some functions and make the code more organized and readable.
An example of a single game played by your player would be a nice idea to show how he moves.
You could consider making your player also train against himself during a second part of the training, in order to refine his strategies.
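To make the three suggestions above more concrete, here is a minimal sketch. It is not your code: the names `ValueAgent`, `play_game`, `winner`, and the parameters `epsilon` and `alpha` are all hypothetical, and the update rule is a simple backward propagation of the final reward that I am assuming for illustration. It groups the value dictionary and its functions in a class, lets two agents train against each other, and prints a single demonstration game.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that side has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


class ValueAgent:
    """Hypothetical wrapper around a value dictionary (board string -> estimate)."""

    def __init__(self, mark, epsilon=0.1, alpha=0.5):
        self.mark = mark
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.alpha = alpha      # learning rate (assumed value)
        self.values = {}

    def _after(self, board, move):
        # Board that results from placing this agent's mark on `move`.
        return board[:move] + self.mark + board[move + 1:]

    def choose(self, board, greedy=False):
        moves = [i for i, cell in enumerate(board) if cell == ' ']
        if not greedy and random.random() < self.epsilon:
            return random.choice(moves)  # explore
        # Exploit: pick the move whose resulting board has the highest value.
        return max(moves, key=lambda m: self.values.get(self._after(board, m), 0.0))

    def learn(self, visited, reward):
        # Move each visited board's value toward the value of its successor,
        # starting from the final reward and walking backwards.
        target = reward
        for state in reversed(visited):
            old = self.values.get(state, 0.0)
            self.values[state] = old + self.alpha * (target - old)
            target = self.values[state]


def play_game(x_agent, o_agent, learn=True, show=False):
    """Play one game; return 'X', 'O', or None for a draw."""
    board = ' ' * 9
    agents = {'X': x_agent, 'O': o_agent}
    visited = {'X': [], 'O': []}
    turn, result = 'X', None
    while result is None and ' ' in board:
        move = agents[turn].choose(board, greedy=not learn)
        board = board[:move] + turn + board[move + 1:]
        visited[turn].append(board)
        if show:
            rows = (board[r:r + 3].replace(' ', '.') for r in (0, 3, 6))
            print('\n'.join(rows), end='\n\n')
        result = winner(board)
        turn = 'O' if turn == 'X' else 'X'
    if learn:
        for mark, agent in agents.items():
            reward = 0.0 if result is None else (1.0 if result == mark else -1.0)
            agent.learn(visited[mark], reward)
    return result


random.seed(0)
x_player, o_player = ValueAgent('X'), ValueAgent('O')
for _ in range(5000):
    play_game(x_player, o_player)                     # self-play training phase
play_game(x_player, o_player, learn=False, show=True)  # one demonstration game
```

In your project the self-play phase would come after the training against the clever player, reusing the value dictionary you already learned instead of starting from an empty one as this sketch does.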