I think this part may be a little difficult to follow for someone not familiar with Bayesian statistics. You might consider adding an example of the likelihood and the prior, and of their effect on the inferred parameters, right after you introduce Bayes' theorem, so that students can gain some intuition.
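As a rough sketch of the kind of example I have in mind (not taken from the course material): a coin-flip model with a Beta prior, where the posterior is available in closed form. The function names, the 7 heads / 3 tails data, and the two priors are all made up for illustration.

```python
# Hypothetical illustration: how the prior changes the inferred parameter
# when the data (and hence the likelihood) stay the same.
# Model: coin with unknown heads probability, Beta(a, b) prior,
# binomial likelihood. The posterior is then Beta(a + heads, b + tails).

def beta_binomial_posterior(a, b, heads, tails):
    """Return the Beta parameters of the posterior after observing the flips."""
    return a + heads, b + tails


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Made-up data: 7 heads and 3 tails.
heads, tails = 7, 3

# Two made-up priors: one uninformative, one strongly favouring a fair coin.
priors = {
    "flat prior Beta(1, 1)": (1, 1),
    "strong fair-coin prior Beta(50, 50)": (50, 50),
}

for name, (a, b) in priors.items():
    post_a, post_b = beta_binomial_posterior(a, b, heads, tails)
    print(f"{name}: posterior mean = {beta_mean(post_a, post_b):.3f}")
```

With the flat prior the posterior mean stays close to the observed fraction of heads, while the strong prior pulls it back towards 0.5. Something along these lines might help students see what each term in Bayes' theorem does.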
The word "evidence" is used inconsistently. Some people use it in the sense you do, while others use it for the model evidence, i.e. the normalization constant. I once ended up in a longish discussion about it, and it turned out that we were talking about different kinds of evidence.
"How many layers" exercise: the image bayes_exercise_2._0.png is missing.
I have addressed the first point in 2940969 by adding a link to a YouTube video I like. I will cover the different meanings of "evidence" in the lecture component.