Closed leogargu closed 8 years ago
:+1:
On Thu, Sep 25, 2014 at 08:59:25AM -0700, leogargu wrote:
The issue, however, is that currently we teach loops and conditionals after functions.
Previous discussion of lesson order in #256, but that's about loops vs. scripts.
@leogargu I thought that the reason we teach functions first is that they are what we want students to learn (especially as one of the good practices), so we can use them in our examples for loops and conditionals.
What about
Lesson 1 (currently 01) ["Analyzing Patient Data"]
Keep as it is.
Lesson 2 (currently 02) ["Wrap Analyzing of Patient Data"]
Keep it short, just enough to introduce basic function syntax so it can be used in the other lessons.
Lesson 3 (currently 03) ["Analyzing Multiple Data Sets"]
Keep as it is.
Lesson 4 (currently 04) ["Making Choices"]
Keep as it is.
Lesson 5 (new) ["More About Functions"]
Rewrite the last example with "extra" function features like default values and keyword arguments.
Lesson 6 (currently 05) ["Defensive Programming"]
Keep as it is.
Lesson 7 (currently 06) ["Command-Line Programs"]
Keep as it is.
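As a rough sketch of the kind of material the proposed Lesson 5 could cover, here is a small function with default values that can be overridden by keyword. The function name rescale and its parameters low and high are hypothetical and are not taken from the existing lessons.

```python
def rescale(values, low=0.0, high=1.0):
    """Linearly rescale values so they span the range [low, high].

    low and high have default values, so callers only need to pass
    them when they want something other than the range [0.0, 1.0].
    """
    smallest = min(values)
    largest = max(values)
    span = largest - smallest
    if span == 0:
        # All inputs are identical, so there is nothing to stretch.
        return [low for _ in values]
    return [low + (high - low) * (v - smallest) / span for v in values]


# Rely on both defaults.
print(rescale([1, 2, 3]))
# Override only one default, by keyword.
print(rescale([1, 2, 3], high=10.0))
# Keyword arguments can be given in any order.
print(rescale([1, 2, 3], high=1.0, low=-1.0))
```

Because the body uses an if statement, an example like this only fits after "Making Choices", which matches the proposed order.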
Thanks for getting this discussion started, @leogargu. Overall I like the idea of introducing loops and conditional statements before functions because it is more natural, i.e. if you haven't written code that required loops or conditionals, you likely haven't written enough code to realize that functions would make your life easier.
However, I am -1 on importing the function analyze as a black box for use in the earlier lessons. I see this as putting the cart before the horse. For many novices at our bootcamp, these examples are the most code they have ever written. While we want them to use the best practices from the beginning before they develop any bad habits, I also think it is necessary to show them the problem for which we are giving them the solution.
One of the reasons I like the novice lessons so much is that they attempt to mirror a real data analysis. In the first lesson we interactively explore one of our datasets and write some code that we consider useful:
import numpy as np
from matplotlib import pyplot as plt
data = np.loadtxt(fname='inflammation-01.csv', delimiter=',')
plt.figure(figsize=(10.0, 3.0))
plt.subplot(1, 3, 1)
plt.ylabel('average')
plt.plot(data.mean(0))
plt.subplot(1, 3, 2)
plt.ylabel('max')
plt.plot(data.max(0))
plt.subplot(1, 3, 3)
plt.ylabel('min')
plt.plot(data.min(0))
plt.tight_layout()
plt.show()
Great. But then we realize that we have 11 more files to analyze. How should we proceed? Well, the first thing to come to mind is simply to copy-paste the code and replace the filename each time. The instructor could explain that this is not ideal because it is tedious, error-prone, and will make it more difficult to update the code in the future since we have 12 versions of it. Instead, we would start a lesson on for loops and end with the following:
import glob
filenames = glob.glob('*.csv')
for f in filenames:
    print(f)
    data = np.loadtxt(fname=f, delimiter=',')
    plt.figure(figsize=(10.0, 3.0))
    plt.subplot(1, 3, 1)
    plt.ylabel('average')
    plt.plot(data.mean(0))
    plt.subplot(1, 3, 2)
    plt.ylabel('max')
    plt.plot(data.max(0))
    plt.subplot(1, 3, 3)
    plt.ylabel('min')
    plt.plot(data.min(0))
    plt.tight_layout()
    plt.show()
This works, but it always runs on all 12 samples. Also, it is a lot of code to read in order to figure out what it is doing. This is the motivation for writing it as a function. This will allow us to run it on one file at a time when we are testing new features and then easily put it in a loop to run over many files. Also, it will be much easier to follow our code when it is written:
for f in filenames:
    print(f)
    analyze(f)
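For concreteness, the analyze function assumed by that loop could look roughly like the sketch below. It simply wraps the plotting code from the first lesson in a function that takes a filename. This is one possible shape for it, not necessarily the exact definition used in the lesson.

```python
import numpy as np
from matplotlib import pyplot as plt


def analyze(filename):
    """Plot the average, max, and min across rows of one CSV file."""
    data = np.loadtxt(fname=filename, delimiter=',')

    plt.figure(figsize=(10.0, 3.0))

    # Left panel: mean of each column (axis 0 runs down the rows).
    plt.subplot(1, 3, 1)
    plt.ylabel('average')
    plt.plot(data.mean(0))

    # Middle panel: maximum of each column.
    plt.subplot(1, 3, 2)
    plt.ylabel('max')
    plt.plot(data.max(0))

    # Right panel: minimum of each column.
    plt.subplot(1, 3, 3)
    plt.ylabel('min')
    plt.plot(data.min(0))

    plt.tight_layout()
    plt.show()
```

With this in place, analyze('inflammation-01.csv') runs the analysis on a single file, and the loop runs it on every file.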
I also like @r-gaia-cs's idea about introducing simple functions and then saving the discussion of default arguments until after learning conditional statements. So I would advocate something like:
interactively exploring data -> loops -> simple functions -> conditional statements -> advanced functions
Hi everybody,
@gvwilson pointed out to me during the instructor training that the examples in the python functions lesson are not very compelling, so I have been trying to come up with some more attractive functions (more closely resembling an "authentic task"). The issue, however, is that currently we teach loops and conditionals after functions. https://github.com/swcarpentry/bc/tree/master/novice/python Without loops and conditionals, I have been unable to write a function that does anything interesting and is easy to integrate with the rest of the lesson. A way to bypass this would be to move the function lesson to later. This is not a trivial change because the patient data example, which is spread across lessons 01 to 04, would need to be changed heavily (and not in a good way).
So I would like to know what you think of the following alternative: define the analyze() function in a file that can be imported and used as a black box during the loop and conditional lessons, and explain the function definition later. The logic behind this approach is that we have already introduced modules and how to use functions (in lesson 01, numpy is loaded and we call mean() and loadtxt()).
Here is what the new lesson order would look like, and a full list of the changes it'd require:
analyze() (as written at the start of lesson 1, "Analysing multiple data sets"), to be imported in the (new) lesson 2.
Most of the changes are superficial, so this change does not involve much work, but I don't know whether teaching this with the function hidden in a file would be a good idea. What do you think?