UCL-RITS / rse-classwork-2020

Measuring performance and using numpy #185

Open ageorgou opened 3 years ago

ageorgou commented 3 years ago

Approximating π using pure Python and numpy

There are many ways in which one can write a program that does the same thing, and there are advantages and disadvantages to each. In some cases, it is an advantage for a program to execute quickly.

This exercise is the first in a series of four that compare the execution time of four different ways to calculate π using the same Monte Carlo approach. In this approach, π is approximated by sampling n random points inside a square with side length 1, counting the points that fall inside the unit circle, and multiplying that count by 4/n.
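
The sampling procedure described above can be sketched in pure Python as follows. This is a minimal illustration and not the contents of calc_pi.py; the function name `estimate_pi` and the optional `seed` parameter are assumptions made for this example.

```python
import random


def estimate_pi(n, seed=None):
    """Estimate pi from n uniform random points in the unit square."""
    # seed is optional and only there to make runs reproducible.
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        # A point lies inside the unit circle if x^2 + y^2 < 1.
        if x * x + y * y < 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the square, so pi ~ 4 * inside / n.
    return 4 * inside / n


if __name__ == "__main__":
    print(estimate_pi(100_000))
```

The factor of 4 appears because only the quarter of the unit circle with x, y ≥ 0 overlaps the square, and that quarter has area π/4.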

This exercise initially uses pure Python to accomplish this approximation of π. The code is already written, and you can find it in calc_pi.py on the week10 branch of this repository. Your job is to understand the code, measure how much time it takes to complete, and then adapt it to use numpy instead of pure Python.

Measuring how long code takes using timeit

The code uses the timeit module from the standard library. There are different ways you can use timeit: either as a module or from its own command-line interface. Check out timeit's documentation to see the different possibilities. Our calc_pi.py wraps the module implementation of timeit and provides an interface similar to timeit's own command-line interface.
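
As a rough illustration of the module interface, the snippet below times a small statement with `timeit.timeit`, first as a string and then as a callable. The statement being timed and the helper `work` are placeholders chosen for this example and are unrelated to calc_pi.py.

```python
import timeit

NUMBER = 1000  # how many times the statement is executed per measurement

# Time a statement given as a string; the result is the total time in seconds.
total = timeit.timeit("sum(range(100))", number=NUMBER)
print(f"string statement: {total / NUMBER:.2e} s per call")


def work():
    """Placeholder workload used only for this example."""
    return sum(range(100))


# A zero-argument callable can be passed instead of a string.
total_callable = timeit.timeit(work, number=NUMBER)
print(f"callable:         {total_callable / NUMBER:.2e} s per call")
```

The first measurement corresponds roughly to running `python -m timeit -n 1000 "sum(range(100))"` on the command line, except that the command-line tool also repeats the measurement and reports the best run.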

Your task:

  1. As it stands, you can run the file with or without arguments. Run it!
  2. Now run it with some arguments (use --help or look at the source code to see which arguments you can use).
  3. If you would like to time a function (like calculate_pi in this case) without writing all that boilerplate, you can run:
    python -m timeit -n 100 -r 5 -s "from calc_pi import calculate_pi_timeit" "calculate_pi_timeit(10_000)()"

    Try it!

  4. Try to understand the source code more in-depth:
    • What does the calculate_pi_timeit function do?
    • How does timeit.repeat work?
    • Why do we repeat the calculation multiple times?
    • Can you think of some changes that could make the code faster?
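
As a starting point for these questions, here is a hedged sketch of one pattern that is consistent with the call `calculate_pi_timeit(10_000)()` in step 3: a function that returns a zero-argument callable, which is then handed to `timeit.repeat`. The helper `make_pi_task` is hypothetical and is not taken from calc_pi.py; check the actual source to see how it compares.

```python
import random
import timeit


def make_pi_task(n):
    """Hypothetical helper: return a zero-argument function that estimates pi."""
    def task():
        inside = sum(
            1 for _ in range(n)
            if random.random() ** 2 + random.random() ** 2 < 1.0
        )
        return 4 * inside / n
    return task


NUMBER = 10  # calls per measurement
REPEAT = 5   # independent measurements

# timeit.repeat returns a list with one total time (in seconds) per measurement.
timings = timeit.repeat(make_pi_task(1_000), number=NUMBER, repeat=REPEAT)
best_per_call = min(timings) / NUMBER
print(f"best of {REPEAT}: {best_per_call:.2e} s per call")
```

The timeit documentation suggests looking at the minimum of the repeated measurements, since larger values are usually caused by other processes interfering rather than by the code itself.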

Using numpy

The course notes describe how using the numpy library can lead to faster and more concise code.

Your task:

  1. Create a new file calc_pi_np.py that does the same as calc_pi.py, but uses numpy arrays instead of lists. Update the functions accordingly. Hint: Instead of creating the n x and y values independently, generate an (n, 2) array.
  2. Which version of the code, the one that uses numpy or the one that uses pure Python, is faster?
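
To make the hint concrete, the sketch below shows one possible way to vectorise the estimate with an (n, 2) array. It is an illustration under assumptions, not a reference solution: the function name `estimate_pi_np` and the `seed` parameter are invented for this example, and it uses numpy's `default_rng` generator, which may differ from what the course notes use.

```python
import numpy as np


def estimate_pi_np(n, seed=None):
    """Estimate pi from n random points using numpy array operations."""
    rng = np.random.default_rng(seed)
    # One row per point, columns are x and y, all uniform in [0, 1).
    points = rng.random((n, 2))
    # Squared distance from the origin for every point at once.
    inside = (points ** 2).sum(axis=1) < 1.0
    return 4 * np.count_nonzero(inside) / n


if __name__ == "__main__":
    print(estimate_pi_np(100_000))
```

To compare the two versions fairly, time both with the same n and the same `number` and `repeat` settings.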