UCL-COMP0233-2023-2024 / RSE-Classwork

Measuring performance and using numpy #49

Open dpshelio opened 10 months ago

This exercise is the first in a series. The series looks at the execution time of calculating π using the same algorithm, but with different implementations and libraries.

This exercise starts with a pure Python approximation of π. The code is already written, and you can find it in the pi_calculation repository. Your job is to understand the code, measure how long it takes to run, and then adapt it to use numpy instead of pure Python.

Step 1: Measuring how long code takes using timeit

The code uses the timeit module from the standard library. You can use timeit in two ways: by importing it as a module, or through its own command-line interface. Check out timeit's documentation to see the different possibilities. Our calc_pi.py wraps the module interface of timeit and accepts arguments similar to those of timeit's command-line interface.
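As a rough illustration of the module interface, here is a minimal sketch that times a toy function with timeit.repeat. The function sum_of_squares and the chosen numbers are made up for this example and are not taken from calc_pi.py.

```python
import timeit


def sum_of_squares(n):
    """Toy workload, hypothetical and unrelated to calc_pi.py."""
    return sum(i * i for i in range(n))


# timeit.repeat runs the callable `number` times per trial, for `repeat` trials.
# It returns a list with the total time (in seconds) of each trial.
times = timeit.repeat(lambda: sum_of_squares(1_000), number=100, repeat=5)

# Divide by `number` to get the time of a single call within a trial.
best_per_call = min(times) / 100
print(f"5 trials, best: {best_per_call * 1e6:.1f} usec per call")
```

The minimum is reported here because slower trials usually reflect interference from other processes rather than the code itself.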

Your task:

  1. Run the code with the default values using python calc_pi.py.
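  2. Now run it by specifying values for some arguments (to see which arguments you can use, use --help or look at the source code).
  3. If you would like to time a function (like calculate_pi in this case) without writing all that boilerplate, you can use timeit's command-line interface directly, where -n sets the number of executions per repetition, -r the number of repetitions, and -s the setup statement: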
    python -m timeit -n 100 -r 5 -s "from calc_pi import calculate_pi_timeit" "calculate_pi_timeit(10_000)()"

    Try it!

  4. Try to understand the source code in more depth:
    • What does the calculate_pi_timeit function do, roughly?
    • How does timeit.repeat work?
    • Why do we repeat the calculation multiple times?
    • Can you think of any changes that could make the code faster?
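While working through these questions, it may help to experiment with a small stand-alone script. The sketch below is not the code from the pi_calculation repository. It assumes a Monte Carlo estimate (random points in the unit square, counting those that fall inside the quarter circle), and the names estimate_pi and estimate_pi_timeit are hypothetical. It shows one way a function can return a zero-argument callable, which is the shape that a call such as calculate_pi_timeit(10_000)() suggests.

```python
import random
import timeit


def point_in_circle(x, y):
    """True if (x, y) lies inside the unit circle centred at the origin."""
    return x * x + y * y <= 1.0


def estimate_pi(n):
    """Assumed algorithm: 4 * (fraction of n random points inside the quarter circle)."""
    inside = sum(point_in_circle(random.random(), random.random()) for _ in range(n))
    return 4 * inside / n


def estimate_pi_timeit(n):
    """Hypothetical factory: binds n and returns a zero-argument callable for timeit."""
    def run():
        return estimate_pi(n)
    return run


# timeit calls its statement without arguments, so n is bound beforehand.
trials = timeit.repeat(estimate_pi_timeit(10_000), number=10, repeat=5)

print(f"pi is roughly {estimate_pi(10_000):.3f}")
print(f"best: {min(trials) / 10 * 1e3:.2f} ms per call")
print(f"worst: {max(trials) / 10 * 1e3:.2f} ms per call")
```

Comparing the best and worst trials gives a feel for how noisy a single measurement can be.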

Step 2: Using numpy

The course notes describe how using the numpy library can lead to faster and more concise code.

Your task:

  1. Complete the file calc_pi_np.py so that it does the same as calc_pi.py, but uses numpy arrays instead of lists. Update the functions accordingly (you can change their arguments if it makes more sense for your new version). Hint: Instead of creating n x and y values independently, generate an array of size (n, 2).
  2. Which version of the code is faster, the one that uses numpy or the one that uses pure Python?
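The hint about an array of size (n, 2) can be sketched as follows. This is only one possible approach, not the expected solution: it assumes the same Monte Carlo estimate as above (points in the unit square, counted when inside the quarter circle), and the function name estimate_pi_np and its rng parameter are hypothetical.

```python
import timeit

import numpy as np


def estimate_pi_np(n, rng=None):
    """Assumed Monte Carlo estimate of pi using a single (n, 2) array of points."""
    rng = np.random.default_rng() if rng is None else rng
    points = rng.random((n, 2))  # column 0 holds x, column 1 holds y
    # Squared distance of every point from the origin, computed without a Python loop.
    inside = (points ** 2).sum(axis=1) <= 1.0
    return 4 * np.count_nonzero(inside) / n


print(f"pi is roughly {estimate_pi_np(100_000):.4f}")

# Same timing approach as for the pure Python version.
trials = timeit.repeat(lambda: estimate_pi_np(10_000), number=10, repeat=5)
print(f"best: {min(trials) / 10 * 1e3:.3f} ms per call")
```

Passing a seeded generator, for example np.random.default_rng(0), makes the estimate reproducible between runs, which can be handy when checking that both versions agree.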

When you have completed the exercise, react to this issue using the available emojis, or post your comparison of times below!