sholloway / agents-playground

MIT License
4 stars 0 forks source link

Logging, Profiling, & Debugging #4

Closed sholloway closed 2 years ago

sholloway commented 2 years ago

Get logging, profiling and debugging established early.

Todo

Logging

Profile

Make Targets

Considerations

Automate with the Makefile. Need to review the various options for command line flags. Flags to consider:

Strategic use of Assert

The assert statement. Assert statements are only evaluated if debug is True.

Logging

There is the logging module provided by the standard library. The logging levels. Basic usage. Logging to a file.

The logging level can be set on the command line using the --log flag.

python --log=INFO my_app.py 

From inside the program the log level can be gotten (if it's been set) using getattr.

# assuming loglevel is bound to the string value obtained from the
# command line argument. Convert to upper case to allow the user to
# specify --log=DEBUG or --log=debug
numeric_level = getattr(logging, loglevel.upper(), None)
if not isinstance(numeric_level, int):
    raise ValueError('Invalid log level: %s' % loglevel)
logging.basicConfig(level=numeric_level, ...)

To overwrite a log file (rather than append)

import logging

# 1. Get the logging level from CLI
numeric_level = getattr(logging, loglevel.upper(), None)

# 2. Configure to overwrite to a file. 
logging.basicConfig(filename='example.log', filemode='w', encoding='utf-8', level=numeric_level)

You can specify the display format of the log messages.

More robust example. Logging Handlers

Logging Cookbook

Debugging

Goodby Print, Hello Debugger

PDB

Setting PYTHONBREAKPOINT to zero allows skipping all breakpoints at runtime.

export PYTHONBREAKPOINT=0

You can do this in a single line when launching a program.

PYTHONBREAKPOINT=0 python myapp.py

PDB Commands

PDB Cheatsheet

Navigation

Inspection

Breakpoints

Misc

Visual Debuggers

pudb

Gives you a visual debugger in the terminal. Looks kinda like VIM.

VSCode

  1. Gotta install the Python Extension.
  2. Gotta create a debugger configuration.

Things to consider

Profiling

How to inspect memory usage?

Python Memory Basics

Python has names not variables (like in C, C++). Names -> References -> Objects A name is just a label for an object. An object can have many names (e.g. x and y both point to 0xEF4). A reference is a name that points to another object. The reference count is the number of names that point to a single object.

The del statement removes the reference has to an object.

x = 10
del x # removes the reference of x to the value of 10.
x # Will throw an error saying that x is not defined.

Internally every object has three things.

  1. It's type.
  2. Its reference count (the number of names referring to it).
  3. Its value.

Inspecting Python Memory

The id(...) function outputs the memory location of an object.

x = 5
id(x)

Using the is condition compares two object's memory address.

x = 5
y = 5
x is y # returns True

The sys.getsizeof(...) function returns the size in bytes of a given object.

import sys

x = {'a': 123, 'b': 'hello world'}
sys.getsizeof(x)

Garbage Collection

Reference Counting
Tracing

Uses the mark and sweep algorithm for freeing up memory. Addresses the cyclical reference challenge that reference counting alone can't handle. Python leverages a generational paradigm for handling tracing (Generations 0, 1, and 2). The longer an object is "alive" the more references it is likely to acquire. Generations are collected over time.

Why doesn't the Python Program memory consumption go down after the GC runs?
_slots_

Every Python object as a dict of its names and values.

class Dog:
  pass

d = Dog()
d.name = "Buddy"

d.__dict__ #outputs {'name': 'Buddy'}

Slots turn an object's _dict_ into an immutable tuple.

class Point:
  __slots__ = (x, y)

p = Point()
p.x = 10
p.y = 14

# The instance p cannot have additional attributes added to it. 
# The size of a point with its properties storied as a tuple is much smaller than as a dict.
>>> sys.getsizeof(dict())
232
>>> sys.getsizeof(tuple())
40

Something to consider for data structures like Points and Agents is that there are different ways of reserving data.

How to measure run time?

Simple Time Measurement

Use time.perf_counter_ns() to measure how long something took.

import time

start = time.perf_counter_ns()
do_a_bunch_of_stuff()
end  = time.perf_counter_ns()
duration = end - start

Should create a decorator for the above.

Statistical Time Measurement

The timeit module will run something a bunch of times and find the statistical average of how long it took.

Profile an entire program.

Testing