-X importtime: Shows how long each import takes. It shows the module name, cumulative time (including nested imports), and self time (excluding nested imports). Note that the output may be garbled in multi-threaded applications. Typical usage is python3 -X importtime -c 'import asyncio'
Strategic use of Assert
The assert statement.
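Assert statements are only evaluated when __debug__ is True. That is the default; running Python with the -O flag sets __debug__ to False and skips every assert.
When does assert get executed? Does it slow down a Python process?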
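A minimal sketch of the effect of -O on assert, assuming CPython and that PYTHONOPTIMIZE is not set in the environment. The snippet and the run helper are made up for illustration; each call starts a fresh interpreter so the two modes can be compared side by side.

```python
import subprocess
import sys

# Hypothetical one-line program: the assert fails unless asserts are stripped.
SNIPPET = "assert 1 + 1 == 3, 'math is broken'; print('reached the end')"


def run(*flags):
    """Run SNIPPET in a fresh interpreter with the given interpreter flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )


normal = run()         # __debug__ is True: the assert raises AssertionError
optimized = run("-O")  # __debug__ is False: the assert is never evaluated

print(normal.returncode, "AssertionError" in normal.stderr)
print(optimized.returncode, optimized.stdout.strip())
```

Because asserts can disappear under -O, they suit internal sanity checks rather than validation of user input.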
Logging
There is the logging module provided by the standard library.
The logging levels.
Basic usage.
Logging to a file.
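As a rough illustration of the levels, basic usage, and logging to a file, the sketch below writes to a temporary file and reads it back. The file location and the messages are assumptions for the example; force=True (Python 3.8+) is only there so the snippet works even if logging was already configured.

```python
import logging
import os
import tempfile

# Hypothetical log location; a real program would pick a stable path.
log_path = os.path.join(tempfile.mkdtemp(), "example.log")

# Levels in increasing severity: DEBUG < INFO < WARNING < ERROR < CRITICAL.
logging.basicConfig(filename=log_path, level=logging.INFO, force=True)

logging.debug("not recorded: DEBUG is below the INFO threshold")
logging.info("recorded")
logging.warning("recorded too")

logging.shutdown()  # flush and close the file handler

with open(log_path) as f:
    lines = f.read().splitlines()

print(lines)
```

With the default format, each line has the shape LEVEL:logger name:message, and the root logger is named root.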
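The logging level can be set on the command line if the program accepts a flag for it, such as --log. The flag is an argument to your script, not to the interpreter, so it goes after the script name.
python my_app.py --log=INFO
Inside the program, convert the level name to its numeric value with getattr.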
# assuming loglevel is bound to the string value obtained from the
# command line argument. Convert to upper case to allow the user to
# specify --log=DEBUG or --log=debug
numeric_level = getattr(logging, loglevel.upper(), None)
if not isinstance(numeric_level, int):
    raise ValueError('Invalid log level: %s' % loglevel)
logging.basicConfig(level=numeric_level, ...)
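The snippet above assumes loglevel is already bound. One way to obtain it is argparse; the sketch below is an assumption about how the flag might be wired up, and parse_log_level is a hypothetical helper name, not part of the logging module.

```python
import argparse
import logging


def parse_log_level(argv=None):
    """Return the numeric logging level named by a --log flag (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--log", default="WARNING",
                        help="log level name, e.g. DEBUG or info")
    args = parser.parse_args(argv)

    # Same conversion as above: upper-case the name and look it up on the module.
    numeric_level = getattr(logging, args.log.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError("Invalid log level: %s" % args.log)
    return numeric_level


# Passing an explicit list stands in for sys.argv[1:] in this example.
level = parse_log_level(["--log=debug"])
print(level, logging.getLevelName(level))
```

In a real script, calling parse_log_level() with no argument would read the flags from the command line.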
To overwrite the log file on each run (rather than append to it), pass filemode='w'.
import logging
# 1. Get the logging level from CLI
numeric_level = getattr(logging, loglevel.upper(), None)
# 2. Configure to overwrite to a file.
logging.basicConfig(filename='example.log', filemode='w', encoding='utf-8', level=numeric_level)
You can specify the display format of the log messages.
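A small sketch of a custom format, using a Formatter on a handler; the same format string can also be passed to logging.basicConfig(format=...). The logger name my_app and the in-memory stream are assumptions made so the result can be inspected directly.

```python
import io
import logging

stream = io.StringIO()  # stands in for the console or a file
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(asctime)s | %(levelname)s | %(name)s | %(message)s")
)

logger = logging.getLogger("my_app")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the record away from root logger handlers

logger.info("simulation started")

formatted = stream.getvalue()
print(formatted, end="")
```

Each %(...)s field is a LogRecord attribute; asctime adds a timestamp, so the exact line differs from run to run.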
Python has names rather than variables in the C/C++ sense: a name does not hold a value, it refers to an object.
Names -> References -> Objects
A name is just a label for an object. An object can have many names (e.g. x and y both point to 0xEF4).
A reference is the link from a name (or from a container slot) to an object.
The reference count is the number of references that currently point to a single object.
The del statement removes a name and, with it, the reference that name holds to an object.
x = 10
del x # removes the name x and its reference to the object 10.
x # raises NameError: name 'x' is not defined.
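To make the name/reference/object relationship concrete, the sketch below watches the reference count of one list while a second name is added and removed. It assumes CPython; sys.getrefcount itself temporarily adds a reference, so only the differences between readings are meaningful.

```python
import sys

a = [1, 2, 3]                     # one name bound to a new list object
before = sys.getrefcount(a)

b = a                             # a second name for the SAME object, not a copy
same_object = b is a
after_alias = sys.getrefcount(a)  # one more reference than before

del b                             # removes the name b; the list lives on through a
after_del = sys.getrefcount(a)

print(same_object, after_alias - before, after_del - before)
```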
Internally every object has three things.
Its type.
Its reference count (the number of names referring to it).
Its value.
Inspecting Python Memory
The id(...) function returns an object's identity, which in CPython is its memory address.
x = 5
id(x)
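The is operator checks whether two names refer to the same object, i.e. whether their identities (memory addresses in CPython) are equal.
x = 5
y = 5
x is y # True in CPython, which caches small integers; not guaranteed for arbitrary values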
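Small integers are a special case, so a sketch with lists may show the difference between identity and equality more clearly. The names a, b, and c are arbitrary.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate object
c = a           # another name for the object a refers to

equal_values = a == b       # compares contents
same_identity_ab = a is b   # compares identities
same_identity_ac = a is c
ids_match = id(a) == id(c)  # "is" is equivalent to comparing id() values

print(equal_values, same_identity_ab, same_identity_ac, ids_match)
```

As a rule of thumb, == is for comparing values and is is for checks such as x is None.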
The sys.getsizeof(...) function returns the size in bytes of a given object. The size is shallow: it does not include the objects that a container refers to.
import sys
x = {'a': 123, 'b': 'hello world'}
sys.getsizeof(x)
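sys.getsizeof reports only the container itself, not what it refers to. The sketch below compares that shallow size with a rough recursive total; deep_getsizeof is a hypothetical helper that only follows dicts, lists, tuples, and sets, so treat its result as an estimate.

```python
import sys


def deep_getsizeof(obj, seen=None):
    """Rough recursive size in bytes (hypothetical helper, built-in containers only)."""
    if seen is None:
        seen = set()
    if id(obj) in seen:  # count each object once, even if referenced twice
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size


x = {'a': 123, 'b': 'hello world' * 1000}
shallow = sys.getsizeof(x)  # the dict structure only
deep = deep_getsizeof(x)    # dict plus its keys and values

print(shallow, deep)
```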
Garbage Collection
Reference Counting
Doesn't handle cyclical references.
Not thread safe.
In tree/graph data structures, you want to avoid both parent and children referring to each other. If you do, you need to manually clean up the references when removing nodes.
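One alternative to cleaning up references by hand is to hold the parent through a weak reference, so parent and child never form a cycle. This is a different technique from the manual cleanup described above; the TreeNode class is a made-up sketch, and the immediate cleanup it shows assumes CPython's reference counting.

```python
import weakref


class TreeNode:
    """Hypothetical tree node: strong references down, weak reference up."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        # A weak reference does not increase the parent's reference count.
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Returns None once the parent object has been freed.
        return self._parent() if self._parent is not None else None


root = TreeNode("root")
child = TreeNode("child", parent=root)
parent_name = child.parent.name

del root                         # last strong reference to the root is gone
parent_after_del = child.parent  # the weak reference now resolves to None

print(parent_name, parent_after_del)
```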
Tracing
Uses the mark and sweep algorithm for freeing up memory.
Addresses the cyclical reference challenge that reference counting alone can't handle.
Python leverages a generational paradigm for handling tracing (Generations 0, 1, and 2).
Most objects die young, and an object that survives a collection is promoted to an older generation. Older generations are collected less often.
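The sketch below builds a two-object cycle and shows that it survives del but is reclaimed by the cyclic collector. It assumes CPython; automatic collection is disabled for the duration so the demo is deterministic, and the Node class and probe name are arbitrary.

```python
import gc
import weakref


class Node:
    pass


gc.disable()                  # keep automatic collection out of the demo
a, b = Node(), Node()
a.other, b.other = b, a       # a <-> b reference cycle
probe = weakref.ref(a)        # lets us observe a without keeping it alive

del a, b
alive_after_del = probe() is not None      # the cycle keeps both objects alive

gc.collect()                               # run the cyclic collector by hand
alive_after_collect = probe() is not None
gc.enable()

thresholds = gc.get_threshold()            # collection thresholds per generation
print(alive_after_del, alive_after_collect, thresholds)
```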
Why doesn't the Python Program memory consumption go down after the GC runs?
Memory is fragmented. It is not freed in one contiguous block.
The memory is freed for reuse by Python, not necessarily returned to the operating system.
__slots__
By default, every instance of a Python class has a __dict__ that maps its attribute names to values.
class Dog:
    pass
d = Dog()
d.name = "Buddy"
d.__dict__ # outputs {'name': 'Buddy'}
Slots replace an instance's __dict__ with a fixed set of attribute slots declared on the class.
class Point:
    __slots__ = ('x', 'y')
p = Point()
p.x = 10
p.y = 14
# The instance p cannot have additional attributes added to it.
# The size of a point with its attributes stored in slots is much smaller than with a dict (exact sizes vary by Python version and platform).
>>> sys.getsizeof(dict())
232
>>> sys.getsizeof(tuple())
40
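A short sketch of the behavioral difference, comparing a slotted class with an ordinary one. Both class names are illustrative.

```python
class Point:
    __slots__ = ("x", "y")  # the only attributes an instance may have

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(10, 14)
slotted_has_dict = hasattr(p, "__dict__")  # no per-instance dict

try:
    p.z = 1                                # not declared in __slots__
    rejected = False
except AttributeError:
    rejected = True

plain_has_dict = hasattr(PlainPoint(10, 14), "__dict__")

print(slotted_has_dict, rejected, plain_has_dict)
```

The memory saving comes from dropping the per-instance dict, which matters most when many instances exist at once.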
For data structures created in large numbers, like Points and Agents, consider the different ways of storing instance data (a per-instance __dict__, __slots__, or a plain tuple).
Get logging, profiling and debugging established early.
Todo
Logging
Profile
Make Targets
Considerations
Automate with the Makefile. Need to review the various options for command line flags. Flags to consider:
Strategic use of Assert
Logging
More robust example. Logging Handlers
Logging Cookbook
Debugging
Goodbye Print, Hello Debugger
PDB
Setting the PYTHONBREAKPOINT environment variable to 0 makes every breakpoint() call a no-op at runtime.
You can do this in a single line when launching a program.
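In a POSIX shell the single-line form would typically be PYTHONBREAKPOINT=0 python my_app.py. The sketch below does the equivalent from Python by launching a child interpreter with the variable set; the one-line program is made up for the demonstration.

```python
import os
import subprocess
import sys

# Hypothetical program that would normally stop in the debugger.
program = "breakpoint(); print('finished without stopping')"

result = subprocess.run(
    [sys.executable, "-c", program],
    env={**os.environ, "PYTHONBREAKPOINT": "0"},  # disable breakpoint()
    stdin=subprocess.DEVNULL,
    capture_output=True,
    text=True,
)

print(result.returncode, result.stdout.strip())
```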
PDB Commands
PDB Cheatsheet
Navigation
Inspection
Breakpoints
Misc
Visual Debuggers
pudb
Gives you a visual debugger in the terminal. Looks kinda like VIM.
VSCode
Things to consider
Profiling
How to inspect memory usage?
Python Memory Basics
Inspecting Python Memory
Garbage Collection
Reference Counting
Tracing
Why doesn't the Python Program memory consumption go down after the GC runs?
__slots__
How to measure run time?
Simple Time Measurement
Use time.perf_counter_ns() to measure how long something took.
Should create a decorator for the above.
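One possible shape for such a decorator is sketched below. The name timed and the last_elapsed_ns attribute are assumptions made for this example, not an established API.

```python
import functools
import time


def timed(func):
    """Record how long each call to func takes, in nanoseconds (hypothetical helper)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter_ns()
        try:
            return func(*args, **kwargs)
        finally:
            # Stored on the wrapper so callers can read it after the call.
            wrapper.last_elapsed_ns = time.perf_counter_ns() - start

    wrapper.last_elapsed_ns = None
    return wrapper


@timed
def busy_sum(n):
    return sum(range(n))


total = busy_sum(100_000)
print(total, busy_sum.last_elapsed_ns, "ns")
```

A single measurement like this is noisy; it is useful for rough checks rather than careful benchmarking.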
Statistical Time Measurement
The timeit module runs a snippet many times and reports the total elapsed time, from which a per-run average can be computed. timeit.repeat takes several such measurements so they can be compared.
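A minimal sketch with timeit.repeat, timing sorted() on a list of random floats. The statement, list size, and repeat counts are arbitrary choices for illustration.

```python
import statistics
import timeit

NUMBER = 200  # executions per measurement

# Each entry in runs is the TOTAL time in seconds for NUMBER executions.
runs = timeit.repeat(
    stmt="sorted(data)",
    setup="import random; data = [random.random() for _ in range(1000)]",
    repeat=5,
    number=NUMBER,
)

per_call = [total / NUMBER for total in runs]
best = min(per_call)               # least disturbed by other system activity
mean = statistics.mean(per_call)

print(f"best {best * 1e6:.1f} us, mean {mean * 1e6:.1f} us per call")
```

The minimum is often the more stable figure, since slower runs usually reflect interference from other processes.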
Profile an entire program.
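The standard library's cProfile can do this, for example with python -m cProfile -s cumulative my_app.py from the command line. The sketch below does the same from inside a program; slow_square_sum and main are placeholder functions standing in for real work.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    return sum(i * i for i in range(n))


def main():
    return [slow_square_sum(20_000) for _ in range(10)]


profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by cumulative time and keep the ten most expensive entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
summary = report.getvalue()

print(summary)
```

The cumtime column includes time spent in callees, while tottime counts only the function's own body.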
Testing