DAGWorks-Inc / hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
https://hamilton.dagworks.io/en/latest/
BSD 3-Clause Clear License

[ergonomics] profile driver and base module imports #1246

Open skrawcz opened 2 hours ago

skrawcz commented 2 hours ago

Current behavior

It takes ~2 seconds to do:

from hamilton import driver, base

We should profile this and see where the time is being spent.

Library & System Information

Latest Hamilton on macOS, Python 3.10.

Expected behavior

It should load in < 1 second.

Additional context

This is a quality of life task.

This task requires:

  1. using a profiler to figure out where the time is being spent
  2. reporting the results
  3. providing recommendations
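
One possible starting point for the profiling step is CPython's built-in `-X importtime` flag, which logs per-module import times to stderr. The sketch below runs an import statement in a fresh interpreter and ranks modules by cumulative time. The helper name `import_breakdown` and its `top` parameter are illustrative only and are not part of Hamilton.

```python
import subprocess
import sys


def import_breakdown(statement: str, top: int = 10) -> list[tuple[str, int]]:
    """Run `statement` in a fresh interpreter with -X importtime and return
    the `top` modules ranked by cumulative import time in microseconds.

    Hypothetical helper for illustration; not part of Hamilton.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", statement],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    # Each log line looks like: "import time: <self us> | <cumulative us> | <module>"
    for line in proc.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3:
            continue
        _self_us, cumulative_us, name = (p.strip() for p in parts)
        if not cumulative_us.isdigit():
            continue  # skip the header row
        rows.append((name, int(cumulative_us)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top]
```

Calling `import_breakdown("from hamilton import driver, base")` in an environment with Hamilton installed should list the slowest modules first. Cumulative times include each module's own nested imports, so a parent package will usually rank above its children.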

elijahbenizzy commented 2 hours ago

Blech, pandas gets imported regardless. The second import after a fresh install is faster, but still not great.

First time:

image

Second time:

image
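
To confirm whether a heavy dependency such as pandas really gets pulled in, one option is to inspect `sys.modules` in a fresh interpreter after running the import. This is only a sketch: `modules_loaded_by` is a hypothetical helper name, and it assumes the statement can run in a clean subprocess.

```python
import subprocess
import sys


def modules_loaded_by(statement: str, candidates: list[str]) -> set[str]:
    """Return the subset of `candidates` present in sys.modules after
    `statement` runs in a fresh interpreter.

    Hypothetical helper for illustration; not part of Hamilton.
    """
    child_code = (
        "import sys\n"
        f"{statement}\n"
        # Print the loaded candidates as one comma-separated final line.
        f"print(','.join(n for n in {list(candidates)!r} if n in sys.modules))"
    )
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
        check=True,
    )
    # Only the last stdout line is ours; the statement itself may print too.
    last_line = proc.stdout.splitlines()[-1]
    return {name for name in last_line.split(",") if name}
```

For example, `modules_loaded_by("from hamilton import driver, base", ["pandas", "numpy"])` would report which of those two packages end up loaded as a side effect of the import.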

skrawcz commented 1 hour ago

import pandas as pd

That takes ~1 second for me, so pandas does not account for all of the ~2 seconds.
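
To put the two numbers side by side, a rough approach is to time each statement in its own fresh interpreter so that neither run benefits from modules the other already loaded. The helper name `time_statement` and the `repeats` parameter below are assumptions made for this sketch.

```python
import subprocess
import sys


def time_statement(statement: str, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds for `statement`, measured
    inside a fresh interpreter so interpreter startup is excluded.

    Hypothetical helper for illustration; not part of Hamilton.
    """
    child_code = (
        "import time\n"
        "_start = time.perf_counter()\n"
        f"{statement}\n"
        "print(time.perf_counter() - _start)"
    )
    timings = []
    for _ in range(repeats):
        proc = subprocess.run(
            [sys.executable, "-c", child_code],
            capture_output=True,
            text=True,
            check=True,
        )
        # The elapsed time is the last line printed by the child process.
        timings.append(float(proc.stdout.splitlines()[-1]))
    return min(timings)
```

Comparing `time_statement("import pandas as pd")` against `time_statement("from hamilton import driver, base")` gives an estimate of how much of the total lies outside pandas. Taking the minimum over several runs reduces noise, but it also hides the slower first import after a fresh install.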