Closed by sarahmish 4 years ago
Hi @sarahmish
Capturing debug information sounds good, but I would set this as an optional feature rather than the default behavior.
More precisely, for debugging I think it would make sense to add a debug mode (`fit(..., debug=True)` and `predict(..., debug=True)`) to the pipeline which, if enabled, makes the pipeline return a `dict` with information about what happened in each step, including the elapsed time as well as the input and output arguments. This would also allow us to add other profiling information later on, such as CPU time versus I/O time or memory consumption.
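To make the proposal concrete, here is a rough sketch of what such a debug mode could look like. `SketchPipeline`, the `(name, callable)` step format, and the choice to return the debug `dict` alongside the normal output are all assumptions made for illustration, not the project's actual API.

```python
import time


class SketchPipeline:
    """Hypothetical stand-in for a pipeline; not the project's real class."""

    def __init__(self, steps):
        # steps: list of (name, callable) pairs applied in order (assumed format)
        self.steps = steps

    def predict(self, data, debug=False):
        debug_info = {}
        for name, step in self.steps:
            start = time.perf_counter()
            output = step(data)
            elapsed = time.perf_counter() - start
            if debug:
                # Record what went in, what came out, and how long the step took
                debug_info[name] = {
                    "input": data,
                    "output": output,
                    "elapsed": elapsed,
                }
            data = output
        # Assumed convention: debug=True returns (result, debug dict)
        if debug:
            return data, debug_info
        return data


pipeline = SketchPipeline([
    ("double", lambda values: [v * 2 for v in values]),
    ("total", sum),
])
result, info = pipeline.predict([1, 2, 3], debug=True)
```

Keeping the per-step record as a plain `dict` means further keys (for example memory consumption) could be added later without changing the call signature.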
Record the time it takes to fit each primitive. This feature will come in handy when debugging pipelines and working out where the overhead is.
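As a minimal illustration of the request, the sketch below times the `fit` call of each primitive and ranks them by duration. `time_fit` and `SleepyPrimitive` are hypothetical names, and for simplicity every primitive is fit on the same input rather than on the previous step's output, as a real pipeline would do.

```python
import time


def time_fit(primitives, X, y=None):
    """Fit each (name, primitive) pair and return {name: seconds}."""
    timings = {}
    for name, primitive in primitives:
        start = time.perf_counter()
        primitive.fit(X, y)
        timings[name] = time.perf_counter() - start
    return timings


class SleepyPrimitive:
    """Hypothetical primitive whose fit only sleeps, to simulate work."""

    def __init__(self, delay):
        self.delay = delay

    def fit(self, X, y=None):
        time.sleep(self.delay)


timings = time_fit(
    [("fast", SleepyPrimitive(0.0)), ("slow", SleepyPrimitive(0.05))],
    X=[[0]],
)
# Names ordered from slowest to fastest, to show where the overhead is
slowest_first = sorted(timings, key=timings.get, reverse=True)
```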