NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0
4.98k stars, 609 forks

Checkpoint refactoring - recognize checkpoints by operator instance name. #5503

Closed: mzient closed this 2 weeks ago

mzient commented 3 weeks ago

Category:

Refactoring (Redesign of existing code that doesn't affect functionality)

Description:

Identify operator checkpoints by operator instance name rather than by their incidental order in the checkpoint. Get operator instances from the executor (by name) rather than from the lowered graph (by index). Rationale:

Additionally, OpCheckpoint is now only forward-declared in operator.h, because most operators are stateless and don't need to depend on the OpCheckpoint definition.
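
A minimal, self-contained sketch of the two ideas above: checkpoints keyed by operator instance name, and an operator interface that needs only a forward declaration of the checkpoint type. All names here (`OperatorSketch`, `OpCheckpointSketch`, `ExecutorSketch`, `GetOperator`, `Save`, `Restore`) are hypothetical and chosen for illustration; this is not DALI's actual API or the code changed in this PR.

```cpp
#include <cassert>
#include <map>
#include <string>

// Forward declaration only: the operator interface mentions the checkpoint
// type by reference, so stateless operators never need its full definition.
struct OpCheckpointSketch;

// Hypothetical operator interface (an assumption, not DALI's Operator class).
struct OperatorSketch {
  virtual ~OperatorSketch() = default;
  // Stateless operators keep these default no-op implementations.
  virtual void SaveState(OpCheckpointSketch &) const {}
  virtual void RestoreState(const OpCheckpointSketch &) {}
};

// Full definition, needed only by stateful operators and checkpointing code.
struct OpCheckpointSketch {
  std::string operator_name;
  int state = 0;
};

// A stateful operator: it must see the full checkpoint definition.
struct CounterOp : OperatorSketch {
  int counter = 0;
  void SaveState(OpCheckpointSketch &cpt) const override { cpt.state = counter; }
  void RestoreState(const OpCheckpointSketch &cpt) override { counter = cpt.state; }
};

// A stateless operator: it inherits the no-op save/restore.
struct StatelessOp : OperatorSketch {};

// Hypothetical executor exposing operator instances by name.
struct ExecutorSketch {
  std::map<std::string, OperatorSketch *> ops;
  OperatorSketch *GetOperator(const std::string &name) const {
    auto it = ops.find(name);
    return it == ops.end() ? nullptr : it->second;
  }
};

// The checkpoint is keyed by instance name, not by position in a graph.
using CheckpointSketch = std::map<std::string, OpCheckpointSketch>;

CheckpointSketch Save(const ExecutorSketch &exec) {
  CheckpointSketch out;
  for (const auto &entry : exec.ops) {
    OpCheckpointSketch cpt;
    cpt.operator_name = entry.first;
    entry.second->SaveState(cpt);
    out.emplace(entry.first, cpt);
  }
  return out;
}

// Restores every operator found by name; returns how many were restored.
int Restore(ExecutorSketch &exec, const CheckpointSketch &ckpt) {
  int restored = 0;
  for (const auto &entry : ckpt) {
    if (OperatorSketch *op = exec.GetOperator(entry.first)) {
      op->RestoreState(entry.second);
      ++restored;
    }
  }
  return restored;
}
```

Because lookup goes through the name, restoring into an executor whose operators were registered in a different order, or which lacks one of the saved operators, still matches each checkpoint entry to the intended instance. An index-based scheme would depend on the order in which the graph happened to be lowered.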

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

Tests adjusted (not added)

Checklist

Documentation

DALI team only

Requirements

REQ IDs: N/A

JIRA TASK: DALI-3979

dali-automaton commented 3 weeks ago

CI MESSAGE: [15594081]: BUILD STARTED

dali-automaton commented 3 weeks ago

CI MESSAGE: [15594721]: BUILD STARTED

dali-automaton commented 3 weeks ago

CI MESSAGE: [15594721]: BUILD FAILED

dali-automaton commented 3 weeks ago

CI MESSAGE: [15740861]: BUILD STARTED

dali-automaton commented 3 weeks ago

CI MESSAGE: [15740861]: BUILD PASSED

dali-automaton commented 2 weeks ago

CI MESSAGE: [15766148]: BUILD STARTED

dali-automaton commented 2 weeks ago

CI MESSAGE: [15768135]: BUILD STARTED

dali-automaton commented 2 weeks ago

CI MESSAGE: [15768135]: BUILD PASSED