powerapi-ng / pyJoules

A Python library to capture the energy consumption of code snippets
MIT License

Frequent zero consumption for nvidia device #17

Open nikhil153 opened 3 years ago

nikhil153 commented 3 years ago

I am monitoring the energy consumption of a PyTorch model. I sample several times during the training loop using EnergyContext and record (code snippet below). More than half of the samples show zero consumption. See the attached partial log.
joules_sample.log

Any ideas?

for i, (images, labels) in enumerate(train_loader):
    # move the batch (images, labels) to the target device
    images = images.to(device)
    labels = labels.to(device)

    # zero the parameter gradients
    optimizer.zero_grad()

    # Monitor joules sparingly
    if (i % monitor_interval) == (monitor_interval-1):
        if monitor_joules:
            # pyjoules
            with EnergyContext(handler=pd_handler, start_tag='forward') as ctx:
                # forward + backward + optimize
                outputs = model(images)
                ctx.record(tag='loss')
                loss = criterion(outputs, labels)
                ctx.record(tag='backward')
                loss.backward()
                ctx.record(tag='step')
                optimizer.step()
                ctx.record(tag='overhead')
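
For reference, the sampling condition in the loop above fires on the last batch of every block of monitor_interval batches. A minimal standalone sketch of that condition follows. The helper name monitored_indices and the values 12 and 4 are assumptions chosen for illustration and do not come from the original post.

```python
def monitored_indices(num_batches, monitor_interval):
    """Return the batch indices that satisfy the sampling condition.

    Mirrors the check used in the training loop:
    (i % monitor_interval) == (monitor_interval - 1)
    """
    return [
        i for i in range(num_batches)
        if (i % monitor_interval) == (monitor_interval - 1)
    ]


# Hypothetical values: 12 batches, one monitored batch out of every 4.
print(monitored_indices(12, 4))
```

With these assumed values, only three of the twelve batches are wrapped in an EnergyContext, and each monitored batch produces one sample per tag.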