Breakend / experiment-impact-tracker


Empty log files from provided example my_experiment.py #13

Closed: forresti closed this issue 4 years ago

forresti commented 4 years ago

I am just getting started using this repo, so my apologies for the naive question. When I run python examples/my_experiment.py, it prints:

loading region bounding boxes for computing carbon emissions region, this may take a moment...
 454/454... rate=553.06 Hz, eta=0:00:00, total=0:00:00, wall=13:23 PST
Done!
experiment_impact_tracker.compute_tracker.ImpactTracker - WARNING - Gathering system info for reproducibility...
experiment_impact_tracker.compute_tracker.ImpactTracker - WARNING - Done initial setup and information gathering...
experiment_impact_tracker.compute_tracker.ImpactTracker - WARNING - Starting process to monitor power
experiment_impact_tracker.compute_tracker.ImpactTracker - WARNING - Datapoint timestamp took 0.00012373924255371094 seconds
Pass: 9
Pass: 19
experiment_impact_tracker.compute_tracker.ImpactTracker - WARNING - Datapoint rapl_power_draw_absolute took 2.107639789581299 seconds
Pass: 29
Pass: 39
Pass: 49
Pass: 59
Pass: 69
Pass: 79
Pass: 89
experiment_impact_tracker.compute_tracker.ImpactTracker - WARNING - Datapoint nvidia_draw_absolute took 6.554447650909424 seconds
experiment_impact_tracker.compute_tracker.ImpactTracker - WARNING - Datapoint cpu_count_adjusted_average_load took 0.00011205673217773438 seconds
Pass: 99
Please find your experiment logs in: /tmp/tmpw3feackg

OK, so what's in /tmp/tmpw3feackg?

cd /tmp/tmpw3feackg/impacttracker
ls -lah
total 64
-rw-rw-r-- 1 forrest forrest     0 Jun 15 13:23 data.json
-rw-rw-r-- 1 forrest forrest     0 Jun 15 13:23 impact_tracker_log.log
-rw-rw-r-- 1 forrest forrest 63309 Jun 15 13:23 info.pkl

Hmm, isn't it odd that 2 of the 3 files are empty?

But, let's see what's in the info.pkl file.

# Load and inspect the info.pkl file written by the tracker
import pickle

with open('/tmp/tmpw3feackg/impacttracker/info.pkl', 'rb') as f:
    x = pickle.load(f)

Here's a pseudocode summary of what's in the pickle file:

{
  'python_package_info': <list of packages in my conda environment>,
  'cpu_info': <CPU model, CPU frequency, clock frequency, L1/L2/L3 cache sizes>,
  'gpu_info': <gpu name, total memory, driver version, cuda version, for each GPU>,
  'experiment_impact_tracker_version': '0.1.8',

  ... and now, the interesting stuff ...

  'region': {
    'type': 'Feature',
    'geometry': <shapely.geometry.multipolygon.MultiPolygon at 0x7fddd6d8e580>,
    'properties': {'zoneName': 'US-CA'},
    'id': 'US-CA'
  },
  'region_carbon_intensity_estimate': {
    '_source': 'https://github.com/tmrowco/electricitymap-contrib/blob/master/config/co2eq_parameters.json (ElectricityMap Average, 2019)',
    'carbonIntensity': 250.73337617853463,
    'fossilFuelRatio': 0.4888871173733636,
    'renewableRatio': 0.4283732563775541
  },
  'experiment_end': datetime.datetime(2020, 6, 15, 13, 23, 27, 343920)
}
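
For instance, pulling the region and the carbon-intensity estimate out of the loaded dict (key names taken from the summary above; I assume the intensity is in gCO2eq/kWh, since that is what ElectricityMap reports):

# 'x' is the dict loaded from info.pkl above.
zone = x['region']['properties']['zoneName']                           # 'US-CA'
intensity = x['region_carbon_intensity_estimate']['carbonIntensity']   # ~250.7
print(f"Region {zone}: estimated carbon intensity {intensity:.1f} gCO2eq/kWh")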

Questions

  1. When things are working correctly, what information should appear in the data.json and impact_tracker_log.log files?
  2. In the pickle file, it strikes me as odd that more elementary stats such as the total energy (in kilowatt-hours) and the total runtime aren't reported. Are those typically reported in the data.json and impact_tracker_log.log files that are empty in my case?
Breakend commented 4 years ago

Hi,

Sorry this wasn't clearer! We're working on clearing up points of confusion as people bring them up.

The data.json file contains the power/energy readings logged during training. The pickle file is meant for one-time information recorded at the start/end of training (e.g., which CPU you're using).

Since we're using a polling method, energy readings only show up in data.json once every few seconds: we're intending this for long-running jobs, and it takes a few seconds to get accurate CPU/GPU readings of the workload.

It looks like you might have an exceptionally fast system, so the monitor didn't get a chance to log anything. We'll make the example a bit longer so that it's guaranteed to output something, and we'll try to make this clearer; tagging #4 to track it. You can also lengthen the loop in the example yourself to confirm that this fixes the issue and logs the correct information, e.g., change the iteration count to 1000 or so: https://github.com/Breakend/experiment-impact-tracker/blob/master/examples/my_experiment.py#L59
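
For reference, the pattern in that example is roughly the following (just a sketch, not the exact example code; the logdir name here is illustrative):

from experiment_impact_tracker.compute_tracker import ImpactTracker

# Point the tracker at a log directory; it writes an impacttracker/ subfolder
# there containing data.json, impact_tracker_log.log, and info.pkl.
tracker = ImpactTracker("my_logdir")
tracker.launch_impact_monitor()  # starts the separate power-monitoring process

for i in range(1000):  # bumping this up from 100 gives the poller time to record datapoints
    pass  # your actual workload / training step goes here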

Hope that helps!

forresti commented 4 years ago

@Breakend Thanks for the very fast reply!

Per your suggestion, I reran the experiment with 1000 iterations instead of 100. After doing that, data.json is now populated with datapoints, but impact_tracker_log.log is still empty.

Is this normal?

Breakend commented 4 years ago

Yup, that's normal. The data.json file contains the energy readings while the .log file is meant for exceptions and that sort of thing so we can debug better.
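
If you want to eyeball what got logged, something like this works (a rough sketch: it assumes data.json is newline-delimited JSON with one polled datapoint per line, using the field names from the warnings in your log):

import json

# Rough sketch: read data.json assuming one JSON-encoded datapoint per line.
logdir = "/tmp/tmpw3feackg"  # replace with the directory printed at the end of your run
with open(f"{logdir}/impacttracker/data.json") as f:
    datapoints = [json.loads(line) for line in f if line.strip()]

print(len(datapoints), "datapoints logged")
for d in datapoints[:3]:
    print(d.get("timestamp"), d.get("rapl_power_draw_absolute"), d.get("nvidia_draw_absolute"))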

forresti commented 4 years ago

Great. Thanks again for the help.

I also just wanted to share that I was able to use the provided generate-carbon-impact-statement script to produce a summary of the results.

Specifically, when I do this:

#!/bin/bash

input_dir=/tmp/tmpg_btwq_9
python scripts/generate-carbon-impact-statement "$input_dir" "USA"

...it prints:

This work contributed 0.002 kg of $\text{CO}_{2eq}$ to the atmosphere and used 0.008 kWh of electricity, having a USA-specific social cost of carbon of \$0.00 (\$0.00, \$0.00). Carbon accounting information can be found here: \url{<TODO: Insert URL of generated HTML appendix with all info here>}. The social cost of carbon uses models from \citep{ricke2018country} and this statement and carbon emissions information was generated using \emph{experiment-impact-tracker}\citep{henderson2019climate}.
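
As a rough sanity check (assuming the script essentially multiplies the measured energy by the region carbon intensity recorded in info.pkl; I haven't verified its internals):

# 0.008 kWh * ~250.7 gCO2eq/kWh (the 'carbonIntensity' estimate above), converted to kg
print(0.008 * 250.73337617853463 / 1000)  # ~0.002 kg CO2eq, which matches the statement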