Improving documentation on each run

CarolinaFurtado commented 3 years ago

First idea that comes to mind would be to add an input argument for debugging/tracking a specific run, such as:

python3 train_segmentation_model.py --gcp-bucket gs://necstlab-sandbox --config-file configs/config_sandbox/tf_version_debug//train-small-3class_tfversion_debug.yaml --message "trial run to test something"

The --message argument would default to None, and its value would be added to the metadata file.

It would be helpful when debugging and also if we are running many things at the same time. Thoughts? @Josh-Joseph @rak5216

CarolinaFurtado commented 3 years ago

We could also add hardware and software versions of whatever is being used

CarolinaFurtado commented 3 years ago

System and version information

Created a file for system-related information, metadata_sys.yaml. It includes:

linux version
ram
cpu
gpu
host
python, tensorflow and segmentation_models versions
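
For reference, a minimal sketch of how the fields listed above could be collected with the standard library. The function names and dictionary keys here are hypothetical and are not taken from the repo; the GPU lookup assumes an NVIDIA machine with nvidia-smi on the PATH, and the RAM lookup assumes a Unix-like system.

```python
import os
import platform
import shutil
import socket
import subprocess
import sys
from importlib import metadata as importlib_metadata


def package_version(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return importlib_metadata.version(name)
    except importlib_metadata.PackageNotFoundError:
        return None


def total_ram_gb():
    """Total physical RAM in GB on Unix-like systems; None if it cannot be read."""
    try:
        n_bytes = os.sysconf('SC_PAGE_SIZE') * os.sysconf('SC_PHYS_PAGES')
        return round(n_bytes / 1024 ** 3, 2)
    except (AttributeError, ValueError, OSError):
        return None


def gpu_names():
    """GPU names reported by nvidia-smi, or an empty list if it is unavailable."""
    if shutil.which('nvidia-smi') is None:
        return []
    result = subprocess.run(
        ['nvidia-smi', '--query-gpu=name', '--format=csv,noheader'],
        capture_output=True, text=True, check=False)
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def collect_system_metadata():
    """Gather the system fields into a plain dict that can be dumped to YAML."""
    return {
        'os': platform.platform(),
        'ram_gb': total_ram_gb(),
        'cpu': platform.processor() or platform.machine(),
        'cpu_count': os.cpu_count(),
        'gpu': gpu_names(),
        'host': socket.gethostname(),
        'python': sys.version.split()[0],
        # Package names are assumptions; a package that is not found gives None.
        'tensorflow': package_version('tensorflow'),
        'segmentation_models': package_version('segmentation_models'),
    }
```

The result is a plain dict of strings, numbers and lists, so it should pass through yaml.safe_dump without custom representers.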

Message

Also, added a new argument to train_segmentation_model.py: --message (defaults to None)

    argparser.add_argument(
        '--message',
        type=str,
        default=None,
        help='A str message the user wants to leave; the default is None.')

This information is added to the metadata.yaml file

will do the same for test_segmentation_model.py, train_segmentation_prediction_thresholds.py and infer_segmentation.py
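
A rough sketch of how the parsed message could end up in the metadata dict before it is written out. The helper name add_message_to_metadata and the 'message' key are assumptions for illustration, not the actual code in the scripts.

```python
import argparse


def build_argparser():
    """Parser with only the new --message argument (other arguments omitted)."""
    argparser = argparse.ArgumentParser()
    argparser.add_argument(
        '--message',
        type=str,
        default=None,
        help='A str message the user wants to leave; the default is None.')
    return argparser


def add_message_to_metadata(metadata, message):
    """Return a copy of metadata with the run message recorded under 'message'."""
    return {**metadata, 'message': message}


# Example: parse an explicit argument list instead of sys.argv.
args = build_argparser().parse_args(['--message', 'trial run to test something'])
metadata = add_message_to_metadata({'gcp_bucket': 'gs://necstlab-sandbox'}, args.message)
```

When --message is omitted, the key is still present with the value None, so every metadata file keeps the same set of keys.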

CarolinaFurtado commented 3 years ago

Train

New file: metadata_sys_file_name = 'metadata_sys.yaml'

Train_threshold

Joined to the output file:

output_file_name = 'model_thresholds_' + datetime.now(pytz.UTC).strftime('%Y%m%dT%H%M%SZ') + '.yaml'

output_data = {
    'final_trained_prediction_thresholds': trained_prediction_thresholds,
    'metadata': metadata,
    'metadata_sys': metadata_sys
}

with Path(train_thresh_id_dir, output_file_name).open('w') as f:
    yaml.safe_dump(output_data, f)

Test

New file: metadata_sys_file_name = 'metadata_sys_' + test_datetime + '.yaml'

Infer

New file: metadata_sys_file_name = 'metadata_sys_' + infer_datetime + '.yaml'
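
As an illustration of the naming scheme, here is a small sketch of a helper that builds these timestamped file names. It assumes test_datetime and infer_datetime use the same '%Y%m%dT%H%M%SZ' UTC format as the thresholds output file, which is not confirmed above; timestamped_file_name is a hypothetical helper, and it uses the standard library's timezone.utc instead of pytz.

```python
from datetime import datetime, timezone


def timestamped_file_name(prefix, when=None, suffix='.yaml'):
    """Build names like 'metadata_sys_20210304T050607Z.yaml' from a UTC datetime."""
    if when is None:
        when = datetime.now(timezone.utc)
    return prefix + when.strftime('%Y%m%dT%H%M%SZ') + suffix


# Example with a fixed timestamp so the result is reproducible.
example_name = timestamped_file_name(
    'metadata_sys_', datetime(2021, 3, 4, 5, 6, 7, tzinfo=timezone.utc))
```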

CarolinaFurtado commented 3 years ago

Should I just add metadata_sys to the metadata file that already exists? It might become too messy, since the files already have a lot of information.

@rak5216 @Josh-Joseph, comments on this topic?

rak5216 commented 3 years ago

three thumbs up! i like this! let's put everything into one metadata file, especially for reliability/simplicity in pretrained models and transferring their details. i think simplest way is fine: take the metadata_system (change to 'system') and:

metadata = {
    .....
    'metadata_system': metadata_system
}

and then in train thresholds the output_data lines don't change since metadata_system is already inside metadata, leaving it as follows:

output_data = {
    'final_trained_prediction_thresholds': trained_prediction_thresholds,
    'metadata': metadata
}

also, it would be super cool to make the message the ssh window title while a given script is running so it's easy to manage several windows at once. but that doesn't seem straightforward
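
One possible approach to the window title, sketched under assumptions: many xterm-compatible terminals set their title when they receive the OSC 0 escape sequence, and that sequence travels over SSH like any other output. Whether a given SSH client or a multiplexer such as tmux or screen honors it varies, so treat this as something to try rather than a guaranteed fix. Both function names are hypothetical.

```python
import sys


def title_escape_sequence(title):
    """Return the xterm-style OSC 0 sequence that sets the window title."""
    # Drop non-printable characters so the message cannot break the sequence.
    clean = ''.join(ch for ch in title if ch.isprintable())
    return '\033]0;' + clean + '\007'


def set_terminal_title(title, stream=sys.stdout):
    """Write the title sequence, but only when the stream is a terminal."""
    if stream.isatty():
        stream.write(title_escape_sequence(title))
        stream.flush()


# Example: call once at script start with the --message value, if one was given.
set_terminal_title('trial run to test something')
```

The isatty check keeps the escape bytes out of log files when output is redirected.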

CarolinaFurtado commented 3 years ago

just joined the metadata_system to metadata! good to go once the other pull request is approved.
