We need to log less, and structure the logs a bit more clearly. We don't need to fully fix this, but we should at least deal with the easy cases before the launch.
All of these lines are either unhelpful or fully redundant with other lines in the logs (one cheap fix is sketched after the list):
04/26 05:26:16 PM: Waiting on git info....
All the task-specific overrides that aren't used. (Things like "edges-coref-ontonotes": { ... })
04/26 05:26:16 PM: Saved config to /Users/Bowman/Drive/JSALT/jiant-demo/mtl-sst-mrpc/params.conf
04/26 05:26:25 PM: Your label namespace was 'idxs'. We recommend you use a namespace ending with 'labels' or 'tags', so we don't add UNK and PAD tokens by default to your vocabulary. See documentation for non_padded_namespaces parameter in Vocabulary.
04/26 05:26:31 PM: Not using character embeddings!
04/26 05:26:31 PM: Done initializing parameters; the following parameters are using their default initialization from their code
04/26 05:26:31 PM: _text_field_embedder.token_embedder_words.weight
04/26 05:26:31 PM: Using BoW architecture for shared encoder!
04/26 05:26:31 PM: Converting Params object to dict; logging of default values will not occur when dictionary parameters are used subsequently.
04/26 05:26:31 PM: CURRENTLY DEFINED PARAMETERS: ... [followed by a ton of redundant lines]
04/26 05:26:31 PM: Will run the following steps: Evaluating model on tasks: mrpc
04/26 05:26:31 PM: In strict mode because do_target_task_training is off. Will crash if any tasks are missing from the checkpoint.
04/26 05:26:31 PM: Task 'mrpc': sorting predictions by 'idx'
04/26 05:25:40 PM: Sampling tasks proportional to number of training examples.
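One cheap way to handle most of these, sketched with the standard logging module (this is not jiant's actual logging setup, just an illustration): demote the chatty messages from INFO to DEBUG, and expose the DEBUG level through a verbosity flag so the detail is still available when we need it.

    import logging

    log = logging.getLogger(__name__)

    def configure_logging(verbose: bool = False) -> None:
        # Format chosen to match the existing "04/26 05:26:16 PM:" prefix.
        logging.basicConfig(
            format="%(asctime)s: %(message)s",
            datefmt="%m/%d %I:%M:%S %p",
            level=logging.DEBUG if verbose else logging.INFO,
        )

    # Messages from the list above move from log.info to log.debug, so
    # they only show up when verbose logging is requested:
    configure_logging(verbose=False)
    log.debug("Waiting on git info....")
    log.debug("Not using character embeddings!")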
These are helpful, but need a cleaner format:
04/26 05:26:31 PM: >> Trainable param mrpc_mdl.pooler.project.weight: torch.Size([256, 50]) = 12800
-> Just show the name and size, with some indentation since this is a long list.
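For example, assuming a standard PyTorch model (the helper name here is made up):

    import logging

    import torch

    log = logging.getLogger(__name__)

    def log_trainable_params(model: torch.nn.Module) -> None:
        # One indented line per parameter, showing just the name and
        # shape; the element count is easy to recompute, so drop it.
        log.info("Trainable params:")
        for name, param in model.named_parameters():
            if param.requires_grad:
                log.info("    %s  %s", name, tuple(param.shape))

This would log, e.g., "    mrpc_mdl.pooler.project.weight  (256, 50)".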
04/26 05:25:47 PM: Statistic: sst_accuracy
04/26 05:25:47 PM: training: 0.542500
04/26 05:25:47 PM: validation: 0.516055
-> This can be a single line.
04/26 05:25:47 PM: mrpc: trained on 0 batches, 0.000 epochs
04/26 05:25:47 PM: sst: trained on 50 batches, 0.006 epochs
-> This can be a single line too; a sketch covering this and the previous case follows.
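A sketch of both single-line formats, with values copied from the examples above (the surrounding code is invented):

    import logging

    log = logging.getLogger(__name__)

    # Each statistic on one line:
    log.info("Statistic: sst_accuracy  training: %.6f  validation: %.6f",
             0.542500, 0.516055)

    # All per-task progress on one line:
    progress = {"mrpc": (0, 0.000), "sst": (50, 0.006)}
    log.info("; ".join("%s: trained on %d batches, %.3f epochs" % (task, n, ep)
                       for task, (n, ep) in progress.items()))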
Epoch 8: reducing learning rate of group 0 to 2.5000e-05.
-> This is printed to stdout rather than being written to the log.
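That message matches what PyTorch's ReduceLROnPlateau prints when constructed with verbose=True. Assuming that's where it comes from, one fix is to leave verbose off and report the change through the logger ourselves; a sketch:

    import logging

    import torch

    log = logging.getLogger(__name__)

    def step_and_log(scheduler: torch.optim.lr_scheduler.ReduceLROnPlateau,
                     optimizer: torch.optim.Optimizer,
                     metric: float, epoch: int) -> None:
        # Step the scheduler with verbose left off, then log (rather
        # than print) any learning-rate change it made.
        old_lrs = [group["lr"] for group in optimizer.param_groups]
        scheduler.step(metric)
        for i, old_lr in enumerate(old_lrs):
            new_lr = optimizer.param_groups[i]["lr"]
            if new_lr != old_lr:
                log.info("Epoch %d: reducing learning rate of group %d to %.4e.",
                         epoch, i, new_lr)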
04/26 05:24:33 PM: mrpc_acc_f1, 1, mrpc_loss: 0.76268, sst_loss: 0.69719, macro_avg: 0.44561, micro_avg: 0.35293, mrpc_acc_f1: 0.74803, mrpc_accuracy: 0.68382, mrpc_f1: 0.81223, mrpc_precision: 0.68382, mrpc_recall: 1.00000, sst_accuracy: 0.51720
-> It's not clear what the '1' means.
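Whatever the bare '1' is (a validation-pass counter?), labeling it in the format string would remove the guesswork. A sketch, with "val_pass" as a made-up name for that field:

    import logging

    log = logging.getLogger(__name__)

    metrics = {"mrpc_loss": 0.76268, "sst_loss": 0.69719, "macro_avg": 0.44561}
    log.info("%s, val_pass %d, %s", "mrpc_acc_f1", 1,
             ", ".join("%s: %.5f" % (k, v) for k, v in metrics.items()))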
04/26 05:24:33 PM: micro_avg, 6, mrpc_loss: 0.76168, sst_loss: 0.68541, macro_avg: 0.46682, micro_avg: 0.38183, mrpc_acc_f1: 0.74803, mrpc_accuracy: 0.68382, mrpc_f1: 0.81223, mrpc_precision: 0.68382, mrpc_recall: 1.00000, sst_accuracy: 0.55963
04/26 05:24:33 PM: macro_avg, 6, mrpc_loss: 0.76168, sst_loss: 0.68541, macro_avg: 0.46682, micro_avg: 0.38183, mrpc_acc_f1: 0.74803, mrpc_accuracy: 0.68382, mrpc_f1: 0.81223, mrpc_precision: 0.68382, mrpc_recall: 1.00000, sst_accuracy: 0.55963
-> This is logged twice in some cases. Maybe when the same task is in both pretraining and target training?
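If the duplication really does come from one task appearing in both lists, deduplicating before logging would be a cheap guard. A purely illustrative sketch (the right fix depends on where the second call actually comes from):

    import logging

    log = logging.getLogger(__name__)
    _seen = set()

    def log_metrics_once(tag: str, val_pass: int, metrics_str: str) -> None:
        # Skip the line if an identical metrics report was already
        # logged this validation pass.
        key = (val_pass, metrics_str)
        if key not in _seen:
            _seen.add(key)
            log.info("%s, %d, %s", tag, val_pass, metrics_str)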