Open rusheb opened 1 year ago
In get_dataloader in training.py we are adding dataset_seq_length_stats to the W&B summary, which produces a lot of clutter.
We should just log this to the console and remove it from the summary.
The dataset_seq_length_stats_summary is also being logged twice:
logger.summary({"dataset_seq_len_stats_summary": length_stats.summary()})
logger.summary(
    {"dataset_seq_len_stats": length_stats.serialize(typecast=lambda x: str(x))}
)
This is because the call to length_stats.serialize also includes the summary. We should remove the redundant first line.
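A rough sketch of what the proposed change could look like, assuming the stats are logged once to the console through Python's standard logging module instead of being pushed to the W&B summary. FakeLengthStats and report_length_stats are hypothetical names used only for illustration; the real length_stats object and logger in training.py may behave differently.

```python
import json
import logging

console = logging.getLogger("training")


class FakeLengthStats:
    """Hypothetical stand-in for the project's length_stats object."""

    def __init__(self, lengths):
        self.lengths = list(lengths)

    def summary(self):
        return {
            "count": len(self.lengths),
            "min": min(self.lengths),
            "max": max(self.lengths),
        }

    def serialize(self, typecast=str):
        # Assumption taken from the issue text: the serialized form
        # already contains the summary, so a separate summary call is redundant.
        return {
            "summary": {k: typecast(v) for k, v in self.summary().items()},
            "lengths": [typecast(x) for x in self.lengths],
        }


def report_length_stats(length_stats, log=console):
    """Log the serialized stats once to the console and return the message."""
    payload = length_stats.serialize(typecast=lambda x: str(x))
    message = "dataset_seq_len_stats: " + json.dumps(payload, sort_keys=True)
    log.info(message)
    return message
```

With this shape, the summary appears exactly once (inside the serialized payload) and nothing is written to the W&B summary.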
This no longer seems to happen in the code. I'm unsure (low confidence, and likely wrong), but it is probably worth simply deprioritizing.