dump the training dynamics to json

Hi,

Thanks for the nice work!

I have a question about L149: https://github.com/allenai/cartography/blob/c7865383e421a91611c2f4e79d1ffbfb7850f4f4/cartography/selection/train_dy_filtering.py#L149

I don't understand why this list comprehension enumerates over correctness_. I may be misunderstanding something, but I think it should iterate over all the guids instead. Otherwise, the statistics for the entire training set cannot be dumped, because, if I read it correctly, guid in this loop can only take 1 + (number of epochs) distinct values.

  df = pd.DataFrame([[guid,
                      i,
                      threshold_closeness_[guid],
                      confidence_[guid],
                      variability_[guid],
                      correctness_[guid],
                      forgetfulness_[guid],
                      ] for i, guid in enumerate(correctness_)], columns=column_names)
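
To make the question concrete, here is a minimal sketch with made-up data. The names correctness_by_guid, correctness_values, and enumerated_guids, as well as the toy values, are my own assumptions and do not come from the repository. It contrasts the two cases I can think of: if correctness_ is a dict keyed by guid, enumerating it walks over every guid; if it holds the correctness values themselves, the loop variable only takes a handful of distinct values, which is what I am worried about.

```python
# Toy stand-ins (hypothetical data, not taken from the cartography repository).

# Case A: correctness_ is a dict that maps each guid to its correctness count.
correctness_by_guid = {101: 3, 102: 0, 103: 3, 104: 1}

# Case B: correctness_ is a plain list of per-example correctness counts.
correctness_values = [3, 0, 3, 1]


def enumerated_guids(container):
    """Collect what the loop variable `guid` would be bound to
    in `for i, guid in enumerate(container)`."""
    return [guid for i, guid in enumerate(container)]


guids_a = enumerated_guids(correctness_by_guid)
guids_b = enumerated_guids(correctness_values)

# Iterating a dict yields its keys, so every guid is visited once.
print(guids_a)  # [101, 102, 103, 104]

# Iterating a list yields its values, so `guid` is really a correctness count.
print(guids_b)  # [3, 0, 3, 1]
```

If the first case is what actually happens in train_dy_filtering.py, then the loop does cover the whole training set and I simply misread it.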

Thank you!
