Open DidiD1 opened 4 months ago
Hello, do the file sizes look correct (e.g., the training set should be ~144M)? If not, you might need to install Git Large File Storage (Git LFS) first and then run git clone again: https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage
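One quick way to tell whether a clone fetched the real data or only Git LFS pointer stubs is to look at the first bytes of each file. The sketch below assumes the standard pointer format, a small text file whose first line begins with `version https://git-lfs.github.com/spec/`; the function names are hypothetical and not part of the repository.

```python
import os

# Start of the first line of a Git LFS pointer file, i.e. the small text
# stub left in place of the real file when LFS content was not downloaded.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/"


def is_lfs_pointer(path):
    """Return True if `path` looks like a Git LFS pointer rather than real data."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def describe(path):
    """Summarize a file's size and whether it appears to be an LFS pointer."""
    size = os.path.getsize(path)
    if is_lfs_pointer(path):
        kind = "LFS pointer (data not downloaded)"
    else:
        kind = "regular file"
    return f"{path}: {size} bytes, {kind}"
```

Calling `describe(...)` on each downloaded TFRecord file should report a regular file of roughly the expected size; a result of a few hundred bytes flagged as an LFS pointer would suggest re-cloning with Git LFS installed.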
Updated README.
We have also added a simple script to show how to retrieve the labels from the dataset at https://github.com/google-research/google-research/blob/master/richhf_18k/parse_tfrecord_file.py.
Great work! When I try to read the TFRecord data, I get errors, and the TFRecord file seems to be broken. When I use num_elements = tf.data.experimental.cardinality(record_iter).numpy() to check the number of elements, the terminal shows 'Number of elements in dataset: -2'. Could you release a script to help read or update the TFRecord file? Thanks for your answer!
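For context, -2 is the value of `tf.data.experimental.UNKNOWN_CARDINALITY`, which tf.data reports when the element count cannot be determined without reading the data, as is normally the case for TFRecord files, so on its own it does not show that the file is broken. To get an actual count, the records have to be walked. The sketch below does that without TensorFlow by following the TFRecord framing (an 8-byte little-endian length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload). It is a minimal check under stated assumptions: the function name is hypothetical, and the CRCs are skipped rather than verified.

```python
import os
import struct


def count_tfrecord_records(path):
    """Count records by walking the TFRecord framing.

    Raises ValueError if the file ends in the middle of a record.
    CRC fields are skipped, not verified.
    """
    size = os.path.getsize(path)
    offset = 0
    count = 0
    with open(path, "rb") as f:
        while offset < size:
            f.seek(offset)
            header = f.read(8)  # uint64 payload length, little-endian
            if len(header) < 8:
                raise ValueError(f"truncated header after {count} records")
            (length,) = struct.unpack("<Q", header)
            # length field + length CRC + payload + payload CRC
            end = offset + 8 + 4 + length + 4
            if end > size:
                raise ValueError(f"truncated record after {count} records")
            offset = end
            count += 1
    return count
```

If this raises on the very first record, the file is probably not a TFRecord at all (for example, a Git LFS pointer stub); if it returns a plausible count, the file framing is intact and the read errors likely come from the parsing code.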