hendrycks / math

The MATH Dataset (NeurIPS 2021)
MIT License

File lists for mathematica #5

crazysal closed 3 years ago

crazysal commented 3 years ago

For pre-training on AMPS, where are the files "no_steps_flist_relative.txt" and "with_steps_flist_relative.txt"?

Are they the concatenation of all *.txt files in the folder "data_file_lists"?

There is a missing swap file "make_flists.py.swp" in the mathematica root folder. If possible, could you please share it?

For example, is this what is expected:

filenames = ['no_steps_flist_relative_algebra.txt',
             'no_steps_flist_relative_calculus.txt',
             'no_steps_flist_relative_counting_and_statistics.txt',
             'no_steps_flist_relative_geometry.txt',
             'no_steps_flist_relative_linear_algebra.txt',
             'no_steps_flist_relative_number_theory.txt']
with open('./no_steps_flist_relative.txt', 'w') as outfile:
    for fname in filenames:
        with open(fname) as infile:
            for line in infile:
                outfile.write(line)

hendrycks commented 3 years ago

make_flists.py.swp

Swap files should not be necessary to run things.

If the files are missing from the tar, then feel free to ignore them, as some filenames were renamed.
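
For readers who still want combined lists, here is a minimal sketch of the concatenation the question describes. The thread does not confirm that the combined files are meant to be plain concatenations, so treat that as an assumption. The helper name `concat_file_lists` is hypothetical, and the `with_steps_flist_relative_*.txt` naming is assumed by analogy with the `no_steps` names quoted above.

```python
from pathlib import Path


def concat_file_lists(list_dir, prefix, out_path):
    """Concatenate every `<prefix>_*.txt` in list_dir into out_path.

    Hypothetical helper: assumes the per-subject lists follow the
    `<prefix>_<subject>.txt` naming shown in the question.
    Returns the list of per-subject files that were merged.
    """
    list_dir = Path(list_dir)
    out_path = Path(out_path)
    # Sort for a deterministic order and never read the output file itself.
    parts = sorted(
        p for p in list_dir.glob(f"{prefix}_*.txt")
        if p.resolve() != out_path.resolve()
    )
    with out_path.open("w") as outfile:
        for part in parts:
            text = part.read_text()
            outfile.write(text)
            # A list without a trailing newline would otherwise fuse
            # its last path with the first path of the next list.
            if text and not text.endswith("\n"):
                outfile.write("\n")
    return parts


if __name__ == "__main__":
    # Assumed layout: per-subject lists sit in ./data_file_lists.
    for prefix in ("no_steps_flist_relative", "with_steps_flist_relative"):
        used = concat_file_lists("data_file_lists", prefix, f"{prefix}.txt")
        print(f"{prefix}.txt: merged {len(used)} lists")
```

Globbing on the prefix avoids hard-coding the subject names, so a renamed or missing per-subject list is simply skipped rather than raising an error.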