CSSLab / maia-chess

Maia is a human-like neural network chess engine trained on millions of human games.
https://maiachess.com
GNU General Public License v3.0
964 stars, 121 forks

How to convert the .gz files made by trainingdata-tool into tensors? #70

Closed EaswarGn closed 2 months ago

EaswarGn commented 2 months ago

I used trainingdata-tool to convert many PGN games into an input format for lc0. The output was a bunch of .gz files. How would I convert these into tensors that I can use to train the model?

Thanks

reidmcy commented 2 months ago

There isn't a direct way to convert them, since the whole point of the .gz files is that they are fast to load during training. They contain the games as an array of vectors, and the conversion to tensors happens in parallel during training because training is batched. You can look at the function extract_inputs_outputs() (https://github.com/CSSLab/maia-chess/blob/master/move_prediction/train_maia.py#L139), as that is where the file contents are made into tensors, but again it's not a one-to-one conversion.
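As a rough illustration of the "array of vectors" layout, here is a minimal sketch that decompresses one chunk file and slices it into fixed-size records. The helper name read_records and the record_size parameter are hypothetical and not part of the maia-chess code; the correct record size depends on the training-data format version your tool emitted, so treat it as something to verify rather than a given.

```python
import gzip


def read_records(path, record_size):
    """Return the fixed-size binary records stored in one gzip chunk file.

    `record_size` is an assumption supplied by the caller; it must match
    the training-data format version that produced the file.
    """
    with gzip.open(path, "rb") as f:
        data = f.read()

    # A chunk should be a whole number of records; anything else suggests
    # the wrong record size or a truncated file.
    if len(data) % record_size != 0:
        raise ValueError(
            f"{len(data)} bytes is not a multiple of record size {record_size}"
        )

    return [data[i:i + record_size] for i in range(0, len(data), record_size)]
```

Each returned item is still raw bytes for a single position. Turning those bytes into tensors is the separate parsing step that the training code performs on whole batches.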

EaswarGn commented 2 months ago

I tried running that function on just one of the .gz files that was generated by trainingdata-tool. But I keep getting this error:

Traceback (most recent call last):
  File "/Users/easwar/Downloads/inputs/test.py", line 69, in <module>
    decode_gz_file('supervised-0/game_000000.gz')
  File "/Users/easwar/Downloads/inputs/test.py", line 54, in decode_gz_file
    print(extract_inputs_outputs(combined_content))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/easwar/Downloads/inputs/test.py", line 19, in extract_inputs_outputs
    unit_planes = tf.cast(tf.tile(unit_planes, [1, 1, 8, 8]), tf.float32)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.12/site-packages/tensorflow/python/ops/gen_array_ops.py", line 12043, in tile
    return tile_eager_fallback(
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.12/site-packages/tensorflow/python/ops/gen_array_ops.py", line 12089, in tile_eager_fallback
    _result = _execute.execute(b"Tile", 1, inputs=_inputs_flat, attrs=_attrs,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.12/site-packages/tensorflow/python/eager/execute.py", line 53, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __wrapped__Tile_device_/job:localhost/replica:0/task:0/device:CPU:0}} Expected multiples argument to be a vector of length 3 but got length 4 [Op:Tile]

Is there anything wrong with my process?

reidmcy commented 2 months ago

The function doesn't take a path. You will need to read the code, understand it, and modify it. There is no function in the published code that does what you want; that function is just the best starting point. If you just want to do fine-tuning, then look at the maia individual code release, as that has fine-tuning code, or you can just follow the README in this repo.
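For anyone who wants to inspect a single record before adapting the training code, the following is a minimal sketch only. It assumes the lc0 "V3" record layout (8276 bytes: a version field, 1858 policy floats, 104 eight-byte bit-planes, seven one-byte flags and counters, and a signed game result). That layout, the field names, and the helper decode_record are assumptions for illustration and are not taken from the maia-chess code; check them against the format your files actually use.

```python
import struct

# Assumed V3 layout (verify against your data before relying on it):
# version, policy floats, bit-planes, 4 castling flags, side to move,
# rule50 counter, move count, game result.
V3_STRUCT = struct.Struct("<4s7432s832sBBBBBBBb")


def decode_record(record):
    """Unpack one raw record into plain Python values."""
    if len(record) != V3_STRUCT.size:
        raise ValueError(
            f"expected {V3_STRUCT.size} bytes, got {len(record)}"
        )

    (version, probs, planes,
     us_ooo, us_oo, them_ooo, them_oo,
     stm, rule50, move_count, winner) = V3_STRUCT.unpack(record)

    return {
        "version": struct.unpack("<I", version)[0],
        # 1858 little-endian float32 move probabilities.
        "policy": struct.unpack("<1858f", probs),
        # 104 planes, each a 64-bit board mask kept here as 8 raw bytes.
        "planes": [planes[i:i + 8] for i in range(0, len(planes), 8)],
        "castling": (us_ooo, us_oo, them_ooo, them_oo),
        "side_to_move": stm,
        "rule50": rule50,
        "move_count": move_count,
        "winner": winner,
    }
```

A length mismatch here is a hint that the file uses a different format version. Note also that the training code parses batches of records rather than a single byte string, which may be why a tensor ends up with one dimension fewer than tf.tile expects in the traceback above, though that is a guess.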