CSSLab / maia-chess

Maia is a human-like neural network chess engine trained on millions of human games.
https://maiachess.com
GNU General Public License v3.0

TypeError: expected str, bytes or os.PathLike object, not list #55

Closed CallOn84 closed 1 year ago

CallOn84 commented 1 year ago

Hi.

While trying to start train_maia.py, it gave me this error:

Traceback (most recent call last):
  File "\move_prediction\train_maia.py", line 187, in <module>
    main(args.config, name, collection_name)
  File "\move_prediction\train_maia.py", line 29, in main
    train_chunks = get_latest_chunks(cfg['dataset']['input_train'])
  File "\move_prediction\train_maia.py", line 98, in get_latest_chunks
    maia_chess_backend.printWithDate(f"found {glob.glob(path)} chunk dirs")
  File "C:\Program Files\Python39\lib\glob.py", line 22, in glob
    return list(iglob(pathname, recursive=recursive))
  File "C:\Program Files\Python39\lib\glob.py", line 43, in _iglob
    dirname, basename = os.path.split(pathname)
  File "C:\Program Files\Python39\lib\ntpath.py", line 185, in split
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not list

For context, here is my config file:

%YAML 1.2
---
gpu: 0

dataset:
  num_chunks: 100000000
  allow_less_chunks: true
  train_ratio: 0.70
  input_train:
    - '\move_prediction\trainingdata\supervised-0\'
    - '\move_prediction\trainingdata\supervised-1\'
    - '\move_prediction\trainingdata\supervised-2\'
  input_test:
    - '\move_prediction\trainingdata\supervised-0\'
    - '\move_prediction\trainingdata\supervised-1\'
    - '\move_prediction\trainingdata\supervised-2\'
  train_workers: 32
  test_workers: 8

training:
    swa: true
    swa_output: true
    swa_steps: 100
    swa_max_n: 10
    mask_legal_moves: true
    lookahead_optimizer: true
    precision: 'half'
    batch_size: 1024
    num_batch_splits: 1
    test_steps: 2000
    num_test_positions: 131072
    train_avg_report_steps: 50
    total_steps: 400000
    checkpoint_steps: 10000
    shuffle_size: 250000
    lr_values:
        - 0.1
        - 0.01
        - 0.001
        - 0.0001
    lr_boundaries:
        - 80000
        - 200000
        - 360000
    policy_loss_weight: 1.0            # weight of policy loss
    value_loss_weight: 1.0             # weight of value loss
    moves_left_loss_weight: 0.1
    moves_left_gradient_flow: 1.0

model:
  filters: 64
  residual_blocks: 6
  se_ratio: 8
...

I don't understand why it's giving me this TypeError: expected str, bytes or os.PathLike object, not list error. Any help would be appreciated.

reidmcy commented 1 year ago

You're providing a list in the YAML; it's expecting a string.
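To illustrate the mismatch: `glob.glob` takes a single pattern string, while a YAML sequence (`- item` lines) is loaded as a Python list, which is what triggers the `TypeError` in the traceback. The sketch below reproduces the error and shows one possible way to accept either form. `expand_patterns` is a hypothetical helper for illustration only; it is not part of the Maia code, which expects a single string.

```python
import glob
import os


def expand_patterns(patterns):
    """Hypothetical helper: accept one glob pattern or a list of patterns."""
    if isinstance(patterns, (str, bytes, os.PathLike)):
        patterns = [patterns]
    matches = []
    for pattern in patterns:
        # glob.glob is called once per pattern, always with a single string
        matches.extend(glob.glob(os.fspath(pattern)))
    return sorted(matches)


# Passing a list straight to glob.glob reproduces the error from the traceback.
try:
    glob.glob(["a/*", "b/*"])
    raised = False
except TypeError:
    raised = True
```

With the stock script, the simpler fix is to give `input_train` and `input_test` one pattern string each.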

CallOn84 commented 1 year ago

The yaml for maia_config is in a list, so I don't get it.

reidmcy commented 1 year ago

Please check again; here is the relevant line. I'm not sure which config you are using there.

CallOn84 commented 1 year ago

Ah, thanks. It works now. However, when it goes through the training games, it comes back as 0 chunks total. I don't know why that's happening when the data is there.

reidmcy commented 1 year ago

It's not looking for the directory that holds the files. Did you read the comment in the YAML?

CallOn84 commented 1 year ago

I did read the comments in the YAML and followed them to the letter. It's still detecting zero chunks.

reidmcy commented 1 year ago

Then you are putting in the wrong path; glob.glob() will produce [] when nothing is found.
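Since an empty result gives no hint about which part of the pattern is wrong, one way to narrow it down is to shorten the pattern one component at a time until something matches. This is a minimal sketch; `deepest_matching_prefix` is a hypothetical debugging helper, not something the Maia scripts provide.

```python
import glob
import os


def deepest_matching_prefix(pattern):
    """Hypothetical helper: return (prefix, matches) for the longest leading
    part of `pattern` that matches something on disk."""
    pattern = os.path.normpath(pattern)
    while True:
        matches = glob.glob(pattern)
        parent = os.path.dirname(pattern)
        # Stop when something matched, or when no shorter prefix is left.
        if matches or parent == pattern or not parent:
            return pattern, matches
        pattern = parent
```

Calling it on the `input_train` pattern from the config returns the longest prefix that exists; the first path component after that prefix is the one to check.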

CallOn84 commented 1 year ago

I've replicated the same path as the final config in the configuration file, and it's still giving me no chunks.

This is literally my config file:

%YAML 1.2
---
gpu: 0

dataset:
  input_train: '/trainingdata/elo_ranges/2500/train/*/*'
  input_test: '/trainingdata/elo_ranges/2500/test/*/*'

training:
    precision: 'half'
    batch_size: 1024
    num_batch_splits: 1
    test_steps: 2000
    train_avg_report_steps: 50
    total_steps: 400000
    checkpoint_steps: 10000
    shuffle_size: 250000
    lr_values:
        - 0.1
        - 0.01
        - 0.001
        - 0.0001
    lr_boundaries:
        - 80000
        - 200000
        - 360000
    policy_loss_weight: 1.0            # weight of policy loss
    value_loss_weight: 1.0             # weight of value loss

model:
  filters: 64
  residual_blocks: 6
  se_ratio: 8
...

reidmcy commented 1 year ago

You're on Windows; that's almost certainly not a valid path on your OS. Where are your data files? The paths should point to them. The code we provide is intended for replication. It is not meant to be turnkey code that just works; it's intended for other researchers to replicate our work.
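One way to see why a path like that is suspect on Windows: it has a root but no drive letter, so it is resolved against whatever drive is current when the script runs. The sketch below checks this with `pathlib` pure paths, which work on any OS. The `D:/maia` location is a made-up example, not a path from this thread.

```python
from pathlib import PureWindowsPath

# The pattern from the config above, interpreted with Windows path rules.
pattern = PureWindowsPath('/trainingdata/elo_ranges/2500/train/*/*')
has_drive = bool(pattern.drive)      # False: the pattern names no drive
is_absolute = pattern.is_absolute()  # False: rooted but drive-less

# Hypothetical data location; substitute wherever the generated files live.
data_root = PureWindowsPath('D:/maia')
full_pattern = (
    data_root / 'trainingdata' / 'elo_ranges' / '2500' / 'train' / '*' / '*'
)
```

Here `full_pattern.as_posix()` gives a forward-slash string suitable for the `input_train` entry in the YAML.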

CallOn84 commented 1 year ago

Ah, okay. That makes sense. I just assumed the training was similar to training Leela Chess Zero with supervised learning, which I have done before.

As for data files, I'm not sure what you mean by that.

reidmcy commented 1 year ago

Did you do steps 1 and 2 of the instructions? Those steps generate the training data, which is converted to the training/validation data files in step 2.

CallOn84 commented 1 year ago

After fiddling about for hours on Windows, trying to get it to work, I got train_maia.py working! I'm going to start training Maia to be rated at around 2500.