yfeng997 / MadMario

Interactive tutorial to build a learning Mario, for first-time RL learners

Pretrained weight is really bad? #17

Open ratthachat opened 9 months ago

ratthachat commented 9 months ago

Hi, thanks for the amazing repo!

I downloaded the trained weights from https://drive.google.com/file/d/1RRwhSMUrpBBRyAsfHLPGt1rlYFoiuus2/view?usp=sharing (the link mentioned in the README).

I then loaded the state dict into the Mario network successfully:

file_id = '1RRwhSMUrpBBRyAsfHLPGt1rlYFoiuus2'
url = f'https://drive.google.com/uc?id={file_id}'
!gdown {url} # I run in Colab

ckp = torch.load('./trained_mario.chkpt', map_location=('cuda' if use_cuda else 'cpu'))
mario.exploration_rate = ckp.get('exploration_rate')
mario.net.load_state_dict(ckp.get('model'))
<All keys matched successfully>

However, when I try to play using this trained model, Mario always dies very quickly at the beginning (e.g. within 40 frames). Is the link above still the correct path to the pretrained weights?
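One thing worth ruling out in the loading code: `dict.get` returns `None` for a missing key, so a checkpoint without the expected entry would not be noticed at load time. Below is a minimal sketch of a stricter read. The helper `read_checkpoint_fields` and the stand-in dict `fake_ckp` are hypothetical and only mimic the two keys used above (`model` and `exploration_rate`); the values are made up.

```python
def read_checkpoint_fields(ckp):
    """Return (state_dict, exploration_rate) from a checkpoint-like dict.

    Raises KeyError if an expected entry is missing, instead of
    returning None the way dict.get does.
    """
    missing = [k for k in ("model", "exploration_rate") if k not in ckp]
    if missing:
        raise KeyError(f"checkpoint is missing keys: {missing}")
    return ckp["model"], ckp["exploration_rate"]


# Stand-in for the dict returned by torch.load (values are made up).
fake_ckp = {"model": {"layer.weight": [0.0]}, "exploration_rate": 0.1}
state_dict, exploration_rate = read_checkpoint_fields(fake_ckp)
print(sorted(fake_ckp), exploration_rate)
```

With the real checkpoint, the same check could be run on the result of `torch.load` before calling `load_state_dict`.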

AIyumeng commented 3 months ago

I have the same problem. The agent always dies at {'coins': 0, 'flag_get': False, 'life': 2, 'score': 200, 'stage': 1, 'status': 'small', 'time': 368, 'world': 1, 'x_pos': 898, 'y_pos': 79}

Additional note: I do not set exploration_rate, and my policy is completely stable. My code is '''mario.net.load_state_dict(torch.load('trained_mario.chkpt')['model'])'''
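Dying at the same position on every run is what one would expect from a policy with no randomness. Here is a minimal sketch of why, assuming the agent picks actions epsilon-greedily (an assumption about the agent, not something confirmed in this thread). The helper `choose_action` is hypothetical and the Q-values are made up.

```python
import random


def choose_action(q_values, exploration_rate, rng):
    """Epsilon-greedy: pick a random action with probability
    exploration_rate, otherwise the index of the largest Q-value."""
    if rng.random() < exploration_rate:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


q_values = [0.2, 1.5, -0.3]  # made-up Q-values for three actions

# exploration_rate = 0.0: the random branch is never taken, so the
# choice is the argmax regardless of the seed.
greedy = [choose_action(q_values, 0.0, random.Random(s)) for s in range(5)]

# exploration_rate = 1.0: every choice comes from the random branch.
noisy = [choose_action(q_values, 1.0, random.Random(s)) for s in range(5)]

print(greedy)  # [1, 1, 1, 1, 1]
print(noisy)
```

If the agent works this way, a greedy policy in a deterministic level repeats the same trajectory, so one bad decision at a given obstacle ends every episode in the same place.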