High usage of RAM

woctezuma commented 1 year ago

I am considering a problem with ~ 3k valid solutions and ~ 17k possible guesses.

I have tried running the code both on my local machine and on Google Colab.

In both cases, I am faced with high usage of RAM. I wonder if this is normal due to the method used, or if there is a memory leak in the code.

At first, the script runs fine. (screenshot: Script)

RAM usage increases linearly. (screenshot: High RAM usage)

The high RAM usage eventually leads to a crash. (screenshot: RAM)

The script has crashed. (screenshot: Script)

For reference, this does not happen (or it does not have enough time to happen) if I consider a smaller problem where the possible guesses are forced to be valid solutions, i.e. with ~ 3k valid solutions and ~ 3k possible guesses.

Also, the tool TylerGlaiel/wordlebot does not have this issue with the large problem. It is a C++ program which uses different methods, vaguely explained in this blog post. There are three methods: the Complex one is horrendously slow (but does not crash), while the other two (Simple and MinMax) are fine.

woctezuma commented 1 year ago

After looking at this function, I think the linear increase in RAM usage is expected, but it is annoying. 😅

https://github.com/GillesVandewiele/Wordle-Bot/blob/f6334959105549745300ff2c1fc7f80e7c8d7a07/wordle.py#L39-L41

https://github.com/GillesVandewiele/Wordle-Bot/blob/f6334959105549745300ff2c1fc7f80e7c8d7a07/wordle.py#L49-L52

I wonder if something could be done about that.
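To illustrate why the growth would be linear, here is a minimal sketch. It does not reproduce the linked lines; it assumes they build a nested mapping of the form guess -> feedback pattern -> set of matching words. The names `feedback`, `build_pattern_dict` and `stored_entries` are hypothetical, and the 0/1/2 feedback encoding is an assumption based on the `Info` tuple printed by the bot.

```python
from collections import defaultdict


def feedback(guess, solution):
    """Wordle-style feedback: 2 = right place, 1 = wrong place, 0 = absent."""
    result = [0] * len(guess)
    remaining = []
    # First pass: mark exact matches and collect the unmatched solution letters.
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            result[i] = 2
        else:
            remaining.append(s)
    # Second pass: mark misplaced letters, consuming each solution letter once.
    for i, g in enumerate(guess):
        if result[i] == 0 and g in remaining:
            result[i] = 1
            remaining.remove(g)
    return tuple(result)


def build_pattern_dict(guesses, solutions):
    """Map guess -> feedback pattern -> set of solutions giving that pattern."""
    pattern_dict = defaultdict(lambda: defaultdict(set))
    for guess in guesses:
        for solution in solutions:
            pattern_dict[guess][feedback(guess, solution)].add(solution)
    return pattern_dict


def stored_entries(pattern_dict):
    """Count the words stored across all sets of the nested mapping."""
    return sum(
        len(words)
        for patterns in pattern_dict.values()
        for words in patterns.values()
    )


# Toy data: each guess adds exactly one stored word per solution.
toy_guesses = ["abc", "bca", "cab", "xyz"]
toy_solutions = ["abc", "cab"]
toy_dict = build_pattern_dict(toy_guesses, toy_solutions)
print(stored_entries(toy_dict))  # len(toy_guesses) * len(toy_solutions)
```

Under this assumption, every processed guess adds one stored word per solution, so memory grows steadily while the outer loop runs. With the sizes from this thread (~17k guesses, ~3k solutions), that would be on the order of 51 million stored words if the inner loop runs over the solutions only.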

woctezuma commented 1 year ago

I believe I can just chunk the list of candidates, using the answer to the Stack Overflow question "How do I split a list into equally-sized chunks?":

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
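As a quick check of how this helper splits a list, here is a small usage sketch. The word list is a placeholder, and the chunk size of 5000 is an assumption read off the progress bars in the log further down, not a value confirmed by the code.

```python
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]


# Placeholder word list with the dictionary size reported in the log (17080).
words = [f"w{i}" for i in range(17080)]

# Assumed chunk size of 5000; the last chunk holds the remainder.
sizes = [len(chunk) for chunk in chunks(words, 5000)]
print(sizes)  # [5000, 5000, 5000, 2080]
```

Because `chunks` is a generator, only one slice of the list is materialized at a time, which is what keeps the peak memory bounded by the chunk size.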

woctezuma commented 1 year ago

Yes, it is much faster now, and it does not eat up all the RAM.

Loaded dictionary with 17080 words...

[1/4] Processing pattern_dict_1.p
100%|██████████| 5000/5000 [03:59<00:00, 20.88it/s]
  0%|          | 0/5000 [00:00<?, ?it/s]
[2/4] Processing pattern_dict_2.p
100%|██████████| 5000/5000 [03:43<00:00, 22.38it/s]
[3/4] Processing pattern_dict_3.p
100%|██████████| 5000/5000 [03:48<00:00, 21.90it/s]
  0%|          | 0/2080 [00:00<?, ?it/s]
[4/4] Processing pattern_dict_4.p
100%|██████████| 2080/2080 [00:39<00:00, 52.05it/s]
  0%|          | 0/3268 [00:00<?, ?it/s]

[1/4] Processing pattern_dict_1.p
[2/4] Processing pattern_dict_2.p
[3/4] Processing pattern_dict_3.p
[4/4] Processing pattern_dict_4.p

Guessing:      FEBOC
Info:          (0, 0, 2, 0, 2)

Edit: The guess FEBOC is normal in my case. The bot is applied to Dungleon instead of Wordle.
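The file names in the log (pattern_dict_1.p to pattern_dict_4.p) suggest that each chunk is written to its own pickle file and read back later. A minimal sketch of that pattern follows; it is not the code of the fork. The helpers `save_in_chunks` and `iter_chunks_from_disk` and the `build` callable are hypothetical names introduced for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]


def save_in_chunks(guesses, chunk_size, build, out_dir):
    """Build one partial result per chunk and pickle it to its own file."""
    paths = []
    for k, chunk in enumerate(chunks(guesses, chunk_size), start=1):
        path = Path(out_dir) / f"pattern_dict_{k}.p"
        with open(path, "wb") as f:
            # Only the partial result for this chunk is held in memory.
            pickle.dump(build(chunk), f)
        paths.append(path)
    return paths


def iter_chunks_from_disk(paths):
    """Load the partial results one file at a time."""
    for path in paths:
        with open(path, "rb") as f:
            yield pickle.load(f)


# Toy demonstration: `build` stands in for the expensive per-chunk computation.
with tempfile.TemporaryDirectory() as tmp:
    toy_words = [f"w{i}" for i in range(10)]
    toy_paths = save_in_chunks(toy_words, 4, lambda c: {w: len(w) for w in c}, tmp)
    file_names = [p.name for p in toy_paths]
    merged = {}
    for part in iter_chunks_from_disk(toy_paths):
        merged.update(part)
    print(file_names)
```

The trade-off of this design is that every pass over all guesses has to unpickle every file again, so peak memory is exchanged for disk reads.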

woctezuma commented 1 year ago

I was able to apply the algorithm to Dungleon (which has more allowed guesses than Wordle) without any crash due to a high usage of RAM, by using my fork.

However, this comes at a cost: the script is really slow, most likely because large files are loaded from disk repeatedly!

If someone reads this and faces the same issue, I would suggest the (very fast) supplementary code accompanying the video mentioned in your README:

3Blue1Brown, Solving Wordle using information theory, posted on YouTube on February 6, 2022.
3b1b/videos: supplementary code (in Python) accompanying the aforementioned video.

woctezuma commented 1 year ago

For reference, this is the result of the clean-up of the supplementary code: https://github.com/woctezuma/3b1b-wordle-solver

GillesVandewiele / Wordle-Bot

High usage of RAM #8