google-deepmind / pysc2

StarCraft II Learning Environment
Apache License 2.0

Replay dataset sizes for versions. #341

rahatsantosh closed 1 year ago

rahatsantosh commented 2 years ago

I was looking for replay datasets and am currently using version 4.10.0. The dataset I got using the Blizzard API has around 25,000 replays. I want a larger dataset, and as I understand it, replays are not compatible across versions. Is there any way to know the size of the replay dataset for each version, so that I can decide which version to download? Or is there somewhere I can find this information?

Thanks in advance.

tewalds commented 1 year ago

The sizes we've got are:

4.6.0 28G
4.6.1 14G
4.6.2 17G
4.7.0 28G
4.7.1 97G
4.8.0 94G
4.8.1 41G
4.8.2 64G
4.8.3 137G
4.8.4 43G
4.8.5 864M
4.8.6 88G
4.9.0 47G
4.9.1 43G
4.9.2 63G
4.9.3 102G
4.10.0 6.8G
4.10.1 50G
4.10.2 16G
4.10.3 85G
4.10.4 57G
4.11.0 934M
4.11.1 13G
4.11.2 5.8G
5.0.2 137G
5.0.4 142G

I'm not sure how that maps exactly to the number of replays. I think this excludes any replays that are corrupt, but it does include low-level or nonsensical games.
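
For picking a version to download, the list above can be ranked by size with a few lines of Python. This is only a rough sketch: the `SIZES` text is copied from the list, while the helper names `parse_sizes` and `largest` are made up for illustration, and it assumes `G` and `M` are binary units (1G = 1024M), which the list does not state. It ranks by size on disk only and says nothing about replay counts.

```python
# Sizes copied from the list above, one "version size" pair per line.
SIZES = """
4.6.0 28G
4.6.1 14G
4.6.2 17G
4.7.0 28G
4.7.1 97G
4.8.0 94G
4.8.1 41G
4.8.2 64G
4.8.3 137G
4.8.4 43G
4.8.5 864M
4.8.6 88G
4.9.0 47G
4.9.1 43G
4.9.2 63G
4.9.3 102G
4.10.0 6.8G
4.10.1 50G
4.10.2 16G
4.10.3 85G
4.10.4 57G
4.11.0 934M
4.11.1 13G
4.11.2 5.8G
5.0.2 137G
5.0.4 142G
"""

# Assumption: binary units, so 1G = 1024M. The list does not say which.
UNIT_TO_GB = {"G": 1.0, "M": 1.0 / 1024}


def parse_sizes(text):
    """Return a dict mapping version string to size in (assumed) GB."""
    sizes = {}
    for line in text.strip().splitlines():
        version, size = line.split()
        # The last character is the unit suffix, the rest is the number.
        sizes[version] = float(size[:-1]) * UNIT_TO_GB[size[-1]]
    return sizes


def largest(sizes, n=5):
    """Return the n (version, size) pairs with the largest size."""
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:n]


sizes = parse_sizes(SIZES)
for version, gb in largest(sizes):
    print(f"{version}: {gb:.1f} GB")
```

Versions are kept as strings rather than parsed into tuples, since the sketch only sorts by size and never compares version numbers.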